PREFACE


As its title indicates, this volume contains a selection of Feynman’s important scientific 
papers together with short comments. Most of the papers contain pure research, but among 
them are scattered some articles that are largely pedagogical, such as published lectures that 
Feynman gave at advanced physics workshops and summer schools. As the editor I chose 
the papers and also provided the comments, except as indicated in the text. Such a selection 
cannot avoid arbitrariness, and I apologize to those who feel that their favorites may have 
been unjustly omitted. 

In the course of preparation, I have consulted some physicists, historians, and others, 
whom I would like to thank: Tian Yu Cao, Michael Cohen, Don Ellis, Joan Feynman, 
and Robert Michaelson. Carl Iddings and Frank Vernon, Jr. sent me valuable information 
concerning their collaborations with Feynman which are included in Part III. Danny Hillis 
helped to orient me with regard to the papers on computers. I am especially indebted 
to Alexander Fetter for writing the commentary which appears with the liquid helium
papers. I am also greatly obliged for the hospitality of Judith Goodstein and the staff at the
Caltech Archives who assisted me when I was reading Feynman’s unpublished documents, 
and provided the bibliography of Feynman’s writings at the end of this volume. Finally, 
I wish to acknowledge the great help of the editors at World Scientific. 
RICHARD PHILLIPS FEYNMAN


One of the century’s outstanding scientists, Richard P. Feynman was born on 11 May 1918 
at Far Rockaway, Queens, a part of New York City, where he received his education through 
high school. He then attended the Massachusetts Institute of Technology; exhibiting superior 
mathematical skills, he took advanced physics courses and earned a B.S. degree. According 
to S.S. Schweber, “By the time Feynman finished his undergraduate studies at MIT in 1939 
he had mastered many of the fields of theoretical physics.” Feynman went on to graduate
school at Princeton University, where his principal mentor was Professor John A. Wheeler,
under whose sponsorship he wrote a doctoral thesis entitled “The Principle of Least Action 
in Quantum Mechanics”; he was awarded a Ph.D. in 1942. 

From Princeton Feynman moved to Los Alamos, New Mexico, to work on the wartime 
development of the nuclear bomb. There he became an important member, and the youngest 
group leader, of the theoretical physics division headed by Hans Bethe of Cornell University. 
After the war Bethe persuaded Feynman to accept a faculty position at Cornell, and it was 
in Ithaca, New York, that Feynman developed his diagrammatic methods and accomplished 
the work that earned him the Nobel Prize in Physics. In 1950 he became a professor at 
the California Institute of Technology, being appointed the Richard Chase Tolman Professor 
of Physics in 1959. He remained in Pasadena until his death at the age of sixty-nine, on 
15 February 1988. 

Feynman had three marriages. The first was in 1942 to Arline Greenbaum, who was 
incurably ill at the time of the marriage and died in 1945. His second marriage, to Mary 
Louise Bell in 1952, ended in divorce. In 1960 he married Gweneth Howarth, with whom he 
had a son, Carl, and an adopted daughter, Michelle. 

Early in his career, Feynman became well-known to the world physics community for his 
brilliant research, his outstanding teaching, and his flamboyant personality. After sharing 
the Nobel Prize in 1965 for his work on renormalized quantum electrodynamics (QED) with 
Julian Seymour Schwinger and Sin-itiro Tomonaga, he also became an important establishment
figure, receiving many invitations to lecture throughout the world (most of which he
politely declined). However, he became a notable public figure only near the end of his life, 
when, in January 1986, President Reagan appointed him to a presidential commission to 
investigate the cause of the explosion of the space shuttle Challenger. Appearing in the 
televised hearing of the investigative committee, Feynman performed a simple experiment 
with a glass of ice water and a piece of the shuttle’s failed O-ring, in order to demonstrate
the immediate cause of the disaster. His presentation made a deep impression on millions of 
television viewers, and, coupled with two best-selling books of his reminiscences ([112] and 
[121]*), other television appearances, and adulation in the press, Feynman became some- 
thing of a cult figure, especially after his death. A few of the biographies and other books 
dealing with him and his work are listed after this account. 

Besides the Nobel Prize, Feynman received the Albert Einstein Award (1954), the Ernest 
Orlando Lawrence Award for Physics (1962), the Oersted Medal (1972), and the Niels Bohr 
International Gold Medal (1973). He was a Member of the Brazilian Academy of Sciences 
and a Foreign Fellow of the Royal Society of London (1965). He was elected to the National
Academy of Sciences (USA), but later resigned, and often declared that he was totally
uninterested in receiving “honors.”

*Numbers in square brackets refer to the bibliography at the end of this volume.
I. Quantum Chemistry


In his final year as an undergraduate at the Massachusetts Institute of Technology, Feynman 
published with one of his teachers, Manuel S. Vallarta, a Letter to the Editor of the Physical 
Review on cosmic rays [1]. He also completed a senior dissertation under John C. Slater 
entitled “Forces and Stresses in Molecules” and published a shortened version, “Forces in 
Molecules,” as an article in the Physical Review. The latter contained a result — a general 
quantum-mechanical theorem — that has played an important role in theoretical chemistry 
and condensed matter physics and is frequently cited as the Hellmann-Feynman theorem.1
According to Feynman’s abstract, “The force on a nucleus in an atomic system is shown to be 
just the classical electrostatic force that would be exerted on this nucleus by other nuclei and 
by the electrons’ charge distribution.” Quantum mechanics is used to calculate the charge 
distribution as the absolute square of the Schrödinger wave function. The importance of the
forces on the atomic nuclei for molecular geometry, the theory of chemical binding, and for 
crystal structure is evident. 


Selected Paper 
[2] Forces in molecules, Phys. Rev. 56 (1939): 340-343. 


1The German quantum chemist H. Hellmann published the theorem in a textbook, Einführung in die
Quantenchemie (1937, Franz Deuticke, Leipzig), but the reference was unknown to Feynman and Slater. For the
history of the H-F theorem see J.I. Musher, Am. J. Phys. 34 (1966): 267-268, and J.C. Slater, Solid State 
and Molecular Theory (1975, Wiley, New York), pp. 193-199. For applications see B.M. Deb (ed.), The Force 
Concept in Chemistry (1981, Van Nostrand Reinhold, New York). 
Forces in Molecules


R. P. FEYNMAN 
Massachusetts Institute of Technology, Cambridge, Massachusetts


(Received June 22, 1939) 


Formulas have been developed to calculate the forces in a molecular system directly, rather 
than indirectly through the agency of energy. This permits an independent calculation of the 
slope of the curves of energy vs. position of the nuclei, and may thus increase the accuracy, or 
decrease the labor involved in the calculation of these curves. The force on a nucleus in an 
atomic system is shown to be just the classical electrostatic force that would be exerted on this 
nucleus by other nuclei and by the electrons’ charge distribution. Qualitative implications of
this are discussed.


MANY of the problems of molecular structure
are concerned essentially with forces. The 
stiffness of valence bonds, the distortions in 
geometry due to the various repulsions and 
attractions between atoms, the tendency of 
valence bonds to occur at certain definite angles 
with each other, are some examples of the kind 
of problem in which the idea of force is para- 
mount. 

Usually these problems have been considered 
through the agency of energy, and its changes 
with changing configuration of the molecule. 
The reason for this indirect attack through 
energy, rather than the more qualitatively illumi- 
nating one, by considerations of force, is perhaps 
twofold. First it is probably thought that force 
is a quantity that is not easily described or calcu- 
lated by wave mechanics, while energy is, and 
second, the first molecular problem to be solved 
is the analysis of band spectra, strictly a problem 
of energy as such. It is the purpose of this paper 
to show that forces are almost as easy to calculate 
as energies are, and that the equations are quite 
as easy to interpret. In fact, all forces on atomic 
nuclei in a molecule can be considered as purely 
classical attractions involving Coulomb's law. 
The electron cloud distribution is prevented 
from collapsing by obeying Schrödinger’s equation.
In these considerations the nuclei are
considered as mass points held fixed in position. 

A usual method of calculating interatomic 
forces runs somewhat as follows. 

For a given, fixed configuration of the nuclei, 
the energy of the entire system (electrons and 
nuclei) is calculated. This is done by the variation 
method or other perturbation schemes. This
entire process is repeated for a new nuclear
position, and the new value of energy calculated. 
Proceeding in this way, a plot of energy vs. 
position is obtained. The force on a nucleus is 
of course the slope of this curve. 

The following method is one designed to 
obtain the forces at a given configuration, when 
only the configuration is known. It does not 
require the calculations at neighboring configura- 
tions. That is, it permits a calculation of the 
slope of the energy curve as well as its value, 
for any particular configuration. It is to be 
emphasized that this allows a considerable saving 
of labor of calculations. To obtain force under the 
usual scheme the energy needs to be calculated 
for two or more different and neighboring con- 
figurations. Each point requires the calculation 
of the wave functions for the entire system. 
In this new method, only one configuration, the 
one in question, need have its wave functions 
computed in detail. Thus the labor is consider- 
ably reduced. Because it permits one to get an 
independent value of the slope of the energy 
curve, the method might increase the accuracy 
in the calculation of these curves, being especially 
helpful in locating the normal separation, or 
position of zero force. 

In the following it is to be understood that the 
nuclei of the atoms in the molecule, or other 
atomic system, are to be held fixed in position, 
as point charges, and the force required to be 
applied to the nuclei to hold them is to be 
calculated. This will lead to two possible defini- 
tions of force in the nonsteady state, for then 
the energy is not a definite quantity, and the 
slope of the energy curve shares this indefiniteness.
It will be shown that these two possible
definitions are exactly equivalent in the steady- 
state case, and, of course, no ambiguity should 
arise there. 

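Let $\lambda$ be one of any number of parameters
which specify nuclear positions. For example,
$\lambda$ might be the x component of the position of
one of the nuclei. A force $f_\lambda$ is to be associated
with $\lambda$ in such a way that $f_\lambda\,d\lambda$ measures the
virtual work done in displacing the nuclei
through $d\lambda$. This will define the force only when
the molecule is in a steady state, of energy U,
for then we can say $f_\lambda = -\partial U/\partial\lambda$. In the
non-steady-state case we have no sure guide to a
definition of force. For example, if $\bar{U} = \int \psi^* H \psi\,dv$
be the average energy of the system of wave
function $\psi$ and Hamiltonian H, we might define

$$f_\lambda' = -\frac{\partial \bar{U}}{\partial \lambda}. \tag{1}$$

Or again, we might take $f_\lambda$ to be the average of
$-\partial H/\partial\lambda$ or

$$f_\lambda = -\overline{\left(\frac{\partial H}{\partial \lambda}\right)} = -\int \psi^* \frac{\partial H}{\partial \lambda} \psi\,dv. \tag{2}$$

We shall prove that under steady-state conditions,
both these definitions of force become
exactly equivalent, and equal to $-\partial U/\partial\lambda$, the
slope of the energy curve. Since (2) is simpler
than (1) we can define force by (2) in general.
In particular, it gives a simple expression for the
slope of the energy curve.

Thus we shall prove, when $H\psi = U\psi$ and
$\int \psi\psi^*\,dv = 1$ that,

$$\frac{\partial U}{\partial \lambda} = \int \psi^* \frac{\partial H}{\partial \lambda} \psi\,dv.$$

Now

$$U = \int \psi^* H \psi\,dv,$$

whence,

$$\frac{\partial U}{\partial \lambda} = \int \psi^* \frac{\partial H}{\partial \lambda} \psi\,dv + \int \frac{\partial \psi^*}{\partial \lambda} H \psi\,dv + \int \psi^* H \frac{\partial \psi}{\partial \lambda}\,dv.$$

Since H is a self-adjoint operator,

$$\int \psi^* H \frac{\partial \psi}{\partial \lambda}\,dv = \int \frac{\partial \psi}{\partial \lambda} H \psi^*\,dv.$$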
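But $H\psi = U\psi$ and $H\psi^* = U\psi^*$ so that we can
write,

$$\frac{\partial U}{\partial \lambda} = \int \psi^* \frac{\partial H}{\partial \lambda} \psi\,dv + U\int \frac{\partial \psi^*}{\partial \lambda} \psi\,dv + U\int \psi^* \frac{\partial \psi}{\partial \lambda}\,dv.$$

These last two terms cancel each other since
their sum is,

$$U\frac{\partial}{\partial \lambda}\int \psi^*\psi\,dv = U\frac{\partial}{\partial \lambda}(1) = 0.$$

Whence

$$\frac{\partial U}{\partial \lambda} = \int \psi^* \frac{\partial H}{\partial \lambda} \psi\,dv$$

in the steady state. This much is true, regardless
of the nature of H (whether for spin, or nuclear
forces, etc.). In the special case of atomic systems
when $H = T + V$ where T is the kinetic energy
operator, and V the potential, since
$\partial H/\partial\lambda = \partial V/\partial\lambda$ we can write

$$f_\lambda' = f_\lambda = -\frac{\partial U}{\partial \lambda} = -\int \psi^*\psi \frac{\partial V}{\partial \lambda}\,dv. \tag{3}$$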
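
As a numerical illustration of the theorem just proved (an editorial sketch, not part of the original paper), the following Python fragment checks it on a hypothetical two-level model. The 2x2 matrix H(lam) = [[lam, c], [c, -lam]], the coupling c, and all function names are assumptions chosen only for illustration. The slope of the energy obtained from two neighboring configurations by finite differences is compared with the average of dH/dlam computed at the single configuration in question.

```python
import math

def ground_state(lam, coupling=0.5):
    """Lowest eigenvalue and normalized eigenvector of the toy Hamiltonian
    H(lam) = [[lam, c], [c, -lam]] (a hypothetical model; requires c != 0)."""
    energy = -math.sqrt(lam * lam + coupling * coupling)
    # From the first row of (H - E) v = 0: (lam - E) v0 + c v1 = 0.
    v0, v1 = coupling, energy - lam
    norm = math.hypot(v0, v1)
    return energy, (v0 / norm, v1 / norm)

def slope_hellmann_feynman(lam, coupling=0.5):
    """Average of dH/dlam = diag(1, -1) in the ground state at one configuration."""
    _, (v0, v1) = ground_state(lam, coupling)
    return v0 * v0 - v1 * v1

def slope_finite_difference(lam, coupling=0.5, h=1e-5):
    """Slope of the energy curve from two neighboring configurations."""
    e_plus, _ = ground_state(lam + h, coupling)
    e_minus, _ = ground_state(lam - h, coupling)
    return (e_plus - e_minus) / (2.0 * h)

if __name__ == "__main__":
    lam = 0.3
    print("direct (one configuration):  ", slope_hellmann_feynman(lam))
    print("finite difference (two):     ", slope_finite_difference(lam))
```

In this model the two slopes agree to within the finite-difference error, and the force in the sense of Eq. (3) is minus either value. The direct evaluation needs the wave function at one configuration only, which is the saving of labor the text describes.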


The actual calculation of forces in a real
molecule by means of this theorem is not
impractical. The $\int \psi^*\psi(\partial V/\partial\lambda)\,dv$ is not too
different from $\int \psi^*\psi V\,dv$, which must be calculated if
the energy is to be found at all in the variational
method. Although the theorem (3) is the most
practical for actual calculations, it can be
modified to get a clearer qualitative picture of
what it means. Suppose, for example, the system
for which $\psi$ is the wave function contains several
nuclei, and let the coordinates of one of these
nuclei, $\alpha$, be $X^\alpha$, $Y^\alpha$, $Z^\alpha$ or $X_\mu^\alpha$ where $\mu = 1, 2, 3$
mean X, Y, Z. If we take our $\lambda$ parameter to be
one of these coordinates, the resultant force on
the nucleus $\alpha$ in the $\mu$ direction will be given
directly by

$$f_\mu^\alpha = -\int \psi\psi^* \frac{\partial V}{\partial X_\mu^\alpha}\,dv$$

from (3).

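Now V is made up of three parts, the interaction
of all nuclei with each other ($V_{\alpha\beta}$), of each
nucleus with an electron ($V_{\beta i}$), and of each
electron with every other ($V_{ij}$); or

$$V = \sum_{\alpha,\beta} V_{\alpha\beta} + \sum_{\beta,i} V_{\beta i} + \sum_{i,j} V_{ij}.$$

Suppose $x_\mu^i$ are the coordinates of electron i,
and as before $X_\mu^\alpha$ those of nucleus $\alpha$ of charge $q_\alpha$.
Then $V_{\alpha i} = q_\alpha e/R_{\alpha i}$, where

$$R_{\alpha i}^2 = \sum_{\mu=1}^{3} (X_\mu^\alpha - x_\mu^i)^2.$$

So we see that

$$\frac{\partial V_{\alpha i}}{\partial X_\mu^\alpha} = -\frac{\partial V_{\alpha i}}{\partial x_\mu^i} \quad\text{and that}\quad \frac{\partial V_{ij}}{\partial X_\mu^\alpha} = 0.$$

Then (3) leads to

$$\begin{aligned}
f_\mu^\alpha &= +\int \psi\psi^* \sum_i \frac{\partial V_{\alpha i}}{\partial x_\mu^i}\,dv - \int \psi\psi^* \sum_\beta \frac{\partial V_{\alpha\beta}}{\partial X_\mu^\alpha}\,dv \\
&= \sum_i \int \frac{\partial V_{\alpha i}}{\partial x_\mu^i} \left[\int^{(i)}\!\cdots\!\int \psi\psi^*\,dv\right] dv_i - \sum_\beta \frac{\partial V_{\alpha\beta}}{\partial X_\mu^\alpha}
\end{aligned} \tag{4}$$

since $\partial V_{\alpha i}/\partial x_\mu^i$ does not involve any electron
coordinate except those of electron i. $\int^{(i)}\cdots\,dv$
means the integral over the coordinates of all
electrons except those of electron i. The last
term has been reduced since $\partial V_{\alpha\beta}/\partial X_\mu^\alpha$ does not
involve the electron coordinates, and is constant
as far as integration over these coordinates goes.
This term gives ordinary Coulomb electrostatic
repulsion between the nuclei and need not be
considered further. Now $e\int^{(i)} \psi\psi^*\,dv$ is just the
charge density distribution $\rho_i(x^i)$ due to electron i,
where e is the charge on one electron. The
electric field $E_\mu^\alpha(x^i)$ at any point $x^i$ due to the
nucleus $\alpha$ is $(1/e)\,\partial V_{\alpha i}/\partial x_\mu^i$, so that (4) may be
written

$$f_\mu^\alpha = \sum_i \int E_\mu^\alpha(x^i)\,\rho_i(x^i)\,dv_i - \sum_\beta \frac{\partial V_{\alpha\beta}}{\partial X_\mu^\alpha} = \int \Big[\sum_i \rho_i(x)\Big] E_\mu^\alpha(x)\,dv - \sum_\beta \frac{\partial V_{\alpha\beta}}{\partial X_\mu^\alpha}.$$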

The 3N space for N electrons has been reduced
to a 3 space. This can be done since $E_\mu^\alpha(x^i)$
depends only on $x^i$ and is the same function of $x^i$
no matter which i we pick. This implies the
following conclusion:

The force on any nucleus (considered fixed) in
any system of nuclei and electrons is just the
classical electrostatic attraction exerted on the
nucleus in question by the other nuclei and by
the electron charge density distribution for all
electrons,

$$\rho(x) = \sum_i \rho_i(x).$$


It is possible to simplify this still further. 
Suppose we construct an electric field vector F 
such that 


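$$\nabla\cdot\mathbf{F} = -4\pi\rho(x); \qquad \nabla\times\mathbf{F} = 0.$$

Now from the derivation of $E_\mu^\alpha$ we know that it
arises from the charge $q_\alpha$ on nucleus $\alpha$, so that
$\nabla\cdot\mathbf{E}^\alpha = 0$ except at the charge $\alpha$ where its integral
equals $4\pi q_\alpha$. Further,

$$-\sum_\beta \frac{\partial V_{\alpha\beta}}{\partial X_\mu^\alpha} = q_\alpha \Big[\sum_\beta E_\mu^\beta\Big]_{\text{at } \alpha}.$$

Then

$$\begin{aligned}
f_\mu^\alpha &= -\frac{1}{4\pi}\int (\nabla\cdot\mathbf{F})\,E_\mu^\alpha\,dv - \sum_\beta \frac{\partial V_{\alpha\beta}}{\partial X_\mu^\alpha} \\
&= \frac{1}{4\pi}\int F_\mu\,(\nabla\cdot\mathbf{E}^\alpha)\,dv - \sum_\beta \frac{\partial V_{\alpha\beta}}{\partial X_\mu^\alpha} \\
&= q_\alpha [F_\mu]_{\text{at } \alpha} + q_\alpha \Big[\sum_\beta E_\mu^\beta\Big]_{\text{at } \alpha}
\end{aligned} \tag{5}$$
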
the transformation of the integral being accom- 
plished by integrating by parts. Or finally, the 
force on a nucleus is the charge on that nucleus 
times the electric field there due to all the 
electrons, plus the fields from the other nuclei. 
This field is calculated classically from the 
charge distribution of each electron and from 
the nuclei. 

It now becomes quite clear why the strongest 
and most important attractive forces arise when 
there is a concentration of charge between two 
nuclei. The nuclei on each side of the concen- 
trated charge are each strongly attracted to it. 
Thus they are, in effect, attracted toward each 
other. In an $\mathrm{H}_2$ molecule, for example, the antisymmetrical
wave function, because it must be
zero exactly between the two H atoms, cannot 
concentrate charge between them. The sym- 
metrical solution, however, can easily permit 
charge concentration between the nuclei, and 
hence it is only the solution which is sym- 
metrical that leads to strong attraction, and the 
formation of a molecule, as is well known. It is 
clearly seen that concentrations of charge between
atoms lead to strong attractive forces, and
hence, are properly called valence bonds. 
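
The qualitative argument above can be made concrete with a deliberately crude classical model (an editorial sketch, not from the paper). It assumes two nuclei of unit charge a distance R apart and idealizes the charge concentrated between them as a point charge of magnitude q, in units of the electron charge, sitting exactly at the midpoint. The function name and the point-charge idealization are hypothetical; a real charge distribution would be integrated as in Eq. (5).

```python
def net_force_on_nucleus(separation, bond_charge_fraction):
    """Axial force on one nucleus, in units where e = 1 (Gaussian-style units).

    Positive values push the nucleus away from its partner; negative values
    pull it toward the midpoint, i.e. toward the other nucleus.
    """
    repulsion = 1.0 / separation**2                           # from the other nucleus at distance R
    attraction = bond_charge_fraction / (separation / 2.0)**2  # from charge -q at distance R/2
    return repulsion - attraction

if __name__ == "__main__":
    for q in (0.0, 0.25, 0.5, 1.0):
        print(q, net_force_on_nucleus(2.0, q))
```

Because the midpoint charge sits at half the internuclear distance, in this toy model only a quarter of an electron charge between the nuclei is enough to balance their mutual repulsion, and any larger concentration gives a net attraction.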

Van der Waals’ forces can also be interpreted 
as arising from charge distributions with higher 
concentration between the nuclei. The Schrödinger
perturbation theory for two interacting
atoms at a separation R, large compared to the 
radii of the atoms, leads to the result that the 
charge distribution of each is distorted from 
central symmetry, a dipole moment of order 
$1/R^7$ being induced in each atom. The negative
charge distribution of each atom has its center of
gravity moved slightly toward the other. It is 
not the interaction of these dipoles which leads 
to van der Waals’ force, but rather the attraction 
of each nucleus for the distorted charge dis- 
tribution of its own electrons that gives the 
attractive $1/R^7$ force.

The author wishes to express his gratitude to 
Professor J. C. Slater who, by his advice and 
helpful suggestions, aided greatly in this work. 
He would also like to thank Dr. W. C. Herring 
for the latter’s excellent criticisms. 


II. Classical and Quantum Electrodynamics 


While Feynman made many original and imaginative contributions to theoretical physics, 
it may well be that his place in the history of science will be largely based on his approach 
to renormalizing quantum electrodynamics (QED), and especially on the tools that he in- 
vented to accomplish that goal, such as path integrals, the operator calculus, and the famous 
Feynman diagrams. Eventually QED may be replaced by a finite theory, rather than the 
present divergent, though renormalizable, one. (QED is already incorporated in the unified 
electroweak theory, one of the two parts of the Standard Model.) Feynman himself never 
regarded renormalized QED as complete, frequently pointing out its limitations and sug- 
gesting that it was merely what we now call an “effective field theory.” But even if QED 
proves to be transitory, the theoretical methods that Feynman developed are permanently 
embedded in mathematical physics, and have been widely applied in areas far beyond their 
original domain. 

Of our nine selected papers that deal with electrodynamics, two are in the nature of 
reviews, one being his 1961 report to the Solvay Conference [45], included for its lively 
originality. The other is Feynman’s Nobel Lecture, which is placed first in this section on 
electrodynamics for a special reason. That is not because the Prize itself has great scientific 
significance. (He even thought of refusing it, had that been practical; its award to Feynman 
honors the Prize as much as its recipient.) Rather, paper [73] occupies the leading position 
here because it provides a far more valuable and eloquent commentary on this group of 
papers than could be produced in any other way. It outlines the significant steps, including 
the less successful ones, by which Feynman recognized and worked his way through the 
problem situation, from classical to quantum electrodynamics, to find the solution, and it 
tells about the physicists with whom he interacted. In its context as a Nobel Lecture it is a 
surprisingly “human” story; at the least, its style would surprise us if it were told by anyone 
other than Feynman. 

The separation of the various papers into neat groups is rather arbitrary, and I have 
chosen to place papers [7], [14], and [15], which also contain derivations of QED, in a separate
Section III which emphasizes the methodological innovations, because they have a wide range 
of applications in other fields as well. This applies especially to [7], which describes the path 
integral method.1


II.A Classical and Quantum Electrodynamics — The Space—Time View 


Selected Paper 
[73] The development of the space-time view of quantum electrodynamics. In: Les Prix
Nobel 1965. Stockholm, 1965: Imprimerie Royale P.A. Norstedt & Söner: 172-191.


1This is the subject of Feynman’s doctoral thesis, written at Princeton University, under the sponsorship of 
John Wheeler. We would have included this dissertation, but were unable to get permission to do so. 
The development of the space-time view
of quantum electrodynamics 


Nobel Lecture, December 11, 1965 


We have a habit in writing articles published in scientific journals to make the 
work as finished as possible, to cover all the tracks, to not worry about the 
blind alleys or to describe how you had the wrong idea first, and so on. So 
there isn’t any place to publish, in a dignified manner, what you actually did 
in order to get to do the work, although, there has been in these days, some 
interest in this kind of thing. Since winning the prize is a personal thing, I
thought I could be excused in this particular situation, if I were to talk personally
about my relationship to quantum electrodynamics, rather than to
discuss the subject itself in a refined and finished fashion. Furthermore, since 
there are three people who have won the prize in physics, if they are all going 
to be talking about quantum electrodynamics itself, one might become bored 
with the subject. So, what I would like to tell you about today are the sequence 
of events, really the sequence of ideas, which occurred, and by which I finally 
came out the other end with an unsolved problem for which I ultimately 
received a prize. 

I realize that a truly scientific paper would be of greater value, but such a 
paper I could publish in regular journals. So, I shall use this Nobel Lecture as 
an opportunity to do something of less value, but which I cannot do elsewhere. 
I ask your indulgence in another manner. I shall include details of anecdotes 
which are of no value either scientifically, nor for understanding the develop- 
ment of ideas. They are included only to make the lecture more entertaining. 

I worked on this problem about eight years until the final publication in 
1947. The beginning of the thing was at the Massachusetts Institute of Tech- 
nology, when I was an undergraduate student reading about the known phys- 
ics, learning slowly about all these things that people were worrying about, 
and realizing ultimately that the fundamental problem of the day was that 
the quantum theory of electricity and magnetism was not completely satis- 
factory. This I gathered from books like those of Heitler and Dirac. I was in- 
spired by the remarks in these books; not by the parts in which everything 
was proved and demonstrated carefully and calculated, because I couldn’t 
understand those very well. At the young age what I could understand were
the remarks about the fact that this doesn’t make any sense, and the last sen- 
tence of the book of Dirac I can still remember, « It seems that some essentially 
new physical ideas are here needed. » So, I had this as a challenge and an in- 
spiration. I also had a personal feeling, that since they didn’t get a satisfactory 
answer to the problem I wanted to solve, I don’t have to pay a lot of attention 
to what they did do. 

I did gather from my readings, however, that two things were the source 
of the difficulties with the quantum electrodynamical theories. The first was 
an infinite energy of interaction of the electron with itself. And this difficulty 
existed even in the classical theory. The other difficulty came from some in- 
finites which had to do with the infinite numbers of degrees of freedom in the 
field. As I understood it at the time (as nearly as I can remember) this was simply
the difficulty that if you quantized the harmonic oscillators of the field (say in a
box) each oscillator has a ground state energy of $(1/2)\hbar\omega$ and there is an infinite
number of modes in a box of ever increasing frequency $\omega$, and therefore
there is an infinite energy in the box. I now realize that that wasn’t a complete- 
ly correct statement of the central problem; it can be removed simply by 
changing the zero from which energy is measured. At any rate, I believed 
that the difficulty arose somehow from a combination of the electron acting 
on itself and the infinite number of degrees of freedom of the field. 
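
The divergence described here can be seen in a short numerical sketch (an editorial illustration, not part of the lecture). It assumes, for simplicity, a one-dimensional box of length L whose standing-wave modes have frequencies omega_n = n*pi*c/L; the function name and the one-dimensional simplification are assumptions. The summed ground state energies grow without bound as more modes are included.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s

def zero_point_energy_1d(box_length, n_modes):
    """Sum of (1/2)*hbar*omega_n over the lowest n_modes modes of a 1D box,
    assuming omega_n = n * pi * c / box_length."""
    return sum(0.5 * HBAR * (n * math.pi * C / box_length)
               for n in range(1, n_modes + 1))

if __name__ == "__main__":
    for n_modes in (10, 100, 1000, 10000):
        print(n_modes, zero_point_energy_1d(1.0, n_modes))
```

In this simplified model the sum grows roughly as the square of the number of modes kept, so no finite total exists. Since every observable energy difference is unchanged by subtracting this sum, shifting the zero of energy removes it, as the lecture goes on to say.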

Well, it seemed to me quite evident that the idea that a particle acts on itself, 
that the electrical force acts on the same particle that generates it, is not a 
necessary one; it is a sort of a silly one, as a matter of fact. And, so I suggested
to myself, that electrons cannot act on themselves, they can only act on other 
electrons. That means there is no field at all. You see, if all charges contribute 
to making a single common field, and if that common field acts back on all 
the charges, then each charge must act back on itself. Well, that was where the 
mistake was, there was no field. It was just that when you shook one charge, 
another would shake later. There was a direct interaction between charges, 
albeit with a delay. The law of force connecting the motion of one charge 
with another would just involve a delay. Shake this one, that one shakes later. 
The sun atom shakes; my eye electron shakes eight minutes later, because of a 
direct interaction across. 

Now, this has the attractive feature that it solves both problems at once. 
First, I can say immediately, I don’t let the electron act on itself, I just let this 
act on that, hence, no self-energy! Secondly, there is not an infinite number 
of degrees of freedom in the field. There is no field at all; or if you insist on 
thinking in terms of ideas like that of a field, this field is always completely
determined by the action of the particles which produce it. You shake this 
particle, it shakes that one, but if you want to think in a field way, the field, 
if it’s there, would be entirely determined by the matter which generates it, 
and therefore, the field does not have any independent degrees of freedom and 
the infinities from the degrees of freedom would then be removed. As a matter
of fact, when we look out anywhere and see light, we can always « see »
some matter as the source of the light. We don’t just see light (except recently 
some radio reception has been found with no apparent material source). 

You see then that my general plan was to first solve the classical problem, 
to get rid of the infinite self-energies in the classical theory, and to hope that 
when I made a quantum theory of it, everything would just be fine. 

That was the beginning, and the idea seemed so obvious to me and so ele- 
gant that I fell deeply in love with it. And, like falling in love with a woman, it 
is only possible if you do not know much about her, so you cannot see her 
faults. The faults will become apparent later, but after the love is strong enough 
to hold you to her. So, I was held to this theory, in spite of all difficulties, by 
my youthful enthusiasm. 

Then I went to graduate school and somewhere along the line I learned 
what was wrong with the idea that an electron does not act on itself. When 
you accelerate an electron it radiates energy and you have to do extra work 
to account for that energy. The extra force against which this work is done is 
called the force of radiation resistance. The origin of this extra force was identified
in those days, following Lorentz, as the action of the electron itself. The
first term of this action, of the electron on itself, gave a kind of inertia (not 
quite relativistically satisfactory). But that inertia-like term was infinite for 
a point-charge. Yet the next term in the sequence gave an energy loss rate, 
which for a point-charge agrees exactly with the rate you get by calculating 
how much energy is radiated. So, the force of radiation resistance, which is 
absolutely necessary for the conservation of energy would disappear if I said 
that a charge could not act on itself. 

So, I learned in the interim when I went to graduate school the glaringly 
obvious fault of my own theory. But, I was still in love with the original 
theory, and was still thinking that with it lay the solution to the difficulties of 
quantum electrodynamics. So, I continued to try on and off to save it some- 
how. I must have some action develop on a given electron when I accelerate 
it to account for radiation resistance. But, if I let electrons only act on other 
electrons the only possible source for this action is another electron in the
world. So, one day, when I was working for Professor Wheeler and could no
longer solve the problem that he had given me, I thought about this again and
I calculated the following. Suppose I have two charges: I shake the first
charge, which I think of as a source and this makes the second one shake, but 
the second one shaking produces an effect back on the source. And so, I cal- 
culated how much that effect back on the first charge was, hoping it might 
add up the force of radiation resistance. It didn’t come out right, of course, 
but I went to Professor Wheeler and told him my ideas. He said, yes, but
the answer you get for the problem with the two charges that you just men- 
tioned will, unfortunately, depend upon the charge and the mass of the second 
charge and will vary inversely as the square of the distance R, between the 
charges, while the force of radiation resistance depends on none of these things.
I thought, surely, he had computed it himself, but now having become a professor,
I know that one can be wise enough to see immediately what some
graduate student takes several weeks to develop. He also pointed out some- 
thing that also bothered me, that if we had a situation with many charges all 
around the original source at roughly uniform density and if we added the 
effect of all the surrounding charges the inverse R square would be compensated
by the $R^2$ in the volume element and we would get a result proportional
to the thickness of the layer, which would go to infinity. That is, one would 
have an infinite total effect back at the source. And, finally he said to me, and 
you forgot something else, when you accelerate the first charge, the second 
acts later, and then the reaction back here at the source would be still later. In 
other words, the action occurs at the wrong time. I suddenly realized what a 
stupid fellow I am, for what I had described and calculated was just ordinary 
reflected light, not radiation reaction. 
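
The counting argument about the surrounding charges can be sketched in one line. The symbols here are illustrative assumptions and do not appear in the lecture: $N$ is a uniform number density of charges, $k/R^2$ is the effect of one charge at distance $R$, and the layer runs from $R_1$ to $R_2$.

$$\text{total effect} \propto \int_{R_1}^{R_2} \frac{k}{R^2}\, N\, 4\pi R^2\, dR = 4\pi N k\, (R_2 - R_1)$$

The $1/R^2$ of each contribution cancels against the $R^2$ of the shell volume, so the sum grows linearly with the thickness $R_2 - R_1$ and diverges for an unbounded layer.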


But, as I was stupid, so was Professor Wheeler that much more clever. For 
he then went on to give a lecture as though he had worked this all out before 
and was completely prepared, but he had not, he worked it out as he went 
along. First, he said, let us suppose that the return action by the charges in the 
absorber reaches the source by advanced waves as well as by the ordinary retarded
waves of reflected light; so that the law of interaction acts backward in
time, as well as forward in time. I was enough of a physicist at that time not to 
say, « Oh, no, how could that be? » For today all physicists know from study- 
ing Einstein and Bohr, that sometimes an idea which looks completely para- 
doxical at first, if analyzed to completion in all detail and in experimental 
situations, may, in fact, not be paradoxical. So, it did not bother me any more 
than it bothered Professor Wheeler to use advanced waves for the back reaction,
a solution of Maxwell’s equations which previously had not been physically
used. 


Professor Wheeler used advanced waves to get the reaction back at the right 
time and then he suggested this: If there were lots of electrons in the absorber,
there would be an index of refraction n, so, the retarded waves coming from 
the source would have their wave lengths slightly modified in going through 
the absorber. Now, if we shall assume that the advanced waves come back 
from the absorber without an index (why? I don’t know, let’s assume they
come back without an index), then there will be a gradual shifting in phase
between the return and the original signal so that we would only have to 
figure that the contributions act as if they come from only a finite thickness, 
that of the first wave zone. (More specifically, up to that depth where the 
phase in the medium is shifted appreciably from what it would be in vacuum, 
a thickness proportional to $\lambda/(n-1)$.) Now, the less the number of electrons
in here, the less each contributes, but the thicker will be the layer that effec- 
tively contributes because with less electrons, the index differs less from 1. The 
higher the charges of these electrons, the more each contributes, but the thinner
the effective layer, because the index would be higher. And when we estimat- 
ed it, (calculated without being careful to keep the correct numerical factor) 
sure enough, it came out that the action back at the source was completely 
independent of the properties of the charges that were in the surrounding ab- 
sorber. Further, it was of just the right character to represent radiation resis- 
tance, but we were unable to see if it was just exactly the right size. He sent 
me home with orders to figure out exactly how much advanced and how 
much retarded wave we need to get the thing to come out numerically right, 
and after that, figure out what happens to the advanced effects that you would 
expect if you put a test charge here close to the source? For if all charges gen- 
erate advanced, as well as retarded effects, why would that test not be affected 
by the advanced waves from the source? 

I found that you get the right answer if you use half-advanced and half- 
retarded as the field generated by each charge. That is, one is to use the solution 
of Maxwell’s equation which is symmetrical in time and that the reason we 
got no advanced effects at a point close to the source in spite of the fact that 
the source was producing an advanced field is this. Suppose the source is surrounded
by a spherical absorbing wall ten light seconds away, and that the
test charge is one second to the right of the source. Then the source is as much 
as eleven seconds away from some parts of the wall and only nine seconds
away from other parts. The source acting at time $t = 0$ induces motions in the
wall at time $+10$. Advanced effects from this can act on the test charge as
early as eleven seconds earlier, or at $t = -1$. This is just at the time that the
direct advanced waves from the source should reach the test charge, and it
turns out the two effects are exactly equal and opposite and cancel out! At
the later time $+1$ effects on the test charge from the source and from the walls
are again equal, but this time are of the same sign and add to convert the half- 
retarded wave of the source to full retarded strength. 
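
The timing in this example can be checked with a few lines of arithmetic. The sketch below only reproduces the geometry stated in the text (wall at ten light-seconds, test charge one light-second from the source, units with c = 1); it says nothing about the amplitudes or the cancellation itself. The function and constant names are hypothetical, chosen for illustration.

```python
import math

WALL_RADIUS = 10.0  # light-seconds, as in the text
TEST_OFFSET = 1.0   # test charge is one light-second from the source

def advanced_arrival_window(n_angles=181):
    """Earliest and latest times at which advanced effects from the wall
    can act on the test charge, for a source acting at t = 0."""
    wall_excited_at = WALL_RADIUS  # retarded wave from the source reaches the wall
    times = []
    for k in range(n_angles):
        # Angle between the wall point and the test charge, seen from the source.
        theta = math.pi * k / (n_angles - 1)
        # Distance from the wall point to the test charge (law of cosines).
        d = math.sqrt(WALL_RADIUS**2 + TEST_OFFSET**2
                      - 2.0 * WALL_RADIUS * TEST_OFFSET * math.cos(theta))
        # An advanced effect acts earlier than its cause by the travel time d.
        times.append(wall_excited_at - d)
    return min(times), max(times)

if __name__ == "__main__":
    print(advanced_arrival_window())
```

The window runs from about $-1$ to $+1$: the far side of the wall (eleven light-seconds from the test charge) accounts for the earliest time, which coincides with the arrival of the direct advanced wave from the source, and the near side (nine light-seconds) accounts for the latest, which coincides with the direct retarded wave.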

Thus, it became clear that there was the possibility that if we assume all 
actions are via half-advanced and half-retarded solutions of Maxwell’s equations
and assume that all sources are surrounded by material absorbing all
the light which is emitted, then we could account for radiation resistance as
a direct action of the charges of the absorber acting back by advanced waves 
on the source. 

Many months were devoted to checking all these points. I worked to show 
that everything is independent of the shape of the container, and so on, that 
the laws are exactly right, and that the advanced effects really cancel in every 
case. We always tried to increase the efficiency of our demonstrations, and to 
see with more and more clarity why it works. I won’t bore you by going 
through the details of this. Because of our using advanced waves, we also had 
many apparent paradoxes, which we gradually reduced one by one, and saw 
that there was in fact no logical difficulty with the theory. It was perfectly satis- 
factory. 

We also found that we could reformulate this thing in another way, and 
that is by a principle of least action. Since my original plan was to describe 
everything directly in terms of particle motions, it was my desire to represent 
this new theory without saying anything about fields. It turned out that we 
found a form for an action directly involving the motions of the charges only, 
which upon variation would give the equations of motion of these charges. 
The expression for this action A is 


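$$A = \sum_i m_i \int \left(\dot{X}^i_\mu \dot{X}^i_\mu\right)^{1/2} d\alpha_i + \frac{1}{2} \sum_{i \neq j} e_i e_j \iint \delta\left(I_{ij}^2\right) \dot{X}^i_\mu(\alpha_i)\, \dot{X}^j_\mu(\alpha_j)\, d\alpha_i\, d\alpha_j \qquad (1)$$

where

$$I_{ij}^2 = \left[X^i_\mu(\alpha_i) - X^j_\mu(\alpha_j)\right]\left[X^i_\mu(\alpha_i) - X^j_\mu(\alpha_j)\right]$$

where $X^i_\mu(\alpha_i)$ is the four-vector position of the $i$th particle as a function of
some parameter $\alpha_i$, and $\dot{X}^i_\mu(\alpha_i)$ is $dX^i_\mu(\alpha_i)/d\alpha_i$. The first term is the integral of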
proper time, the ordinary action of relativistic mechanics of free particles of 
mass $m_i$. (We sum in the usual way on the repeated index $\mu$.) The second term
represents the electrical interaction of the charges. It is summed over each pair
of charges (the factor 1/2 is to count each pair once, the term $i = j$ is omitted
to avoid self-action). The interaction is a double integral over a delta function
of the square of the space-time interval $I^2$ between two points on the paths. Thus,
interaction occurs only when this interval vanishes, that is, along light cones. 

The fact that the interaction is exactly one-half advanced and half-retarded
meant that we could write such a principle of least action, whereas interaction 
via retarded waves alone cannot be written in such a way. 
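
One way to see the half-advanced, half-retarded character directly is the standard decomposition of a delta function of the squared interval. This is a sketch, assuming units with $c = 1$ and writing $T$ for the time difference and $R > 0$ for the spatial separation of the two points, so that the squared interval is $T^2 - R^2$:

$$\delta\left(T^2 - R^2\right) = \frac{1}{2R}\left[\delta(T - R) + \delta(T + R)\right]$$

The first term picks out the retarded light cone ($T = R$) and the second the advanced one ($T = -R$), with equal weight. A purely retarded interaction would keep only one of the two terms and so could not come from a function of the interval alone.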

So, all of classical electrodynamics was contained in this very simple form. 
It looked good, and therefore, it was undoubtedly true, at least to the beginner. 
It automatically gave half-advanced and half-retarded effects and it was without
fields. By omitting the term in the sum when $i = j$, I omit self-interaction
and no longer have any infinite self-energy. This then was the hoped-for 
solution to the problem of ridding classical electrodynamics of the infinities. 

It turns out, of course, that you can reinstate fields if you wish to, but you 
have to keep track of the field produced by each particle separately. This is 
because to find the right field to act on a given particle, you must exclude the 
field that it creates itself. A single universal field to which all contribute will 
not do. This idea had been suggested earlier by Frenkel and so we called these 
Frenkel fields. This theory which allowed only particles to act on each other 
was equivalent to Frenkel’s fields using half-advanced and half-retarded solutions.

There were several suggestions for interesting modifications of electro- 
dynamics. We discussed lots of them, but I shall report on only one. It was to 
replace this delta function in the interaction by another function, say, $f(I_{ij}^2)$,
which is not infinitely sharp. Instead of having the action occur only when the
interval between the two charges is exactly zero, we would replace the delta
function of $I^2$ by a narrow peaked thing. Let’s say that $f(Z)$ is large only near
$Z = 0$, with width of order $a^2$. Interactions will now occur when $T^2 - R^2$ is of order
$a^2$ roughly, where $T$ is the time difference and $R$ is the separation of the charges.
This might look like it disagrees with experience, but if $a$ is some small distance,
like $10^{-13}$ cm, it says that the time delay $T$ in action is roughly $\sqrt{R^2 + a^2}$
or approximately, if $R$ is much larger than $a$, $T = R + a^2/2R$. This means
that the deviation of time $T$ from the ideal theoretical time $R$ of Maxwell gets
smaller and smaller, the further the pieces are apart. Therefore, all theories
involved in analyzing generators, motors, etc., in fact, all of the tests of
electrodynamics that were available in Maxwell’s time, would be adequately
satisfied if $a$ were $10^{-13}$ cm. If $R$ is of the order of a centimeter this deviation
in $T$ is only $10^{-26}$ parts. So, it was possible, also, to change the theory in a
simple manner and to still agree with all observations of classical electrodynamics.
You have no clue of precisely what function to put in for $f$ but it was
an interesting possibility to keep in mind when developing quantum electro- 
dynamics. 
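
The size of this deviation is easy to check numerically. The sketch below assumes the delay $T = \sqrt{R^2 + a^2}$ quoted above, in units with c = 1 so that times and lengths share units; the function name is hypothetical. Computing $\sqrt{R^2 + a^2} - R$ directly would lose all precision in floating point when $a$ is much smaller than $R$, so the algebraically equivalent form $a^2 / (\sqrt{R^2 + a^2} + R)$ is used instead.

```python
import math

def fractional_delay_deviation(R_cm, a_cm):
    """Return (T - R) / R for T = sqrt(R^2 + a^2), evaluated stably.

    Uses T - R = a^2 / (sqrt(R^2 + a^2) + R) to avoid cancellation.
    """
    excess = a_cm**2 / (math.sqrt(R_cm**2 + a_cm**2) + R_cm)
    return excess / R_cm

def approximate_deviation(R_cm, a_cm):
    """Leading-order estimate a^2 / (2 R^2), valid when a << R."""
    return a_cm**2 / (2.0 * R_cm**2)

if __name__ == "__main__":
    # Values taken from the text: a of order 1e-13 cm, R of order 1 cm.
    print(fractional_delay_deviation(1.0, 1e-13))
    print(approximate_deviation(1.0, 1e-13))
```

For $a = 10^{-13}$ cm and $R = 1$ cm both functions give about $5 \times 10^{-27}$, which is the order $10^{-26}$ quoted in the text.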

It also occurred to us that if we did that (replace $\delta$ by $f$) we could not reinstate
the term $i = j$ in the sum because this would now represent in a relativistically
invariant fashion a finite action of a charge on itself. In fact, it was possible
to prove that if we did do such a thing, the main effect of the self-action
(for not too rapid accelerations) would be to produce a modification of the 
mass. In fact, there need be no mass $m_i$ term, all the mechanical mass could
be electromagnetic self-action. So, if you would like, we could also have another
theory with a still simpler expression for the action $A$. In expression (1)
only the second term is kept, the sum extended over all $i$ and $j$, and some function
$f$ replaces $\delta$. Such a simple form could represent all of classical electrodynamics,
which aside from gravitation is essentially all of classical physics.

Although it may sound confusing, I am describing several different alterna- 
tive theories at once. The important thing to note is that at this time we had 
all these in mind as different possibilities. There were several possible solu- 
tions of the difficulty of classical electrodynamics, any one of which might 
serve as a good starting point to the solution of the difficulties of quantum 
electrodynamics. 

I would also like to emphasize that by this time I was becoming used to a 
physical point of view different from the more customary point of view. In 
the customary view, things are discussed as a function of time in very great 
detail. For example, you have the field at this moment, a differential equation 
gives you the field at the next moment and so on; a method, which I shall call 
the Hamilton method, the time differential method. We have, instead (in (1),
say) a thing that describes the character of the path throughout all of space
and time. The behavior of nature is determined by saying her whole space-time
path has a certain character. For an action like (1) the equations obtained
by variation (of $X^i_\mu(\alpha_i)$) are no longer at all easy to get back into Hamiltonian
form. If you wish to use as variables only the coordinates of particles, then
you can talk about the property of the paths, but the path of one particle at a
given time is affected by the path of another at a different time. If you try to
describe, therefore, things differentially, telling what the present conditions
of the particles are, and how these present conditions will affect the future,
you see, it is impossible with particles alone, because something the particle
did in the past is going to affect the future. 

Therefore, you need a lot of bookkeeping variables to keep track of what 
the particle did in the past. These are called field variables. You will, also, 
have to tell what the field is at this present moment, if you are to be able to see 
later what is going to happen. From the overall space-time view of the least
action principle, the field disappears as nothing but bookkeeping variables in- 
sisted on by the Hamiltonian method. 

As a by-product of this same view, I received a telephone call one day at 
the graduate college at Princeton from Professor Wheeler, in which he said, 
« Feynman, I know why all electrons have the same charge and the same mass » 
« Why? » « Because, they are all the same electron! » And, then he explained 
on the telephone, « suppose that the world lines which we were ordinarily 
considering before in time and space, instead of only going up in time, were a
tremendous knot, and then, when we cut through the knot, by the plane 
corresponding to a fixed time, we would see many, many world lines and 
that would represent many electrons, except for one thing. If in one section 
this is an ordinary electron world line, in the section in which it reversed itself 
and is coming back from the future we have the wrong sign to the proper 
time - to the proper four velocities - and that’s equivalent to changing the 
sign of the charge, and, therefore, that part of a path would act like a positron. » 
« But, Professor », I said, « there aren’t as many positrons as electrons. » « Well, 
maybe they are hidden in the protons or something », he said. I did not take 
the idea that all the electrons were the same one from him as seriously as I 
took the observation that positrons could simply be represented as electrons 
going from the future to the past in a back section of their world lines. That, I 
stole!

To summarize, when I was done with this, as a physicist I had gained two 
things. One, I knew many different ways of formulating classical electro- 
dynamics, with many different mathematical forms. I got to know how to 
express the subject every which way. Second, I had a point of view, the overall
space-time point of view, and a disrespect for the Hamiltonian method
of describing physics. 

I would like to interrupt here to make a remark. The fact that electrodynamics
can be written in so many ways, the differential equations of Maxwell,
various minimum principles with fields, minimum principles without fields,
all different kinds of ways, was something I knew, but I have never understood.
It always seems odd to me that the fundamental laws of physics, when discovered,
can appear in so many different forms that are not apparently identical
at first, but, with a little mathematical fiddling you can show the relationship.
An example of that is the Schrödinger equation and the Heisenberg
formulation of quantum mechanics. I don’t know why this is; it remains a
mystery, but it was something I learned from experience. There is always an- 
other way to say the same thing that doesn’t look at all like the way you said it 
before. I don’t know what the reason for this is. I think it is somehow a repre- 
sentation of the simplicity of nature. A thing like the inverse square law is just 
right to be represented by the solution of Poisson’s equation, which, there- 
fore, is a very different way to say the same thing that doesn’t look at all like 
the way you said it before. I don’t know what it means, that nature chooses 
these curious forms, but maybe that is a way of defining simplicity. Perhaps a 
thing is simple if you can describe it fully in several different ways without im- 
mediately knowing that you are describing the same thing. 

I was now convinced that since we had solved the problem of classical
electrodynamics (and completely in accordance with my program from M.I.T.,
only direct interaction between particles, in a way that made fields unnecessary)
that everything was definitely going to be all right. I was convinced
that all I had to do was make a quantum theory analogous to the classical one 
and everything would be solved. 

So, the problem is only to make a quantum theory, which has as its classical 
analog, this expression (1). Now, there is no unique way to make a quantum
theory from classical mechanics, although all the textbooks make believe there 
is. What they would tell you to do, was find the momentum variables and replace
them by $(\hbar/i)(\partial/\partial x)$, but I couldn’t find a momentum variable, as there
wasn’t any.

The character of quantum mechanics of the day was to write things in the 
famous Hamiltonian way - in the form of a differential equation, which de- 
scribed how the wave function changes from instant to instant, and in terms of 
an operator, $H$. If the classical physics could be reduced to a Hamiltonian
form, everything was all right. Now, least action does not imply a Hamilto- 
nian form if the action is a function of anything more than positions and veloc- 
ities at the same moment. If the action is of the form of the integral of a func- 
tion, (usually called the Lagrangian) of the velocities and positions at the same 
time 


$$S = \int L(\dot{x}, x)\, dt \qquad (2)$$

then you can start with the Lagrangian and then create a Hamiltonian and
work out the quantum mechanics, more or less uniquely. But this thing (1)
involves the key variables, positions, at two different times and therefore, it 
was not obvious what to do to make the quantum-mechanical analogue. 
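
For contrast, the textbook route that does work for an action of the form (2) can be sketched for the simple one-dimensional Lagrangian $L = \frac{1}{2} M \dot{x}^2 - V(x)$. This is the standard Legendre transform, written out here as an illustration rather than quoted from the lecture:

$$p = \frac{\partial L}{\partial \dot{x}} = M\dot{x}, \qquad H = p\dot{x} - L = \frac{p^2}{2M} + V(x), \qquad p \to \frac{\hbar}{i}\frac{\partial}{\partial x}$$

Every step uses positions and velocities at a single time. An action that couples positions at two different times has no single-time $L$ to differentiate, so the first step, defining $p$, already has nothing to act on.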

I tried; I would struggle in various ways. One of them was this; if I had
harmonic oscillators interacting with a delay in time, I could work out what 
the normal modes were and guess that the quantum theory of the normal 
modes was the same as for simple oscillators and kind of work my way back 
in terms of the original variables. I succeeded in doing that, but I hoped then 
to generalize to other than a harmonic oscillator, but I learned to my regret 
something, which many people have learned. The harmonic oscillator is too 
simple; very often you can work out what it should do in quantum theory 
without getting much of a clue as to how to generalize your results to other 
systems. 

So that didn’t help me very much, but when I was struggling with this 
problem, I went to a beer party in the Nassau Tavern in Princeton. There was 
a gentleman, newly arrived from Europe (Herbert Jehle) who came and sat 
next to me. Europeans are much more serious than we are in America because 
they think that a good place to discuss intellectual matters is a beer party. So, 
he sat by me and asked, « what are you doing » and so on, and I said, « I’m
drinking beer. » Then I realized that he wanted to know what work I was 
doing and I told him I was struggling with this problem, and I simply turned 
to him and said, « listen, do you know any way of doing quantum mechanics,
starting with action, where the action integral comes into the quantum mechanics? »
« No », he said, « but Dirac has a paper in which the Lagrangian, at
least, comes into quantum mechanics. I will show it to you tomorrow. » 

Next day we went to the Princeton Library, they have little rooms on the 
side to discuss things, and he showed me this paper. What Dirac said was the 
following: There is in quantum mechanics a very important quantity which
carries the wave function from one time to another, besides the differential
equation but equivalent to it, a kind of a kernel, which we might call $K(x', x)$,
which carries the wave function $\psi(x)$ known at time $t$, to the wave function
$\psi(x')$ at time $t + \epsilon$. Dirac points out that this function $K$ was analogous to the
quantity in classical mechanics that you would calculate if you took the exponential
of $i\epsilon$, multiplied by the Lagrangian $L(\dot{x}, x)$, imagining that these
two positions $x, x'$ corresponded to $t$ and $t + \epsilon$. In other words,

$$K(x', x) \text{ is analogous to } e^{i\epsilon L\left(\frac{x' - x}{\epsilon},\, x\right)/\hbar}$$

Professor Jehle showed me this, I read it, he explained it to me, and I said,
« what does he mean, they are analogous; what does that mean, analogous? 
What is the use of that? » He said, « you Americans! You always want to find
a use for everything! » I said, that I thought that Dirac must mean that they 
were equal. « No », he explained, « he doesn’t mean they are equal. » « Well », 
I said, « let’s see what happens if we make them equal. » 

So I simply put them equal, taking the simplest example where the Lagrangian
is $\frac{1}{2} M \dot{x}^2 - V(x)$, but soon found I had to put a constant of proportionality
$A$ in, suitably adjusted. When I substituted $A e^{i\epsilon L/\hbar}$ for $K$ to get

$$\psi(x', t + \epsilon) = \int A \exp\left[\frac{i\epsilon}{\hbar} L\left(\frac{x' - x}{\epsilon},\, x\right)\right] \psi(x, t)\, dx \qquad (3)$$

and just calculated things out by Taylor series expansion, out came the Schrödinger
equation. So, I turned to Professor Jehle, not really understanding, and
said, « well, you see Professor Dirac meant that they were proportional. » Professor
Jehle’s eyes were bugging out; he had taken out a little notebook and
was rapidly copying it down from the blackboard, and said, « no, no, this is an
important discovery. You Americans are always trying to find out how some- 
thing can be used. That’s a good way to discover things! » So, I thought I was 
finding out what Dirac meant, but, as a matter of fact, had made the discovery 
that what Dirac thought was analogous, was, in fact, equal. I had then, at least, 
the connection between the Lagrangian and quantum mechanics, but still 
with wave functions and infinitesimal times. 
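
The Taylor-series calculation mentioned above can be sketched as follows. This is a reconstruction of the standard argument under stated assumptions, not a transcription of the blackboard: take $L = \frac{1}{2} M \dot{x}^2 - V(x)$, write $x = x' + \eta$ (the symbol $\eta$ is introduced here), and keep terms up to first order in $\epsilon$.

$$\psi(x', t + \epsilon) = A \int \exp\left[\frac{iM\eta^2}{2\hbar\epsilon}\right] \left(1 - \frac{i\epsilon}{\hbar} V(x')\right) \left(\psi + \eta \frac{\partial \psi}{\partial x'} + \frac{\eta^2}{2} \frac{\partial^2 \psi}{\partial x'^2}\right) d\eta$$

Using the Gaussian integrals

$$\int e^{iM\eta^2/2\hbar\epsilon}\, d\eta = \left(\frac{2\pi i \hbar \epsilon}{M}\right)^{1/2}, \qquad \int \eta^2\, e^{iM\eta^2/2\hbar\epsilon}\, d\eta = \left(\frac{2\pi i \hbar \epsilon}{M}\right)^{1/2} \frac{i\hbar\epsilon}{M}$$

(the term linear in $\eta$ integrates to zero), the order-zero terms agree only if

$$A = \left(\frac{M}{2\pi i \hbar \epsilon}\right)^{1/2}$$

which is the suitably adjusted constant, and the terms of first order in $\epsilon$ give

$$i\hbar \frac{\partial \psi}{\partial t} = -\frac{\hbar^2}{2M} \frac{\partial^2 \psi}{\partial x'^2} + V(x')\, \psi$$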

It must have been a day or so later when I was lying in bed thinking about 
these things, that I imagined what would happen if I wanted to calculate the 
wave function at a finite interval later. 

I would put one of these factors $e^{i\epsilon L}$ in here, and that would give me the
wave functions the next moment, $t + \epsilon$, and then I could substitute that back
into (3) to get another factor of $e^{i\epsilon L}$ and give me the wave function the next
moment, $t + 2\epsilon$, and so on and so on. In that way I found myself thinking of a
large number of integrals, one after the other in sequence. In the integrand was
the product of the exponentials, which, of course, was the exponential of the
sum of terms like $\epsilon L$. Now, $L$ is the Lagrangian and $\epsilon$ is like the time interval
$dt$, so that if you took a sum of such terms, that’s exactly like an integral.
That’s like Riemann’s formula for the integral $\int L\, dt$, you just take the value
at each point and add them together. We are to take the limit as $\epsilon \to 0$, of
course. Therefore, the connection between the wave function of one instant
and the wave function of another instant a finite time later could be obtained
by an infinite number of integrals, (because $\epsilon$ goes to zero, of course) of exponential
$(iS/\hbar)$ where $S$ is the action expression (2). At last, I had succeeded
in representing quantum mechanics directly in terms of the action S. 
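
Written out, the repeated substitution described above takes the following form. The indexing is an assumption made here for illustration: $x_0, x_1, \ldots, x_N$ label the positions at times $t, t + \epsilon, \ldots, t + N\epsilon$, and $A$ is the same normalizing constant for each step.

$$\psi(x_N, t + N\epsilon) = \int \cdots \int \prod_{k=0}^{N-1} A \exp\left[\frac{i\epsilon}{\hbar} L\left(\frac{x_{k+1} - x_k}{\epsilon},\, x_k\right)\right] \psi(x_0, t)\, dx_0 \cdots dx_{N-1}$$

The product of exponentials is the exponential of the sum, and the sum is a Riemann sum for the action:

$$\prod_{k=0}^{N-1} \exp\left[\frac{i\epsilon}{\hbar} L_k\right] = \exp\left[\frac{i}{\hbar} \sum_{k=0}^{N-1} \epsilon L_k\right] \to \exp\left[\frac{i}{\hbar} \int L\, dt\right] = e^{iS/\hbar} \quad \text{as } \epsilon \to 0$$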

This led later on to the idea of the amplitude for a path; that for each pos- 
sible way that the particle can go from one point to another in space-time, 
there’s an amplitude. That amplitude is $e$ to the $i/\hbar$ times the action for the
path. Amplitudes from various paths superpose by addition. This then is another,
a third way, of describing quantum mechanics, which looks quite different
than that of Schrödinger or Heisenberg, but which is equivalent to
them. 

Now immediately after making a few checks on this thing, what I wanted 
to do, of course, was to substitute the action (1) for the other (2). The first 
trouble was that I could not get the thing to work with the relativistic case of 
spin one-half. However, although I could deal with the matter only non- 
relativistically, I could deal with the light or the photon interactions perfectly 
well by just putting the interaction terms of (1) into any action, replacing the
mass terms by the non-relativistic $(M\dot{x}^2/2)\,dt$. When the action has a delay, as
it now had, and involved more than one time, I had to lose the idea of a wave 
function. That is, I could no longer describe the program as; given the ampli- 
tude for all positions at a certain time to compute the amplitude at another 
time. However, that didn’t cause very much trouble. It just meant develop- 
ing a new idea. Instead of wave functions we could talk about this; that if a 
source of a certain kind emits a particle, and a detector is there to receive it, 
we can give the amplitude that the source will emit and the detector receive. 
We do this without specifying the exact instant that the source emits or the 
exact instant that any detector receives, without trying to specify the state of 
anything at any particular time in between, but by just finding the amplitude 
for the complete experiment. And, then we could discuss how that amplitude 
would change if you had a scattering sample in between, as you rotated and 
changed angles, and so on, without really having any wave functions. 

It was also possible to discover what the old concepts of energy and momen- 
tum would mean with this generalized action. And, so I believed that I had a 
quantum theory of classical electrodynamics, or rather of this new classical
electrodynamics described by action (1). I made a number of checks. If I took 
the Frenkel field point of view, which you remember was more differential, I 
could convert it directly to quantum mechanics in a more conventional way. 
The only problem was how to specify in quantum mechanics the classical 
boundary conditions to use only half-advanced and half-retarded solutions. 
By some ingenuity in defining what that meant, I found that the quantum
mechanics with Frenkel fields, plus a special boundary condition, gave me 
back this action, (1), in the new form of quantum mechanics with a delay.
So, various things indicated that there wasn’t any doubt I had everything 
straightened out. 

It was also easy to guess how to modify the electrodynamics, if anybody 
ever wanted to modify it. I just changed the delta to an f, just as I would for 
the classical case. So, it was very easy, a simple thing. To describe the old re- 
tarded theory without explicit mention of fields I would have to write prob- 
abilities, not just amplitudes. I would have to square my amplitudes and that 
would involve double path integrals in which there are two S’s and so forth. 
Yet, as I worked out many of these things and studied different forms and different
boundary conditions, I got a kind of funny feeling that things weren’t
exactly right. I could not clearly identify the difficulty and in one of the short 
periods during which I imagined I had laid it to rest, I published a thesis and 
received my Ph.D. 

During the war, I didn’t have time to work on these things very extensively, 
but wandered about on buses and so forth, with little pieces of paper, and 
struggled to work on it and discovered indeed that there was something 
wrong, something terribly wrong. I found that if one generalized the action 
from the nice Lagrangian forms (2) to these forms (1) then the quantities
which I defined as energy, and so on, would be complex. The energy values of 
stationary states wouldn’t be real and probabilities of events wouldn’t add 
up to 100%. That is, if you took the probability that this would happen and
that would happen, everything you could think of would happen, it would
not add up to one. 

Another problem on which I struggled very hard, was to represent rela- 
tivistic electrons with this new quantum mechanics. I wanted to do a unique 
and different way, and not just by copying the operators of Dirac into some
kind of an expression and using some kind of Dirac algebra instead of ordinary 
complex numbers. I was very much encouraged by the fact that in one space 
dimension, I did find a way of giving an amplitude to every path by limiting 
myself to paths, which only went back and forth at the speed of light. The 
amplitude was simply $(i\epsilon)$ to a power equal to the number of velocity reversals
where I have divided the time into steps $\epsilon$ and I am allowed to reverse velocity
only at such a time. This gives (as $\epsilon$ approaches zero) Dirac’s equation in two
dimensions, one dimension of space and one of time ($\hbar = M = c = 1$).

Dirac’s wave function has four components in four dimensions, but in this 
case, it has only two components and this rule for the amplitude of a path 
automatically generates the need for two components. Because if this is the
formula for the amplitudes of path, it will not do you any good to know the
total amplitude of all paths, which come into a given point to find the amplitude
to reach the next point. This is because for the next time, if it came in
from the right, there is no new factor $i\epsilon$ if it goes out to the right, whereas, if it
came in from the left there was a new factor $i\epsilon$. So, to continue this same information
forward to the next moment, it was not sufficient information to know
the total amplitude to arrive, but you had to know the amplitude to arrive
from the right and the amplitude to arrive from the left, independently. If you did,
however, you could then compute both of those again independently and thus 
you had to carry two amplitudes to form a differential equation (first order in 
time). 
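
The need for two components can be illustrated with a small lattice sketch. It assumes only the rule stated above, amplitude $(i\epsilon)$ raised to the number of velocity reversals, with position counted in lattice steps and the direction of the first move fixed; the function names and the dictionary layout are hypothetical choices made for this example. One function sums over every zigzag path explicitly; the other steps forward in time while carrying a separate amplitude for right-moving and left-moving arrival at each site. The two agree, which is the point: the pair of amplitudes is enough to take the next step, while their sum alone is not.

```python
from itertools import product

def brute_force(n_steps, eps, first_move=+1):
    """Sum (i*eps)**reversals over every zigzag path of n_steps moves.

    Returns {(final position, direction of last move): total amplitude}.
    """
    totals = {}
    for rest in product((+1, -1), repeat=n_steps - 1):
        moves = (first_move,) + rest
        # A reversal is any pair of consecutive moves in opposite directions.
        reversals = sum(1 for a, b in zip(moves, moves[1:]) if a != b)
        key = (sum(moves), moves[-1])
        totals[key] = totals.get(key, 0j) + (1j * eps) ** reversals
    return totals

def two_component(n_steps, eps, first_move=+1):
    """Same totals, built one time step at a time from two amplitudes per site."""
    state = {(first_move, first_move): 1 + 0j}  # situation after the first move
    for _ in range(n_steps - 1):
        new_state = {}
        for (x, direction), amp in state.items():
            for next_direction in (+1, -1):
                # Continuing straight costs nothing; a reversal costs i*eps.
                factor = 1 if next_direction == direction else 1j * eps
                key = (x + next_direction, next_direction)
                new_state[key] = new_state.get(key, 0j) + amp * factor
        state = new_state
    return state

if __name__ == "__main__":
    a = brute_force(8, 0.05)
    b = two_component(8, 0.05)
    print(max(abs(a[k] - b[k]) for k in a))
```

The step-by-step version needs the direction of arrival as part of its state because the factor applied on the next move depends on it; dropping the direction and keeping only the total amplitude per site would make the update rule ambiguous.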

And, so I dreamed that if I were clever, I would find a formula for the am- 
plitude of a path that was beautiful and simple for three dimensions of space 
and one of time, which would be equivalent to the Dirac equation, and for 
which the four components, matrices, and all those other mathematical funny 
things would come out as a simple consequence. I have never succeeded in
that either. But, I did want to mention some of the unsuccessful things on 
which I spent almost as much effort, as on the things that did work. 

To summarize the situation a few years after the war, I would say, I had
much experience with quantum electrodynamics, at least in the knowledge 
of many different ways of formulating it, in terms of path integrals of actions 
and in other forms. One of the important by-products, for example, of much 
experience in these simple forms, was that it was easy to see how to combine 
together what was in those days called the longitudinal and transverse fields, 
and in general, to see clearly the relativistic invariance of the theory. Because 
of the need to do things differentially there had been, in the standard quantum 
electrodynamics, a complete split of the field into two parts, one of which 
is called the longitudinal part and the other mediated by the photons, or 
transverse waves. The longitudinal part was described by a Coulomb potential 
acting instantaneously in the Schrödinger equation, while the transverse part
had an entirely different description in terms of quantization of the transverse
waves. This separation depended upon the relativistic tilt of your axes in space- 
time. People moving at different velocities would separate the same field into 
longitudinal and transverse fields in a different way. Furthermore, the entire 
formulation of quantum mechanics insisting, as it did, on the wave function at
a given time, was hard to analyze relativistically. Somebody else in a different 
coordinate system would calculate the succession of events in terms of wave 
functions on differently cut slices of space-time, and with a different
separation of longitudinal and transverse parts. The Hamiltonian theory did not
look relativistically invariant, although, of course, it was. One of the great 
advantages of the overall point of view, was that you could see the relativistic
invariance right away, or as Schwinger would say, the covariance was
manifest. I had the advantage, therefore, of having a manifestly covariant form
for quantum electrodynamics with suggestions for modifications and so on. I
had the disadvantage that if I took it too seriously (I mean, if I took it
seriously at all in this form) I got into trouble with these complex energies
and the failure of adding probabilities to one and so on. I was unsuccessfully
struggling with that.

Then Lamb did his experiment, measuring the separation of the 2S½ and
2P½ levels of hydrogen, finding it to be about 1000 megacycles of frequency
difference. Professor Bethe, with whom I was then associated at Cornell, is a
man who has this characteristic: If there’s a good experimental number you’ve
got to figure it out from theory. So, he forced the quantum electrodynamics
of the day to give him an answer to the separation of these two levels. He 
pointed out that the self-energy of an electron itself is infinite, so that the 
calculated energy of a bound electron should also come out infinite. But, when 
you calculated the separation of the two energy levels in terms of the corrected 
mass instead of the old mass, it would turn out, he thought, that the theory 
would give convergent finite answers. He made an estimate of the splitting 
that way and found out that it was still divergent, but he guessed that was 
probably due to the fact that he used an unrelativistic theory of the matter. 
Assuming it would be convergent if relativistically treated, he estimated he 
would get about a thousand megacycles for the Lamb-shift, and thus, made 
the most important discovery in the history of the theory of quantum electro- 
dynamics. He worked this out on the train from Ithaca, New York to Schen- 
ectady and telephoned me excitedly from Schenectady to tell me the result, 
which I don’t remember fully appreciating at the time. 

Returning to Cornell, he gave a lecture on the subject, which I attended. 
He explained that it gets very confusing to figure out exactly which infinite 
term corresponds to what in trying to make the correction for the infinite 
change in mass. If there were any modifications whatever, he said, even
though not physically correct, (that is not necessarily the way nature actually
works) but any modification whatever at high frequencies, which would
make this correction finite, then there would be no problem at all to figuring
out how to keep track of everything. You just calculate the finite mass
correction Δm to the electron mass m₀, substitute the numerical values of
m₀ + Δm for m in the results for any other problem and all these ambiguities
would be resolved. If, in addition, this method were relativistically invariant,
then we would be absolutely sure how to do it without destroying relativistic
invariance.
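
Bethe's bookkeeping can be written compactly. The sketch below assumes the symbols m₀ for the mass appearing in the original equations, Δm(a) for the finite correction obtained with a high-frequency modification of width a, and E(m) as a placeholder for any other calculated quantity; none of these labels is fixed by the lecture itself.

```latex
% Mass renormalization as bookkeeping (notation assumed for illustration):
%   m_0          mass parameter in the unmodified equations
%   \Delta m(a)  finite correction once the theory is modified at high frequency
%   E(m)         any other calculated result, written as a function of the mass
\begin{align}
  m &= m_0 + \Delta m(a), \\
  E_{\text{result}} &= E(m)\Big|_{m \,=\, m_0 + \Delta m(a)} .
\end{align}
% The numerical value of m_0 + \Delta m(a) is identified with the experimental mass,
% so E_result is expressed without separate reference to m_0 or \Delta m(a).
```

The point of the substitution is that only the combination m₀ + Δm ever appears in a prediction, so the ambiguity about which infinite term belongs to the mass disappears once Δm is finite.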

After the lecture, I went up to him and told him, « I can do that for you, I'll 
bring it in for you tomorrow. » I guess I knew every way to modify quantum 
electrodynamics known to man, at the time. So, I went in next day, and ex- 
plained what would correspond to the modification of the delta-function to f 
and asked him to explain to me how you calculate the self-energy of an elec- 
tron, for instance, so we can figure out if it’s finite. 

I want you to see an interesting point. I did not take the advice of Professor 
Jehle to find out how it was useful. I never used all that machinery which I 
had cooked up to solve a single relativistic problem. I hadn’t even calculated 
the self-energy of an electron up to that moment, and was studying the dif- 
ficulties with the conservation of probability, and so on, without actually 
doing anything, except discussing the general properties of the theory. 

But now I went to Professor Bethe, who explained to me on the blackboard, 
as we worked together, how to calculate the self-energy of an electron. Up to 
that time when you did the integrals they had been logarithmically divergent. 
I told him how to make the relativistically invariant modifications that I
thought would make everything all right. We set up the integral which then 
diverged at the sixth power of the frequency instead of logarithmically! 

So, I went back to my room and worried about this thing and went around 
in circles trying to figure out what was wrong because I was sure physically 
everything had to come out finite, I couldn’t understand how it came out 
infinite. I became more and more interested and finally realized I had to learn 
how to make a calculation. So, ultimately, I taught myself how to calculate 
the self-energy of an electron working my patient way through the terrible 
confusion of those days of negative energy states and holes and longitudinal 
contributions and so on. When I finally found out how to do it and did it with 
the modifications I wanted to suggest, it turned out that it was nicely conver- 
gent and finite, just as I had expected. Professor Bethe and I have never been 
able to discover what we did wrong on that blackboard two months before, 
but apparently we just went off somewhere and we have never been able to 
figure out where. It turned out, that what I had proposed, if we had carried it 
out without making a mistake would have been all right and would have 
given a finite correction. Anyway, it forced me to go back over all this and to 
convince myself physically that nothing can go wrong. At any rate, the
correction to mass was now finite, proportional to ln(ma/ħ) where a is the width
of that function f which was substituted for δ. If you wanted an unmodified
electrodynamics, you would have to take a equal to zero, getting an infinite
mass correction. But, that wasn’t the point. Keeping a finite, I simply followed
the program outlined by Professor Bethe and showed how to calculate all the
various things, the scatterings of electrons from atoms without radiation, the
shifts of levels and so forth, calculating everything in terms of the
experimental mass, and noting that the results, as Bethe suggested, were not
sensitive to a in this form and even had a definite limit as a → 0.

The rest of my work was simply to improve the techniques then available
for calculations, making diagrams to help analyze perturbation theory
quicker. Most of this was first worked out by guessing; you see, I didn’t have
the relativistic theory of matter. For example, it seemed to me obvious that
the velocities in non-relativistic formulas have to be replaced by Dirac’s
matrix α or in the more relativistic forms by the operators γμ. I just took my
guesses from the forms that I had worked out using path integrals for
non-relativistic matter, but relativistic light. It was easy to develop rules of what
to substitute to get the relativistic case. I was very surprised to discover that 
it was not known at that time, that every one of the formulas that had been 
worked out so patiently by separating longitudinal and transverse waves could 
be obtained from the formula for the transverse waves alone, if instead of 
summing over only the two perpendicular polarization directions you would 
sum over all four possible directions of polarization. It was so obvious from 
the action (1) that I thought it was general knowledge and would do it all the
time. I would get into arguments with people, because I didn’t realize they 
didn’t know that; but, it turned out that all their patient work with the longi- 
tudinal waves was always equivalent to just extending the sum on the two 
transverse directions of polarization over all four directions. This was one of 
the amusing advantages of the method. In addition, I included diagrams for 
the various terms of the perturbation series, improved notations to be used, 
worked out easy ways to evaluate integrals, which occurred in these problems, 
and so on, and made a kind of handbook on how to do quantum electrody- 
namics. 
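
The equivalence between "longitudinal plus transverse" and a sum over four polarization directions can be checked in a few lines. The following is a sketch under stated assumptions, not the lecture's own derivation: it works in momentum space with units where c = 1, drops overall constants, takes two conserved currents j = (ρ, **j**) and j′ = (ρ′, **j**′) exchanging four-momentum k = (ω, **k**), and counts the time direction with a sign opposite to the three space directions.

```latex
% Assumptions (not from the lecture): conserved currents, \omega\rho = \mathbf{k}\cdot\mathbf{j}
% and likewise for the primed current; c = 1; overall constants dropped.
% Split \mathbf{j} into its part along \hat{\mathbf{k}} (j_L) and the part perpendicular to it (\mathbf{j}_T).
\begin{align}
\frac{\mathbf{j}\cdot\mathbf{j}' - \rho\rho'}{\omega^{2}-|\mathbf{k}|^{2}}
  &= \frac{\mathbf{j}_T\cdot\mathbf{j}'_T}{\omega^{2}-|\mathbf{k}|^{2}}
   + \frac{j_L\, j'_L - \rho\rho'}{\omega^{2}-|\mathbf{k}|^{2}},
   \qquad j_L = \hat{\mathbf{k}}\cdot\mathbf{j} = \frac{\omega\,\rho}{|\mathbf{k}|}, \\
j_L\, j'_L - \rho\rho'
  &= \rho\rho'\,\frac{\omega^{2}-|\mathbf{k}|^{2}}{|\mathbf{k}|^{2}}, \\
\frac{\mathbf{j}\cdot\mathbf{j}' - \rho\rho'}{\omega^{2}-|\mathbf{k}|^{2}}
  &= \underbrace{\frac{\mathbf{j}_T\cdot\mathbf{j}'_T}{\omega^{2}-|\mathbf{k}|^{2}}}_{\text{two transverse polarizations}}
   \;+\; \underbrace{\frac{\rho\rho'}{|\mathbf{k}|^{2}}}_{\text{instantaneous Coulomb term}} .
\end{align}
```

The left-hand side is the four-direction sum; the right-hand side is the conventional split into transverse photon exchange and an instantaneous Coulomb interaction. Current conservation is what makes the two agree.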

But one step of importance that was physically new was involved with the 
negative energy sea of Dirac, which caused me so much logical difficulty. I got 
so confused that I remembered Wheeler’s old idea about the positron being, 
maybe, the electron going backward in time. Therefore, in the time dependent
perturbation theory that was usual for getting self-energy, I simply supposed
that for a while we could go backward in the time, and looked at what
terms I got by running the time variables backward. They were the same as 
the terms that other people got when they did the problem a more complicat- 
ed way, using holes in the sea, except, possibly, for some signs. These, I, at 
first, determined empirically by inventing and trying some rules. 

I have tried to explain that all the improvements of relativistic theory were
at first more or less straightforward, semi-empirical shenanigans. Each time I
would discover something, however, I would go back and I would check it
so many ways, compare it to every problem that had been done previously
in electrodynamics (and later, in weak coupling meson theory) to see if it 
would always agree, and so on, until I was absolutely convinced of the truth 
of the various rules and regulations which I concocted to simplify all the work. 

During this time, people had been developing meson theory, a subject I 
had not studied in any detail. I became interested in the possible application 
of my methods to perturbation calculations in meson theory. But, what was 
meson theory? All I knew was that meson theory was something analogous 
to electrodynamics, except that particles corresponding to the photon had a 
mass. It was easy to guess the δ-function in (1), which was a solution of
d’Alembertian equals zero, was to be changed to the corresponding solution of
d’Alembertian equals m². Next, there were different kinds of mesons: the one in
closest analogy to photons, coupled via γμγμ, are called vector mesons; there
were also scalar mesons. Well, maybe that corresponds to putting unity in
place of the γμ, I would here then speak of « pseudo vector coupling » and I
would guess what that probably was. I didn’t have the knowledge to under- 
stand the way these were defined in the conventional papers because they 
were expressed at that time in terms of creation and annihilation operators, 
and so on, which, I had not successfully learned. I remember that when some- 
one had started to teach me about creation and annihilation operators, that this 
operator creates an electron, I said, « how do you create an electron? It dis- 
agrees with the conservation of charge », and in that way, I blocked my mind 
from learning a very practical scheme of calculation. Therefore, I had to find 
as many opportunities as possible to test whether I guessed right as to what the 
various theories were. 

One day a dispute arose at a Physical Society meeting as to the correctness 
of a calculation by Slotnick of the interaction of an electron with a neutron 
using pseudo scalar theory with pseudo vector coupling and also, pseudo scalar 
theory with pseudo scalar coupling. He had found that the answers were not 
the same, in fact, by one theory, the result was divergent, although convergent
with the other. Some people believed that the two theories must give the same 
answer for the problem. This was a welcome opportunity to test my guesses 
as to whether I really did understand what these two couplings were. So, I 
went home, and during the evening I worked out the electron neutron scat- 
tering for the pseudo scalar and pseudo vector coupling, saw they were not 
equal and subtracted them, and worked out the difference in detail. The 
next day at the meeting, I saw Slotnick and said, « Slotnick, I worked it out 
last night, I wanted to see if I got the same answers you do. I got a different 
answer for each coupling-but, I would like to check in detail with you be- 
cause I want to make sure of my methods. » And, he said, « what do you mean 
you worked it out last night, it took me six months! » And, when we compared
the answers he looked at mine and he asked, « what is that Q in there, that
variable Q? » (I had expressions like (tan⁻¹ Q)/Q etc.). I said, « that’s the
momentum transferred by the electron, the electron deflected by different angles. »
« Oh », he said, « no, I only have the limiting value as Q approaches zero; the 
forward scattering. » Well, it was easy enough to just substitute Q equals zero 
in my form and I then got the same answers as he did. But, it took him six 
months to do the case of zero momentum transfer, whereas, during one eve- 
ning I had done the finite and arbitrary momentum transfer. That was a thrill- 
ing moment for me, like receiving the Nobel Prize, because that convinced 
me, at last, I did have some kind of method and technique and understood 
how to do something that other people did not know how to do. That was my 
moment of triumph in which I realized I really had succeeded in working out 
something worthwhile. 
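
To see why setting Q equal to zero is harmless in an expression of the kind quoted, consider the one factor the lecture names, (tan⁻¹ Q)/Q. The full scattering formulas are not given in the text, so the line below illustrates only that this factor has a finite forward-scattering limit; the series used is the standard expansion of the inverse tangent, valid for |Q| ≤ 1.

```latex
% Small-Q behaviour of the factor quoted in the lecture (illustration only):
\frac{\tan^{-1} Q}{Q}
  = 1 - \frac{Q^{2}}{3} + \frac{Q^{4}}{5} - \cdots
  \;\longrightarrow\; 1 \qquad (Q \to 0).
```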

At this stage, I was urged to publish this because everybody said it looks like 
an easy way to make calculations, and wanted to know how to do it. I had to 
publish it, missing two things; one was proof of every statement in a
mathematically conventional sense. Often, even in a physicist’s sense, I did not have a
demonstration of how to get all of these rules and equations from conventional
electrodynamics. But, I did know from experience, from fooling around,
that everything was, in fact, equivalent to the regular electrodynamics and 
had partial proofs of many pieces, although, I never really sat down, like 
Euclid did for the geometers of Greece, and made sure that you could get it 
all from a single simple set of axioms. As a result, the work was criticized, I 
don’t know whether favorably or unfavorably, and the « method » was called
the « intuitive method ». For those who do not realize it, however, I should
like to emphasize that there is a lot of work involved in using this « intuitive
method » successfully. Because no simple clear proof of the formula or idea
presents itself, it is necessary to do an unusually great amount of checking and 
rechecking for consistency and correctness in terms of what is known, by com- 
paring to other analogous examples, limiting cases, etc. In the face of the lack 
of direct mathematical demonstration, one must be careful and thorough to 
make sure of the point, and one should make a perpetual attempt to demon- 
strate as much of the formula as possible. Nevertheless, a very great deal more 
truth can become known than can be proven. 

It must be clearly understood that in all this work, I was representing the 
conventional electrodynamics with retarded interaction, and not my
half-advanced and half-retarded theory corresponding to (1). I merely use (1) to
guess at forms. And, one of the forms I guessed at corresponded to changing δ
to a function f of width a², so that I could calculate finite results for all of the
problems. This brings me to the second thing that was missing when I
published the paper, an unresolved difficulty. With δ replaced by f the calculations
would give results which were not « unitary », that is, for which the sum of the 
probabilities of all alternatives was not unity. The deviation from unity was 
very small, in practice, if a was very small. In the limit that I took a very tiny, 
it might not make any difference. And, so the process of the renormalization 
could be made, you could calculate everything in terms of the experimental 
mass and then take the limit and the apparent difficulty that the unitarity is
violated temporarily seems to disappear. I was unable to demonstrate that, as 
a matter of fact, it does. 

It is lucky that I did not wait to straighten out that point, for as far as I know,
nobody has yet been able to resolve this question. Experience with meson 
theories with stronger couplings and with strongly coupled vector photons, 
although not proving anything, convinces me that if the coupling were 
stronger, or if you went to a higher order (137th order of perturbation theory
for electrodynamics), this difficulty would remain in the limit and there 
would be real trouble. That is, I believe there is really no satisfactory quantum 
electrodynamics, but I’m not sure. And, I believe, that one of the reasons for 
the slowness of present-day progress in understanding the strong interactions 
is that there isn’t any relativistic theoretical model, from which you can really 
calculate everything. Although, it is usually said, that the difficulty lies in the 
fact that strong interactions are too hard to calculate, I believe, it is really
because strong interactions in field theory have no solution, have no sense;
they’re either infinite, or, if you try to modify them, the modification destroys
the unitarity. I don’t think we have a completely satisfactory relativistic
quantum-mechanical model, even one that doesn’t agree with nature, but, at least,
agrees with the logic that the sum of probability of all alternatives has to be
100%. Therefore, I think that the renormalization theory is simply a way to
sweep the difficulties of the divergences of electrodynamics under the rug. I
am, of course, not sure of that.

This completes the story of the development of the space-time view of 
quantum electrodynamics. I wonder if anything can be learned from it. I 
doubt it. It is most striking that most of the ideas developed in the course of 
this research were not ultimately used in the final result. For example, the 
half-advanced and half-retarded potential was not finally used, the action 
expression (1) was not used, the idea that charges do not act on themselves 
was abandoned. The path-integral formulation of quantum mechanics was 
useful for guessing at final expressions and at formulating the general theory 
of electrodynamics in new ways, although, strictly it was not absolutely
necessary. The same goes for the idea of the positron being a backward 
moving electron, it was very convenient, but not strictly necessary for the 
theory because it is exactly equivalent to the negative energy sea point of 
view. 

We are struck by the very large number of different physical viewpoints and 
widely different mathematical formulations that are all equivalent to one
another. The method used here, of reasoning in physical terms, therefore, appears
to be extremely inefficient. On looking back over the work, I can only feel a
kind of regret for the enormous amount of physical reasoning and
mathematical re-expression which ends by merely re-expressing what was
previously known, although in a form which is much more efficient for the
calculation of specific problems. Would it not have been much easier to simply
work entirely in the mathematical framework to elaborate a more efficient 
expression? This would certainly seem to be the case, but it must be remarked 
that although the problem actually solved was only such a reformulation, the 
problem originally tackled was the (possibly still unsolved) problem of
avoidance of the infinities of the usual theory. Therefore, a new theory was sought,
not just a modification of the old. Although the quest was unsuccessful, we 
should look at the question of the value of physical ideas in developing a new 
theory. 

Many different physical ideas can describe the same physical reality. Thus, 
classical electrodynamics can be described by a field view, or an action at a 
distance view, etc. Originally, Maxwell filled space with idler wheels, and 
Faraday with field lines, but somehow the Maxwell equations themselves are
pristine and independent of the elaboration of words attempting a physical
description. The only true physical description is that describing the
experimental meaning of the quantities in the equation, or better, the way the
equations are to be used in describing experimental observations. This being
the case perhaps the best way to proceed is to try to guess equations, and dis- 
regard physical models or descriptions. For example, McCullough guessed 
the correct equations for light propagation in a crystal long before his col- 
leagues using elastic models could make head or tail of the phenomena, or 
again, Dirac obtained his equation for the description of the electron by an 
almost purely mathematical proposition. A simple physical view by which all 
the contents of this equation can be seen is still lacking. 

Therefore, I think equation guessing might be the best method to proceed 
to obtain the laws for the part of physics which is presently unknown. Yet, 
when I was much younger, I tried this equation guessing and I have seen 
many students try this, but it is very easy to go off in wildly incorrect and
impossible directions. I think the problem is not to find the best or most efficient
method to proceed to a discovery, but to find any method at all. Physical 
reasoning does help some people to generate suggestions as to how the un- 
known may be related to the known. Theories of the known, which are de- 
scribed by different physical ideas may be equivalent in all their predictions 
and are hence scientifically indistinguishable. However, they are not psycho- 
logically identical when trying to move from that base into the unknown. For 
different views suggest different kinds of modifications which might be made 
and hence are not equivalent in the hypotheses one generates from them in 
one’s attempt to understand what is not yet understood. I, therefore, think
that a good theoretical physicist today might find it useful to have a wide range 
of physical viewpoints and mathematical expressions of the same theory (for 
example, of quantum electrodynamics) available to him. This may be asking 
too much of one man. Then new students should as a class have this. If every 
individual student follows the same current fashion in expressing and think- 
ing about electrodynamics or field theory, then the variety of hypotheses 
being generated to understand strong interactions, say, is limited. Perhaps 
rightly so, for possibly the chance is high that the truth lies in the fashionable 
direction. But, on the off-chance that it is in another direction (a direction
obvious from an unfashionable view of field theory) who will find it? Only
someone who has sacrificed himself by teaching himself quantum
electrodynamics from a peculiar and unusual point of view; one that he may have to
invent for himself. I say sacrificed himself because he most likely will get
nothing from it, because the truth may lie in another direction, perhaps even
the fashionable one.

But, if my own experience is any guide, the sacrifice is really not great be- 
cause if the peculiar viewpoint taken is truly experimentally equivalent to the 
usual in the realm of the known there is always a range of applications and 
problems in this realm for which the special viewpoint gives one a special 
power and clarity of thought, which is valuable in itself. Furthermore, in the 
search for new laws, you always have the psychological excitement of feeling 
that possibly nobody has yet thought of the crazy possibility you are looking
at right now. 

So what happened to the old theory that I fell in love with as a youth? 
Well, I would say it’s become an old lady, that has very little attractive left in 
her and the young today will not have their hearts pound when they look at 
her anymore. But, we can say the best we can for any old woman, that she 
has been a very good mother and she has given birth to some very good chil- 
dren. And, I thank the Swedish Academy of Sciences for complimenting one 
of them. Thank you. 
While still an undergraduate at MIT, Feynman became aware of the so-called divergence
problems of QED. That is, physical quantities which should have been calculable by the 
theory, such as the self-mass of the electron (the effect of the action of the electron’s own 
electromagnetic field on its mass), were predicted to have an absurd result: infinity. Feynman 
knew that a similar result was predicted classically; namely, the energy contained in the 
Coulomb field of a point charge is theoretically infinite. As he said in his Nobel Lecture, his 
“general plan was to first solve the classical problem, to get rid of the infinite self-energies 
and to hope that when I made a quantum theory of it, everything would just be fine.” The 
idea which he embraced (“fell deeply in love with”) was to replace the field itself by “delayed 
action-at-a-distance.” In this view the electron would act only on other charges, not on itself, 
and the field would be only a useful invention for representing that delayed interaction. He 
abandoned this idea after an accurate experimental value obtained for the Lamb shift in 
hydrogen in the late 1940s showed the presence of an effect called “vacuum polarization,” 
which could only be obtained by using the full field concept.¹

Papers [4] and [10] are, respectively, Part 3 and Part 2 of a three-part paper projected 
by John Archibald Wheeler and Richard Phillips Feynman. Part 1, never published (and 
probably never written), was to have been a careful study of the classical limit of the quantum 
theory of radiation. Paper [4] introduces the absorber theory, according to which half of the 
electromagnetic field propagates before the electron emitting it accelerates (advanced) and 
half as it accelerates (retarded). The advanced field is assumed to be absorbed in distant 
matter, where it would reradiate and arrive at the accelerating electron at the right time 
and in the right amount to produce the “radiation reaction” that is needed to reduce the 
radiating electron’s kinetic energy by the amount of energy that it radiates. There are
no observable “advanced effects.” That was the solution to the problem of lack of energy 
conservation that would result (as Wheeler had pointed out to Feynman) if the electron’s 
radiated field did not act back upon the electron. Edward Kerner has remarked that the 
“‘complete absorption’ in the electromagnetic universe is a kind of electrodynamic Mach’s
principle accounting marvelously for the appearance of the Lorentz–Dirac force of radiation
damping, and for the appearance of retarded interactions on the local scene.”²

Paper [10], “Classical Electrodynamics in Terms of Direct Interparticle Interaction,” was 
published in 1949 while Feynman was deeply involved in his work on quantum electrodynam- 
ics. Like [4] it is a scholarly paper, published in the Reviews of Modern Physics. Remarking 
on it in an interview, Feynman said, “That was written by Wheeler, and was done essentially 
independently. We worked together.”³ By this he meant that the contents were worked out
jointly with Wheeler, who did the actual writing. The paper continues the critique of clas- 
sical electrodynamics begun in [4], based upon this idea: The field of a charge is determined 
by its motion; its field is only sensed by its action on other charges, whose motions act back 
upon the first charge. Thus it should be possible to eliminate the field and to discuss directly 


¹In a letter to John Wheeler in 1951, Feynman wrote, “I wish to deny the correctness of the assumption that”
electrons act only on other electrons, citing two pieces of evidence, one being the Lamb shift. He concluded
the letter thus: “So I think we guessed wrong in 1941. Do you agree?” (Feynman to Wheeler, May 4, 1951).
²E.H. Kerner, ed., The Theory of Action-at-a-Distance in Relativistic Particle Dynamics. New York, 1972,
pp. viii-ix.
³Interview of Feynman by Charles Weiner, June 27, 1966, p. 39.
how the motion of one charge affects the motion of another. This can be done by writing
the relativistic expression for the principle of least action, which determines the equations of 
motion of the charges (Fokker’s action principle). The last requires the use of half-advanced 
and half-retarded four-vector potentials, and this leads to a discussion on the “paradox of 
advanced effects.” 

Paper [8] is included in this section because it is a further step in Feynman’s plan to 
modify classical electrodynamics as a forerunner to attacking the problems of QED. Again it 
uses the action-at-a-distance approach, half-advanced and half-retarded interaction, and the 
Fokker action principle, although Feynman points out that his modification of the classical 
“pointlike” interaction could also be applied to the conventional electrodynamics. However, 
the latter makes use of the Hamiltonian method that singles out the time as a preferred 
variable, making it difficult to construct a relativistic theory, which is more symmetrical 
in time and space. Indeed, using the Hamiltonian requires keeping track of an infinity of 
variables on a plane of constant time in space-time (or on a spacelike surface). That would 
be at least as complicated as the field concept that Feynman is trying to eliminate. 

The solution invoked by H.A. Lorentz to the classical electron self-energy problem at 
the turn of the century was to give the electron a finite size. In [8], Feynman introduces an 
equivalent relativistic “cut-off” that spreads out the interaction, conventionally occurring 
only on the light-cone (“pointlike” interaction), over a small timelike interval. This finite 
interval can be chosen as small as one wishes, and it would in principle be possible to 
determine it experimentally at high energy. The paper also discusses the least action solution 
to the problem of an electron striking a barrier and penetrating it, either directly, or indirectly 
by a process involving the production of a virtual positron electron pair. It introduces the 
forerunner of the Feynman diagrams, containing an electron moving “backward in time” to 
represent the positron! 
[4] With J.A. Wheeler. Interaction with the absorber as the mechanism of radiation, Rev.
Interaction with the Absorber as the Mechanism
of Radiation†*
Palmer Physical Laboratory, Princeton University, Princeton, New Jersey


“We must, therefore, be prepared to find that further advance into this region will require
a still more extensive renunciation of features which we are accustomed to demand of the
space-time mode of description.” – Niels Bohr¹


PAST FAILURE OF ACTION AT A DISTANCE TO 
ACCOUNT FOR THE MECHANISM OF RADIATION 


It was the 19th of March in 1845 when Gauss
described the conception of an action at a 
distance propagated with a finite velocity, the 
natural generalization to electrodynamics of the 
view of force so fruitfully applied by Newton and 
his followers. In the century between then and 
now what obstacle has discouraged the general 
use of this conception in the study of nature? 
The difficulty has not been that of giving to 
the idea of propagated action at a distance a 


* A preliminary account of the considerations which 
appear in this paper was presented by us at the Cambridge 
meeting of the American Physical Society, February 21, 
1941, Phys. Rev. 59, 683 (1941) 

** On leave of absence from Princeton University. 

*** Now a member of the faculty of Cornell University, 
but on leave of absence from that institution. 

† Introductory Note—In commemoration of the sixtieth 
birthday of Niels Bohr it had been hoped to present a 
critique of classical field theory which has been in prepara- 
tion since before the war by the writer and his former 
student, R. P. Feynman. The accompanying joint article, 
representing the third part of the survey, is however the 
only section now finished. The war has postponed comple- 
tion of the other parts. As reference to them is made in 
the present section, it may be useful to outline the plan of 
the survey. 

The motive of the analysis is to clear the present 
quantum theory of interacting particles of those of its 
difficulties which have a purely classical origin. The 
method of approach is to define as closely as one can 
within the bounds of classical theory the proper use of the 
field concept in the description of nature. Division I is 
intended first to recall the possibility of idealizing to the 
case of arbitrarily small quantum effects, a possibility 
which is offered by the freedom of choice in the present 
quantum theory for the dimensionless ratio (quantum of 
angular momentum)(velocity of light)/(electronic charge)²; 
then however to recognize the possible limitations placed 
on this analysis by the relatively large value, 137, of the 
ratio in question in nature; and finally to present a general 
summary of the conclusions drawn from the more technical 
parts of the survey. The plan of the second article is a 
derivation and resumé of the theory of action at a distance 
of Schwarzschild and Fokker, to prepare this theory as a 
tool to analyze the field concept. From the correlation of 
the two points of view, one comes to Frenkel’s solution of 
the problem of self-energy in the classical field theory and 


suitable embodiment of electromagnetic equa- 
tions. This problem, to be true, remained un- 
solved to Gauss and his successors for three 
quarters of the century. But the formulation 
then developed by Schwarzschild and Fokker, 
described and amplified in another article,² 
demonstrated that the conception of Gauss is at 
the same time mathematically self-consistent, 
in agreement with experience on static and 
current electricity, and in complete harmony 
with Maxwell’s equations. 

To find the real obstacle to acceptance of the 
tool of Newton and Gauss for the analysis of 
forces, we have to go beyond the bounds of 
steady-state electromagnetism to the phenomena 
of emission and propagation of energy. No 
branch of science has done more than radiation 
physics to favor the evolution of present concepts 
of field or more to pose difficulties for the idea of 
action at a distance. The difficulties have been 
twofold—to obtain a satisfactory account of the 
field generated by an accelerated charge at a 


to new expressions for the energy of electromagnetic 
interaction in the theory of action at a distance. The third 
division, which is published herewith, is an analysis of the 
mechanism of radiation believed to complete the last tie 
between action at a distance and field theory and to 
remove the obstacle which has so far prevented the use of 
both points of view as complementary tools in the de- 
scription of nature. It is the plan of a subsequent division 
to discuss the problems which arise when the fields are 
regarded as subordinate entities with no degrees of freedom 
of their own. An infinite number of degrees of freedom are 
found to be attributed to the particles themselves by the 
theory of propagated action at a distance. However, it 
appears that the additional modes of motion are divergent 
and have on this account to be excluded by a general 
principle of selection. Acceptance of this principle leads 
to the conclusion that the union of action at a distance 
and field theory constitutes the natural and self-consistent 
generalization of Newtonian mechanics to the four- 
dimensional space of Lorentz and Einstein.—J. A. W. 

1 Niels Bohr, Atomic Theory and the Description of Nature 
(Cambridge University Press, Teddington, England, 1934). 

2 Unpublished, see Introductory Note. 
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remote point and to understand the source of 
the force experienced by the charge itself as a 
result of its motion: 


(a) An accelerated charge generates a field given, ac- 
cording to the formulation of Schwarzschild and Fokker, 
by half the usual retarded solution of Maxwell’s equations, 
plus half the advanced solution. From the presence of the 
advanced field in the expression for the electric vector, it 
follows that a distant test body will experience a premoni- 
tory force well before the source itself has commenced to 
move. To avoid a conclusion so opposed to experience 
Ritz³ and Tetrode⁴ proposed to abandon the symmetry in 
time of the elementary law of force. However, it was then 
necessary to give up the possibility to derive the equations 
of motion and all the electromagnetic forces consistently 
from a single unified principle of least action like that of 
Fokker. More important, the sacrifice made to alleviate 
one difficulty of the theory of action at a distance did not 
help to solve the other, the problem of the origin of the 
force of radiative reaction. 

(b) Experience indicates that an accelerated charge 
suffers a force of damping which is simultaneous with the 
moment of acceleration. However, the theory of action at 
a distance predicts that an accelerated charge in otherwise 
charge-free space will experience no electric force. To 
exclude the acceleration and thus to avoid the issue does 
not appear reasonable. Uncharged particles can be present 
and can accelerate the charge via gravitational forces. It 
seems just as difficult to explain the reactive force when 
other charged particles are present. They will indeed be 
set into motion and will act back on the source. However, 
if these elementary interactions have the purely retarded 
character assumed by Ritz, and also by Frenkel,⁵ the 
reaction will arrive at the accelerated particle too late and 
will have the wrong magnitude⁶ to produce the damping 
phenomenon. On the other hand, interactions symmetrical 
between past and future—the half-retarded, half-advanced 
fields of the unified theory of action at a distance—have so 
far appeared to be equally incapable of accounting for the 
observed force of radiative reaction, with its definitely 
irreversible character. 


It is clear why the viewpoint of Newton and 
Gauss has not been generally applied in recent 
times; it has so far failed to give a satisfactory 
account of the mechanism of radiation. 

The failure of action at a distance cannot pass 
unnoticed by field theory. The two points of 
view, according to the thesis of the present 
critique, are not independent, but mutually 
complementary. Consequently field theory, too, 
faces in the radiation problem a significant issue: 


3 W. Ritz, Ann. d. Chem. et d. Physique 13, 145 (1908). 
4 H. Tetrode, Zeits. f. Physik 10, 317 (1922). 
5 J. Frenkel, Zeits. f. Physik 32, 518 (1925). 
6 J. L. Synge, Proc. Roy. Soc. London A177, 118 (1940). 
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does this theory give an explanation for the 
observed force of radiative reaction which can 
be translated into the particle mechanics of 
Schwarzschild and Fokker, or does it likewise 
fail to provide a complete picture of the mecha- 
nism of radiation? 

In attacking the radiation problem our first 
move, following the above reasoning, is to review 
the status of the reaction force in existing classi- 
cal field theory. No more intelligible clue is found 
to the physical origin of the force in this theory 
than in the theory of action at a distance. 
Stopped on this approach, we take up a sug- 
gestion made long ago by Tetrode that the act 
of radiation should have some connection with 
the presence of an absorber. We develop this 
idea into the thesis that the force of radiative 
reaction arises from the action on the source 
owing to the half-advanced fields of the particles 
of the absorber; or, more briefly, that radiation 
is a matter as much of statistical mechanics as 
of pure electrodynamics. We find that this thesis 
leads to a quantitative solution of the radiation 
problem. Finally we examine some of the impli- 
cations of this thesis for the conception of 
causality. 


THE STATUS OF RADIATIVE REACTION IN 
FIELD THEORY 


A charged particle on being accelerated sends 
out electromagnetic energy and itself loses 
energy. This loss is interpreted as caused by a 
force acting on the particle given in magnitude 
and direction by the expression 


$$\frac{2}{3}\,\frac{(\text{charge})^2}{(\text{velocity of light})^3}\,(\text{time rate of change of acceleration}),$$


when the particle is moving slowly, and by a 
more complicated expression when its speed is 
appreciable relative to the velocity of light. The 
existence of this force of radiative reaction is 
well attested: (a) by the electrical potential 
required to drive a wireless antenna; (b) by the 
loss of energy experienced by a charged particle 
which has been deflected, and therefore acceler- 
ated, in its passage near an atomic nucleus; and 
(c) by the cooling suffered by a glowing body. 

The origin of the force of radiative reaction 
has not been nearly so clear as its existence. 


ABSORBER THEORY OF RADIATION 


Lorentz⁷ considers the charged particle to have 
a finite size and attributes the force in question 
to the retarded action of one part of the particle 
on another. His expression for the force appears 
as a series in powers of the radius of the particle. 
The first term in the series gives the expression 
already mentioned. Otherwise, the derivation 
leads to difficulties: 


(a) All higher terms depend explicitly upon the structure 
assumed for the entity. These dubious terms enter in a 
more and more important way into the calculated law of 
radiative reaction as the frequency of oscillation of the 
particle is raised, and the period approaches the time 
required for light to cross the system. 

(b) Non-electric forces are required to hold together the 
charge distribution, according to Poincaré,⁸ for to neglect 
such forces is to violate the relativistic relation between 
mass and energy. A composite system of this kind would 
possess an infinite number of internal degrees of freedom 
of oscillation. No consistent model has been found for the 
Lorentz electron in either classical or quantum mechanics. 


Briefly, Lorentz attempts to propose a physical 
mechanism behind the radiative reaction, but 
arrives at a mathematically incomplete expres- 
sion for this force. 

Dirac,⁹ in contrast, advances no explanation 
for the origin of the radiative damping, but 
supplies a well-defined and relativistically in- 
variant prescription to calculate its magnitude: 


Let the motion of the particle be given. Calculate the 
field produced by the particle from Maxwell’s equations, 
with the boundary condition that at large distances from 
the particle this field shall contain only outgoing waves. 
In addition to the so-defined retarded field of the particle, 
calculate its advanced field, the sole change being the 
existence of only convergent waves at large distances. 
Define half the difference between retarded and advanced 
fields as the radiation field (half the quantity denoted as 
radiation field by Dirac). This field is everywhere finite. 
Evaluate it at the position of the particle and multiply by 
the magnitude of the charge to obtain the force of radiative 
reaction. 

Dirac’s prescription is appealing. (a) It is well-defined. 
(b) The calculated force reduces for slowly moving particles 
to the simple expression which was given above and which 
has been well-tested at non-relativistic velocities. (c) The 
calculation treats the elementary charge as being localized 
at a mathematical point, a picture which is not only 
physically reasonable but also translatable into quantum 


7 H. A. Lorentz (1892), republished in his Collected 
Papers, Vol. II, pp. 281 and 343. See also his treatise 
The Theory of Electrons (Leipzig, 1909), pp. 49 and 253. 

8 H. Poincaré, Rend. Palermo 21, 165 (1906). 

9 P. A. M. Dirac, Proc. Roy. Soc. London A167, 148 
(1938). 
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mechanics. (d) The elements of the prescription involve no 
more than standard electromagnetic theory plus the as- 
sumption that the radiation field, as above defined, is the 
source of the force. 

The physical origin of Dirac’s radiation field is never- 
theless not clear. (a) This field is defined for times before 
as well as after the moment of acceleration of the particle. 
(b) The field has no singularity at the position of the 
particle and by Maxwell's equations must, therefore, be 
attributed either to sources other than the charge itself 
or to radiation coming in from an infinite distance. 


We accept as reasonable Dirac’s results. His 
concept of radiation field, however, we cannot 
adopt as an assumption subject to no further 
analysis. To do so would be to add to field theory 
a principle incapable of translation into the 
language of action at a distance. 

To carry the analysis further requires us to 
find a new idea. We go back to a suggestion once 
made by Tetrode.¹⁰ He proposed to abandon the 
conception of electromagnetic radiation as an 
elementary process and to interpret it as a con- 
sequence of an interaction between a source and 
an absorber. In his words, 


“The sun would not radiate if it were alone in space and 
no other bodies could absorb its radiation. ... If for 
example I observed through my telescope yesterday eve- 
ning that star which let us say is 100 light years away, then 
not only did I know that the light which it allowed to 
reach my eye was emitted 100 years ago, but also the star 
or individual atoms of it knew already 100 years ago that I, 
who then did not even exist, would view it yesterday 
evening at such and such a time. ... One might ac- 
cordingly adopt the opinion that the amount of material 
in the universe determines the rate of emission. Still this 
is not necessarily so, for two competing absorption centers 


10 H. Tetrode, Zeits. f. Physik 10, 317 (1922). When we 
gave a preliminary account of the considerations which 
appear in this paper (Cambridge meeting of the American 
Physical Society, February 21, 1941, Phys. Rev. 59, 683 
(1941)) we had not seen Tetrode's paper. We are indebted 
to Professor Einstein for bringing to our attention the 
ideas of Tetrode and also of Ritz, who is cited in this 
article. An idea similar to that of Tetrode was subsequently 
proposed by G. N. Lewis, Nat. Acad. Sci. Proc. 12, 22 
(1926): “I am going to make the . . . assumption that 
an atom never emits light except to another atom, and to 
claim that it is as absurd to think of light emitted by one 
atom regardless of the existence of a receiving atom as it 
would be to think of an atom absorbing light without the 
existence of light to be absorbed. I propose to eliminate 
the idea of mere emission of light and substitute the idea of 
transmission, or a process of exchange of energy between 
two definite atoms or molecules.” Lewis went nearly as far 
as it is possible to go without explicitly recognizing the 
importance of other absorbing matter in the system, a 
point touched upon by Tetrode, and shown below to be 
essential for the existence of the normal radiative mech- 
anism. 
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will not collaborate but will presumably interfere with 
each other. If only the amount of matter is great enough 
and is distributed to some extent in all directions, further 
additions to it may well be without influence.” 

Tetrode’s idea that the absorber may be an essential 
element in the mechanism of radiation has been neglected, 
perhaps partly because it appears to conflict with custom- 
ary notions of causality, and partly also because of his 
mistaken belief that the new point of view could by itself 
explain quantum phenomena. In this connection he as- 
sumed that the interaction between charged particles 
should be described by forces more complicated than those 
given by electromagnetic theory. Finally, as Tetrode 
remarks, “on the last pages we have let our conjectures 
go rather far beyond what has mathematically been 
proven.” 


ABSORBER RESPONSE AS THE MECHANISM OF 
RADIATIVE REACTION 


We take up the proposal of Tetrode that 
the absorber may be an essential element in the 
mechanism of radiation. Using the language of 
the theory of action at a distance, we give the 
idea the following definite formulation: 


(1) An accelerated point charge in otherwise charge-free 
space does not radiate electromagnetic energy. 

(2) The fields which act on a given particle arise only 
from other particles. 

(3) These fields are represented by one-half the retarded 
plus one-half the advanced Liénard-Wiechert solutions of 
Maxwell’s equations. This law of force is symmetric with 
respect to past and future. In connection with this assump- 
tion we may recall an inconclusive but illuminating dis- 
cussion carried on by Ritz and Einstein in 1909, in which 
“Ritz treats the limitation to retarded potentials as one of 
the foundations of the second law of thermodynamics, 
while Einstein believes that the irreversibility of radiation 
depends exclusively on considerations of probability.”¹¹ 
Tetrode, himself, like Ritz, was willing to assume ele- 
mentary interactions which were not symmetric in time. 
However, complete reversibility is assumed here because 
it is an essential element in a unified theory of action at a 
distance. In proceeding on the basis of this symmetrical 
law of interaction, we shall be testing not only Tetrode’s 
idea of absorber reaction, but also Einstein’s view that the 
one-sidedness of the force of radiative reaction is a 
purely statistical phenomenon. This point leads to our 
final assumption: 

(4) Sufficiently many particles are present to absorb 
completely the radiation given off by the source. 


On the basis of these assumptions we shall 
consider as the source of radiation an accelerated 
charge located in the absorbing system. A dis- 
turbance travels outward from the source. By it 

11 W. Ritz and A. Einstein, Physik. Zeits. 10, 323 (1909); 
see also W. Ritz, Ann. d. Chemie et d. Physique 13, 145 
(1908). 
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each particle of the absorber is set in motion and 
caused to generate a field, half-advanced and 
half-retarded. The sum of the advanced effects 
of all particles of the absorber, evaluated in the 
neighborhood of the source, gives a field which 
we find to have the following properties: 

(1) It is independent of the properties of the absorbing 
medium. 


(2) It is completely determined by the motion of the 
source. 


(3) It exerts on the source a force which is finite, is 
simultaneous with the moment of acceleration, and is just 
sufficient in magnitude and direction to take away from 
the source the energy which later shows up in the sur- 
rounding particles. 

(4) It is equal in magnitude to one-half the retarded 
field minus one-half the advanced field generated by the 
accelerated charge. In other words, the absorber is the 
physical origin of Dirac’s radiation field. 

(5) This field combines with the half-retarded, half- 
advanced field of the source to give for the total disturbance 
the full retarded field which accords with experience. 
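
The bookkeeping behind properties (4) and (5) can be written out as a short identity. The symbols $F_{\text{ret}}$ and $F_{\text{adv}}$ are shorthand introduced here for illustration (they are not notation from the text) and stand for the retarded and advanced fields generated by the accelerated charge.

```latex
\begin{align*}
F_{\text{source}}   &= \tfrac{1}{2}F_{\text{ret}} + \tfrac{1}{2}F_{\text{adv}}
  && \text{(assumption 3: half-retarded, half-advanced field of the source)}\\
F_{\text{absorber}} &= \tfrac{1}{2}F_{\text{ret}} - \tfrac{1}{2}F_{\text{adv}}
  && \text{(property 4: summed advanced response of the absorber near the source)}\\
F_{\text{source}} + F_{\text{absorber}} &= F_{\text{ret}}
  && \text{(property 5: the full retarded field)}
\end{align*}
```

The advanced contributions cancel term by term, so only the retarded disturbance survives in the total.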


It will be sufficient to establish these results in 
order to have both in field theory and in the 
theory of action at a distance a solution of the 
problem of radiation, including an explanation 
of the force of radiative damping. 

We shall present four derivations of the reac- 
tion of radiation on the source of successively 
increasing generality. In the first we consider an 
absorber in which the particles are far from one 
another. We assume without proof that the 
disturbance which passes through the medium 
is the full retarded field of experience. In the 
second derivation we examine the field of the 
absorber in the neighborhood of the source and 
find it just such as to compensate the advanced 
field of the accelerated charge and to give a 
retarded field of the previously assumed magni- 
tude. In this case we have allowed the medium 
to have arbitrary density. The third derivation 
—in contrast to the first two, where the source 
was taken to be at rest or moving only slowly— 
considers the case of motion with arbitrary 
velocity and leads to the same relativistic ex- 
pression which Dirac has given for the force of 
radiative reaction. All three treatments proceed 
by adding up the fields owing to the individual 
particles of the absorber. A fourth derivation 
uses a much more general approach, assuming 
only that the medium is a complete absorber. 


THE RADIATIVE REACTION: DERIVATION I 


For a first analysis of the mechanism of 
radiative reaction, we shall simplify as much as 
possible the properties of the absorber: 

(a) it is taken to be composed of free charged particles; 

(b) these corpuscles are at rest or are moving only 
slowly with respect to the particle which we treat as the 
source; 

(c) the charged entities are well separated from one 
another; 

(d) the particles occupy space to distances sufficiently 
great to bring about essentially complete absorption of 
radiation from the source. 


We begin by considering the reaction set up 
between the source and a typical charge in the 
absorber when the particle of the source receives 
an acceleration $\mathfrak{A}$, by collision with a third 
particle or otherwise. The source has a charge 
+e and, therefore, sends out an electromagnetic 
disturbance. This effect traverses the distance 
$r_k$, to the particle of the absorber, reaching it at 
a time $(r_k/c)$ seconds later than the instant of 
acceleration. For the electric field acting on the 
absorber at this place and time, we adopt the 
usual retarded solution of Maxwell’s equation, 
in conformity with experience, but without any 
attempt in this first derivation of the force of 
radiative reaction to reconcile such an assump- 
tion with the half-retarded, half-advanced field 
of the theory of action at a distance. At the 
distances in which we are interested, the retarded 
field of the source reduces to the well-known 
expression, 


$$-\frac{e\mathfrak{A}}{r_k c^2}\,\sin(\mathfrak{A}, r_k), \tag{1}$$


together with a term of electrostatic origin. This 
second term falls off inversely as the square of 
the distance and may, therefore, be neglected. 
The electric vector lies in the plane defined by 
the directions of $\mathfrak{A}$ and $r_k$, is perpendicular to $r_k$, 
and is considered positive when its component 
along the direction of $\mathfrak{A}$ is positive. 

The typical particle of the absorber has a 
charge $e_k$ and mass $m_k$. It will experience in the 
electric field of the disturbance an acceleration, 
$\mathfrak{A}_k$, equal to $(e_k/m_k)$ times expression (1). Its 
motion will generate a field which will be half- 
advanced and half-retarded. The advanced part 
of this field will exert on the source a force 
simultaneous with the original acceleration. The 
component of this reactive force along the 
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direction of the acceleration will be 
$$-e\,\frac{e_k\mathfrak{A}_k}{2r_k c^2}\,\sin(\mathfrak{A}, r_k) = \frac{\mathfrak{A}e^2}{2c^4}\,\frac{e_k^2}{m_k r_k^2}\,\sin^2(\mathfrak{A}, r_k). \tag{2}$$


From expression (2) for the reactive force due 
to one particle of the absorber, we can evaluate 
the total effect due to many particles, present to 
the number N per unit of volume. The number 
of particles in a spherical shell of thickness $dr_k$ 
will be $4\pi N r_k^2\,dr_k$. For the particles in this shell 
the average value of the geometrical factor 
$\sin^2(\mathfrak{A}, r_k)$ will be (2/3). Consequently we obtain 
for total force of reaction the integral of the 
expression 


$$\frac{2\mathfrak{A}e^2}{3c^3}\,\frac{2\pi N e_k^2}{m_k c}\,dr_k. \tag{3}$$
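
The step from expression (2) to expression (3) can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the input values are hypothetical and chosen only for illustration, and the units are arbitrary. It averages the geometrical factor over all directions, multiplies by the number of particles in a shell, and compares the result with expression (3).

```python
import math


def mean_sin_squared(n_steps: int = 20_000) -> float:
    """Average of sin^2(theta) over all directions on a sphere.

    After integrating over azimuth and dividing by 4*pi, the solid-angle
    weight is sin(theta) d(theta) / 2. Midpoint rule in theta.
    """
    h = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * h
        total += math.sin(theta) ** 2 * 0.5 * math.sin(theta) * h
    return total


def shell_force_from_expression_2(accel, e, e_k, m_k, n_density, c, r_k, dr_k):
    """Expression (2), angle-averaged, times the particle count of the shell."""
    per_particle = (accel * e**2 / (2 * c**4)) * (e_k**2 / (m_k * r_k**2))
    particles_in_shell = 4 * math.pi * n_density * r_k**2 * dr_k
    return per_particle * mean_sin_squared() * particles_in_shell


def shell_force_from_expression_3(accel, e, e_k, m_k, n_density, c, dr_k):
    """Expression (3): note that it does not depend on the shell radius."""
    return (2 * accel * e**2 / (3 * c**3)) * (
        2 * math.pi * n_density * e_k**2 / (m_k * c)
    ) * dr_k


if __name__ == "__main__":
    # Arbitrary illustrative values, not physical data.
    args = dict(accel=1.3, e=0.7, e_k=0.9, m_k=2.0, n_density=5.0, c=3.0)
    for r_k in (1.0, 10.0, 100.0):
        lhs = shell_force_from_expression_2(r_k=r_k, dr_k=0.01, **args)
        rhs = shell_force_from_expression_3(dr_k=0.01, **args)
        print(r_k, lhs, rhs)
```

Because the inverse-square falloff of the per-particle force is exactly offset by the growth of the shell volume, every shell contributes equally in this naive sum, which is why the uncorrected total appears to grow without limit.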


The force (3) gives an account of the phe- 
nomenon of radiative reaction which is not in 
accord with experience: 


(1) The force acts on the source in phase with its 
acceleration; or in other words, it is proportional to the 
acceleration itself rather than to the time rate of change 
of acceleration. 


(2) The reaction depends upon the nature of the ab- 
sorbing particles. 

(3) The force appears at first sight to grow without 
limit as the number of particles or the thickness of the 
absorber is indefinitely increased. 


Nevertheless, proper addition of the effects due 
to all the particles of a complete absorber, with 
due allowance for their phase relations, does 
lead, as we shall see, to a reasonable expression 
for the reaction on the source. 

There exists a phase lag between outgoing 
disturbance and returning reaction which we 
have not taken into account. The advanced force 
acting on the source due to the motion of a 
typical particle of the absorber is an elementary 
interaction between two charges, propagated 
with the speed of light in vacuum. On the other 
hand, the disturbance which travels outward 
from the source and determines the motion of 
the particle in question is made up not only of 
the proper field of the originally accelerated 
charge, but also of the secondary fields generated 
in the material of the absorber. The elementary 
interactions are of course propagated with the 
speed of light; but the combined disturbance 
travels, as is well known from the theory of the 
refractive index, at a different speed, 


$$c/(\text{refractive index}) = c/n.$$
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In order to speak of the change in velocity of 
the disturbance, or to treat the refractive index 
of the absorber in a well-defined way, it will be 
necessary to consider a single Fourier component 
of the acceleration. The connection between 
acceleration and reactive force being a linear 
one, it will be legitimate to decompose the 
acceleration into parts of this kind, and later 
to recompose the corresponding Fourier compo- 
nents of the radiative reaction. We shall, there- 
fore, suppose for the moment that the primary 
acceleration varies with time according to the 
formula 


$$\mathfrak{A} = \mathfrak{A}_0 \exp(-i\omega t), \tag{4}$$


where $\omega$ represents the circular frequency of the 
motion. A disturbance of this frequency will 
experience in a medium of low density a refrac- 
tive index given by the familiar formula, 


$$n = 1 - \frac{2\pi N e_k^2}{m_k \omega^2}. \tag{5}$$


Thus the radiative reaction which reaches the 
source from a depth $r_k$ in the absorber will lag 
in phase behind the acceleration by the angle 

$$\omega\left(\frac{r_k}{c} - \frac{n r_k}{c}\right) = r_k\,\frac{2\pi N e_k^2}{m_k c\,\omega}. \tag{6}$$
We apply this phase correction to the contribu- 
tion (3) of absorber particles in the range $r_k$ to 
$r_k + dr_k$, and sum over all depths in the medium 
to obtain the total reactive force, 


$$\frac{2e^2}{3c^3}\,\mathfrak{A}\int_0^{\infty}\frac{2\pi N e_k^2}{m_k c}\,\exp\!\left(-i r_k\,\frac{2\pi N e_k^2}{m_k c\,\omega}\right)dr_k. \tag{7}$$
This integral will converge at the upper limit 
when we allow for the existence of a small but 
finite coefficient of absorption in the medium. 
Or in the language of physical optics, so familiar 
from the writings of R. W. Wood, we can say 
that we have to determine the combined effect 
of a number of wave zones, alternately in and 
out of phase with the acceleration. The resultant 
force is 90° out of phase with the acceleration 
and is equal in magnitude to the arithmetic 
sum of the contributions from depths up to a 
point where the phase lag is one radian: 


$$(\text{total reaction}) = \frac{2e^2}{3c^3}\,(-i\omega\mathfrak{A}) = \frac{2e^2}{3c^3}\,\frac{d\mathfrak{A}}{dt}. \tag{8}$$


This result, derived by considering only a single 
Fourier component of the acceleration, no longer 
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contains explicit reference to the frequency of 
that component. Consequently expression (8) 
applies whatever is the dependence of acceler- 
ation upon time, so long as the velocities of all 
particles remain non-relativistic. In this respect 
we have a quite general derivation of the law of 
radiative reaction generally accepted as correct 
for a slowly moving particle subjected to an 
arbitrary acceleration. 
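
The passage from the integral (7) to the result (8) can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' calculation: `a` is shorthand for the combination 2πN e_k²/(m_k c), `kappa` is a hypothetical small damping constant standing in for the "small but finite coefficient of absorption", and the function names and numerical values are invented for the example. It evaluates the damped integral and compares it with its closed form, which tends to −iω as the damping goes to zero.

```python
import cmath


def reaction_integral(a: float, omega: float, kappa: float,
                      r_max: float, n_steps: int) -> complex:
    """Midpoint-rule estimate of  integral_0^r_max  a * exp(-(kappa + i*a/omega) * r) dr.

    This is the depth integral of expression (7), without the prefactor
    (2 e^2 / 3 c^3) * A, and with an assumed exponential damping kappa.
    """
    h = r_max / n_steps
    total = 0j
    for i in range(n_steps):
        r = (i + 0.5) * h
        total += a * cmath.exp(-(kappa + 1j * a / omega) * r) * h
    return total


def closed_form(a: float, omega: float, kappa: float) -> complex:
    """Exact value of the same integral taken to infinite depth."""
    return a / (kappa + 1j * a / omega)


if __name__ == "__main__":
    a, omega = 1.0, 2.0  # arbitrary illustrative values
    numeric = reaction_integral(a, omega, kappa=0.01, r_max=2000.0, n_steps=200_000)
    print("numeric        :", numeric)
    print("closed form    :", closed_form(a, omega, 0.01))
    print("kappa -> 0     :", closed_form(a, omega, 1e-9))
    print("expected -i*w  :", -1j * omega)
```

In the limit of vanishing damping the combination `a` cancels, which mirrors the statement in the text that the final reaction no longer depends on the properties of the absorbing particles.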

We conclude that the force of radiative reac- 
tion arises, not from the direct action of a 
particle upon itself, but from the advanced 
action upon this charge caused by the future 
motion of the particles of the absorber. 


RADIATIVE REACTION: DERIVATION II 


In the above treatment we considered first the 
retarded electromagnetic disturbance traveling 
outward in the absorbing medium; second, the 
motion of the particles of the medium due to this 
disturbance; third, the advanced part of the 
elementary fields produced by these motions; 
fourth, the sum of these fields at the position of 
the source. The same chain of reasoning will 
allow us to sum the elementary advanced fields 
of the particles of the absorbing medium at points 
in the neighborhood of the source. We shall find 
that this field is just sufficient, when added to 
the half-advanced, half-retarded field of the 
source itself, to give the usual full strength 
purely retarded field which one is accustomed to 
attribute to a radiating source. Thus we shall 
justify the assumption made in the first deriva- 
tion as to the strength of the outgoing disturb- 
ance. In order to make it clear that our reasoning 
is not circular, we shall represent the magnitude 
of the disturbance by a multiple, (?), of the 
usual full retarded field, and shall actually deduce 
the value unity for this at present undetermined 
factor. 

We shall now evaluate the contribution of 
particles in the absorber to the electric field 
acting in the region roundabout the source. In 
order to simplify the geometrical considerations 
as much as possible, we shall visualize the source 
as located at the center of a spherical cavity of 
radius R in the medium. We shall take the 
distance, $r$, from the source to the point of 
evaluation of the field to be small in comparison 
with this radius. We shall however give up the 
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assumption that the particles of the absorber are 
necessarily free, or that they are far from one 
another. To make this generalization in our 
previous derivation, we shall express the acceler- 
ation of the typical particle of the absorber for 
a disturbance of circular frequency $\omega$ in the form 


$$(\text{electric field of disturbance}) \cdot (e_k/m_k) \cdot p(\omega). \tag{9}$$


Here $p(\omega)$ is in general a complex function of $\omega$ 
which approaches unity only in the case of weak 
binding or high frequencies. The factor $p(\omega)$, 
according to the theory of dispersion, determines 
the complex refractive index, $n - ik$, of the 
medium: 


$$1 - (n - ik)^2 = \frac{4\pi N e_k^2}{m_k \omega^2}\,p(\omega). \tag{10}$$
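
As a consistency check, relation (10) can be reduced to the low-density formula (5) of the first derivation. The sketch below assumes free particles, so that $p(\omega) = 1$, negligible absorption ($k \approx 0$), and a density low enough that $n$ differs only slightly from unity.

```latex
\begin{align*}
1 - n^2 &= \frac{4\pi N e_k^2}{m_k\omega^2}
  && \text{(Eq. (10) with } p(\omega) = 1,\ k = 0\text{)}\\
(1-n)(1+n) &\approx 2(1-n)
  && \text{(since } n \approx 1\text{)}\\
n &\approx 1 - \frac{2\pi N e_k^2}{m_k\omega^2}
  && \text{(Eq. (5))}
\end{align*}
```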


The advanced field produced by the absorber 
at the distance $r$ from the source will be given in 
amplitude and phase by the product of the 
following factors: 


$\mathfrak{A} = \mathfrak{A}_0 \exp(-i\omega t)$, 
the acceleration of the source, here assumed for 
simplicity to be periodic, although this periodicity 
will drop out of the final result. 

$-(e/r_k c^2)\,\sin(\mathfrak{A}, r_k)$, 
the factor by which the acceleration must be multiplied 
to obtain the strength of the full retarded electric field 
in vacuum at a great distance, $r_k$, from an accelerated 
particle of charge e. 

(?), 
factor as yet undetermined, which allows for the 
possibility that the disturbance which is propagated 
outward from the particle, and whicli is in general 
due only partly to the source itself, may differ in 
strength from the usual full retarded field. For an 
isolated charge in otherwise charge-free space this fac- 
tor is equal to (4). In the present case of a complete 
absorber we shall however later find for this factor the 
value unity. The product of the factors so far gives 
the strength of the electric field which would act on 
an isolated particle at the distance rx. 

exp (tor,/c), 
the phase of the disturbance which svould act on such 
an isolated particle. 

2(1t+n—ik)y-, 
factor by which the strength of the electric field of 
the disturbance inside the medium is reduced by 
reflection at the wal! of the cavity, a factor taken 
over from electromagnetic theory. 

exp (iw(n~—itk~1)(ri— R)/c), 
factor allowing for the change in phase and amplitude 
of the disturbance produced by propagation to the 
depth (r,—R) in the medium. 

(ex./mz)p(w) 
factor relating acceleration of absorber particle to 
electric field experienced by it. 
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— (ex/2r.c?) sin (A, rx), 
factor to be multiplied by acceleration of absorber 
particle to give the magnitude of the component of 
tle advanced electric field produced by the absorber 
in the neighborhood of, and parallel to the acceleration 
of, the source. 

exp (—twr;/c), 
factor allowing for the difference in phase between 
(a) the advanced field of the absorber as evaluated 
at the source itself and (b) the acceleration of thie 
typical absorber particle. 

exp (twr cos (r, rz)/c), 
correction to be applied to phase of absorber field at 
the source itself in order to evaluate this field at the 
distance, r, from the source. The product of the factors 
so far gives in magnitude and phase the advanced 
field at this point owing to a single particle of the 
absorber. 

Nr2ZdridQ, 

nuntber of absorber particles in the element of solid 

angle dQ and in the interval of distance drx. 


We evaluate the product of the listed factors
and sum over all particles of the absorber to
evaluate the total advanced field of the absorber
in the neighborhood of the source:

$$(?)(e/c^3)\,\mathfrak{A}_0 \exp(-i\omega t) \int \exp\bigl(i(\omega/c)\, r \cos(r, r_k)\bigr)\, \sin^2(\mathfrak{A}, r_k)\,(d\Omega/4\pi) \times \int_R^\infty (4\pi N e_k^2/m_k c)\, p(\omega)\,(1 + n - ik)^{-1}\, dr_k\, \exp\bigl(i\omega(n - ik - 1)(r_k - R)/c\bigr). \tag{11}$$


The last integral is simplified by the relationship
(10) between refractive index and physical
properties of the medium to an expression

$$\int_R^\infty (\omega^2/c)(1 - n + ik)\, dr_k\, \exp\bigl(i\omega(n - ik - 1)(r_k - R)/c\bigr) = -i\omega, \tag{12}$$

completely independent of the properties of the
absorber. Having thus summed over all particles
lying in a given direction, we sum over different
directions, using the relations

$$(d\Omega/4\pi) = (1/2)\, d\cos(r, r_k)\,(d\varphi/2\pi), \tag{13}$$

where $\varphi$ is the dihedral angle between the $(r, \mathfrak{A})$
and $(r, r_k)$ planes; and

$$\int \sin^2(\mathfrak{A}, r_k)\,(d\varphi/2\pi) = (2/3)\int \bigl[1 - P_2(\cos(\mathfrak{A}, r_k))\bigr]\,(d\varphi/2\pi) = (2/3)\bigl[1 - P_2(\cos(\mathfrak{A}, r))\,P_2(\cos(r_k, r))\bigr]. \tag{14}$$

Also we note the integrals

$$(1/2)\int_{-1}^{1} \exp(iu\cos\theta)\, d(\cos\theta) = F_0(u) = u^{-1}\sin u = \begin{cases} 1 & \text{for small } u \\ (e^{iu} - e^{-iu})/2iu & \text{for all } u, \end{cases} \tag{15}$$

and

$$(1/2)\int_{-1}^{1} \exp(iu\cos\theta)\, P_2(\cos\theta)\, d(\cos\theta) = F_2(u) = (u^{-1} - 3u^{-3})\sin u + 3u^{-2}\cos u = \begin{cases} -u^2/15 & \text{for small } u \\ (e^{iu} - e^{-iu})/2iu & \text{for large } u. \end{cases} \tag{16}$$
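The closed forms quoted for the radial integral (12) and for the angular integrals (15) and (16) lend themselves to a quick numerical spot check. The following is a minimal sketch and not part of the original paper. It uses only a composite Simpson rule; the function names, the truncation depth, and the sample values are assumptions made for illustration. For the radial integral the complex index written $n - ik$ in the text is passed as a single complex number `n_tilde`, and the sketch assumes that its imaginary part has the sign for which, with the time factor $\exp(-i\omega t)$, the disturbance decays with depth, so that the integral converges.

```python
import cmath
import math


def simpson(f, a, b, steps):
    """Composite Simpson rule for a (possibly complex) integrand on [a, b]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def radial_integral(omega, n_tilde, c=1.0, depth=80.0, steps=20000):
    """Left-hand side of Eq. (12), with s = r_k - R and the upper limit
    truncated at `depth` (an assumption of this sketch)."""
    prefactor = (omega ** 2 / c) * (1.0 - n_tilde)
    return simpson(
        lambda s: prefactor * cmath.exp(1j * omega * (n_tilde - 1.0) * s / c),
        0.0, depth, steps)


def legendre_p2(x):
    """Second Legendre polynomial P_2(x)."""
    return 0.5 * (3.0 * x * x - 1.0)


def f0_numeric(u, steps=2000):
    """(1/2) * integral of exp(i u x) over x in [-1, 1], Eq. (15)."""
    return 0.5 * simpson(lambda x: cmath.exp(1j * u * x), -1.0, 1.0, steps)


def f2_numeric(u, steps=2000):
    """(1/2) * integral of exp(i u x) P_2(x) over x in [-1, 1], Eq. (16)."""
    return 0.5 * simpson(
        lambda x: cmath.exp(1j * u * x) * legendre_p2(x), -1.0, 1.0, steps)


def f0_closed(u):
    return math.sin(u) / u


def f2_closed(u):
    return (1.0 / u - 3.0 / u ** 3) * math.sin(u) + 3.0 * math.cos(u) / u ** 2


# Radial integral: the result should not depend on the index chosen.
omega = 2.0
for n_tilde in (1.5 + 0.3j, 0.7 + 0.5j, 2.4 + 1.1j):
    print("n_tilde =", n_tilde, "->", radial_integral(omega, n_tilde),
          "closed form:", -1j * omega)

# Angular integrals: numerical quadrature against the closed forms.
for u in (0.1, 1.0, 5.0, 20.0):
    print("u =", u, "F0:", f0_numeric(u).real, f0_closed(u),
          "F2:", f2_numeric(u).real, f2_closed(u))
```

The loop over `n_tilde` illustrates the statement following Eq. (12): three quite different complex indices are expected to give the same value, $-i\omega$, to within the quadrature error.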

By use of these mathematical results, we find
that the advanced field of the absorber has in
the cavity an electric component parallel to the
acceleration of the source which is given in
magnitude and phase by the expression

$$(?)(2e/3c^3)(-i\omega\mathfrak{A}_0) \exp(-i\omega t)\,\bigl[F_0(\omega r/c) - P_2(\cos(\mathfrak{A}, r))\,F_2(\omega r/c)\bigr]. \tag{17}$$

The radiation field so obtained reduces at the
source itself to the form

$$(?)(2e/3c^3)(d\mathfrak{A}/dt), \tag{18}$$

and at distances a number of wave-lengths from
the source goes over into the expression

$$-(?)(e\mathfrak{A}_0/2rc^2) \exp(i\omega r/c - i\omega t) + (?)(e\mathfrak{A}_0/2rc^2) \exp(-i\omega r/c - i\omega t). \tag{19}$$


In words, formula (19) states that the advanced
field of the absorber is equal in the neighborhood
of the accelerated particle to the still undetermined
factor (?), multiplied by the difference
between half the retarded field (first term) and
half the advanced field (second term) which one
calculates for the source itself.

It is instructive to see how superposition of the
advanced fields of a large number of particles
can give the appearance of both retarded and
advanced fields due to the source itself. The
advanced field of a single charge of the absorber
can be symbolized as a sphere which is converging
towards the particle and which will collapse upon
it at just the moment when it is disturbed by
the source. But at the moment when the source
particle itself was accelerated, the sphere in
question had a substantial radius. One point on
it touched, or nearly touched, the source. The
shrinking sphere therefore appears to the source
as a nearly plane wave which passes over it
headed towards one of the particles of the
absorber. When we consider the effect of all the
absorbing charges, we have to visualize an array
of approximately plane waves, all marching towards
the source and passing over it in step.
The resultant of these individual effects is a
spherical wave, the envelope of the many nearly
plane waves. The sphere converges, collapses on
the source, and then pours out again as a
divergent sphere. An observer in the neighborhood
will gain the impression that this divergent
wave originated from the source.

A test particle will be unable to make a
separation between the two retarded fields, one
properly owing to the source, the other really
owing to the advanced field of the absorber.
Thus we have for the disturbance diverging
from the source the relation

$$\left(\text{total disturbance diverging from source}\right) = \left(\text{proper retarded field of source itself}\right) + \left(\text{field apparently diverging from source, actually composed of parts converging on individual absorber particles}\right). \tag{20}$$

We are now in a position to evaluate the undetermined
factor, (?), in the expression we have
used in the above analysis for the force acting
on the typical particle of the absorber. We have
only to express all three terms of Eq. (20) in
units of the usual retarded solution of Maxwell's
equations, a solution which asymptotically for
large distances from the source gives for the
electric field parallel to the acceleration the
expression

$$-(e\mathfrak{A}/rc^2) \sin^2(\mathfrak{A}, r). \tag{21}$$

To evaluate the third term in (20), we refer back
to Eq. (19). Thus we find the algebraic equation

$$(?) = (1/2) + (?/2), \tag{22}$$

of which the solution is

$$(1) = (1/2) + (1/2). \tag{23}$$

From our derivation we find for the disturbance
diverging from an accelerated charge the full
retarded field required by experience.
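One way to read the self-consistency condition (22), offered here only as an illustrative sketch and not as the authors' procedure, is as a feedback loop: start from the half-retarded field of an isolated charge, let the absorber return half of whatever total disturbance currently diverges from the source, and repeat. The function name and the number of rounds below are assumptions.

```python
def iterate_undetermined_factor(start=0.5, rounds=60):
    """Repeatedly apply x -> 1/2 + x/2, the right-hand side of Eq. (22).

    `start` is the value for an isolated charge (half the retarded field).
    Returns the list of successive values of the factor.
    """
    values = [start]
    for _ in range(rounds):
        values.append(0.5 + 0.5 * values[-1])
    return values


history = iterate_undetermined_factor()
print(history[:5])   # first few iterates
print(history[-1])   # value after the final round
```

Because each round halves the distance to the fixed point, the iterates approach unity from any starting value, in line with the solution quoted in Eq. (23).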

Along the same lines we can find the strength
of the advanced field converging upon the source
before the moment of acceleration:

$$\left(\text{total disturbance converging on source}\right) = \left(\text{proper advanced field of source itself}\right) + \left(\text{field apparently convergent on source, actually composed of parts convergent on individual absorber particles}\right). \tag{24}$$

At distances of several wave-lengths from the
source, the two terms on the right possess simple
mathematical expressions. Measured in terms of
(21) as a unit of field strength, the right-hand
side of (24) has the value $(1/2) - (1/2) = 0$. We
conclude that there is no net disturbance converging
upon the source prior to the time of
acceleration. The advanced field of the source is
completely compensated by the advanced field
of the absorber.

Our picture of the mechanism of radiation is
seen to be self-consistent. Any particle on being
accelerated generates a field which is half-advanced
and half-retarded. From the source a
disturbance travels outward into the surrounding
absorbing medium and sets into motion all the
constituent particles. They generate a field which
is equal to half the retarded minus half the
advanced field of the source. In this field we have
the explanation of the radiation field assumed by
Dirac. The radiation field combines with the field
of the source itself to produce the usual retarded
effects which we expect from observation, and
such retarded effects only. The radiation field
also acts on the source itself to produce the force
of radiative reaction. What we have said of one
particle holds for every particle in a completely
absorbing medium. All advanced fields are concealed
by interference. Their effects show up
directly only in the force of radiative reaction.
Otherwise we appear to have a system of particles
acting on each other via purely retarded forces.


RADIATIVE REACTION: DERIVATION III


So far the source has been assumed to be at
rest, or in slow motion, at the time of acceleration.
The expression derived above for the force
of radiative reaction is therefore limited in its
applicability. To obtain the corresponding law 
of damping for a swift particle three possibilities 
suggest themselves, each calling for a mathe- 
matical technique quite different from that of 
the other two. The first and simplest procedure 
is to look at the particle from a frame of reference 
moving with nearly its own speed, apply in this
frame the expression which we already have,
and then transform back to the laboratory frame 
of reference. This application of the transforma- 
tion of Lorentz is perfectly legitimate but not 
especially instructive. 

A second method to calculate the force of 
reaction for a fast particle comes from Dirac. 
He makes the assumption that the damping 
arises from the action on the particle of a field 
equal to half the difference between the particle's 
own retarded and advanced fields, a conception
which we have now interpreted in terms of the
radiative reaction of the absorber. As each of the
two fields is individually singular at the location 
of the charge, evaluation of the difference re- 
quires one to apply a limiting process which 
presents a certain mathematical difficulty, though 
in principle perfectly straightforward. 

In connection with the limiting process of 
Dirac, it is interesting to refer back to the 
calculation of radiative reaction made by Lorentz 
on the model of an extended charge, every part 
of which exerted a retarded effect upon every 
other part. The elementary retarded field can be 
written in the form 


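$$\left(\text{retarded field}\right) = \left[\tfrac{1}{2}\left(\text{retarded field}\right) + \tfrac{1}{2}\left(\text{advanced field}\right)\right] + \left[\tfrac{1}{2}\left(\text{retarded field}\right) - \tfrac{1}{2}\left(\text{advanced field}\right)\right].$$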


Here the first term is singular and is related to 
what Lorentz called the electromagnetic mass of 
the particle. The second part, on the other hand, 
is the only one asymmetrical in time and capable 
of contributing to the force of radiative reaction. 
In present terms, the procedure of Lorentz 


amounts to an ingenious means to determine the 
limiting value of Dirac’s radiation field at the 
position of the source. Unfortunately the pro- 
cedure is not convenient to apply to a rapidly 
moving extended charge because of the relativistic
contraction of its spherical form.

The third procedure to evaluate the law of
radiative damping for a swift particle is to calculate
directly the reaction of the absorber on the
source, along the lines of derivations I and II.
This approach uses the expression for the field 
of a radiating charge, not at small distances, 
where it is singular, but at large distances, where 
it has a simple asymptotic form. We shall explore 
this type of derivation because of its direct 
relation to the absorber theory of radiation. 

We idealize the absorber as before as a sphere
of very large radius, $r$, centered on the point
reached by the given particle at a chosen instant.
At the surface of the absorber, those constituents
of the field which drop away as $1/r^2$ will have
become negligible in comparison with those which
fall off as $1/r$. The typical particle on this surface
experiences an electric field perpendicular to the
direction of $r$. The magnitude of this field was
represented in the case of a slowly moving source
by an expression of the form, $-(e/rc^2)$(component
of acceleration perpendicular to $r$), and is
similarly representable in the present case by an
expression of the type, $-(e/r)$(function of motion).
Here the function in parenthesis, while
more complicated than before, still depends only
on the motion of the particle and the direction
to the point of action. The influence of this
disturbance causes the particles of the absorber
to generate a field, the advanced part of which
at the position of the source claims our attention.
We consider the portion of this returned field
arising from particles of the absorber which
lie within an element of solid angle $d\Omega$. The electric
component of this field is perpendicular to $r$
and had for a slowly moving source the magnitude
$(e/c^2)(-i\omega/c)$(component of acceleration
perpendicular to $r$)$(d\Omega/4\pi)$ when the acceleration
was a periodic function of time, and more generally
was given by the derivative $(e/c^2)(d/c\,dt)$
(component of acceleration perpendicular to $r$)
$\times\,(d\Omega/4\pi)$.

The relationship between returned field and
original disturbance is a property only of the
absorber and is independent of the state of
motion of the source. Consequently, for the case
of a particle moving at arbitrary velocity the
returned electric field is perpendicular to $r$ and
equal in magnitude to

$$e\,(d/c\,dt)(\text{function of motion})\,(d\Omega/4\pi).$$

What we have said of the electric field applies
also to the magnetic field, because at great
distances from an accelerated particle the two
vectors have equal magnitudes and perpendicular
directions. Thus we conclude that the reaction
of the absorber on the source is described by a
field, $F_{mn}$, which is directly related to the retarded
field, $R_{mn}$, of the source at great distances, $r$, by
the equation¹²

$$F_{mn} = r \int (\partial R_{mn}/\partial x_4)\,(d\Omega/4\pi). \tag{25}$$

The retarded field of the source particles, $R_{mn}$,
in Eq. (25) is derived from the retarded potentials

$$A_m = 2e \int \dot{a}_m(\alpha)\,\delta(xa_\mu xa^\mu)\, d\alpha,$$

through the equation

$$R_{mn} = (\partial A_n/\partial x^m) - (\partial A_m/\partial x^n).$$

Here the integration over the proper cotime, $\alpha$,
goes only over that portion of the world line of
particle $a$ from which a retarded disturbance can
reach the point of action, $x^m$. The significant
value of $\alpha$ is connected with the coordinates $x^m$
by the equation

$$xa_\mu xa^\mu = 0. \tag{26}$$

The integration yields

$$A_m = -e\,\dot{a}_m/(\dot{a}_\mu xa^\mu).$$

In differentiating this expression with respect
to the coordinates of the point of observation,
we have to allow for the associated change in
the value of the proper cotime, given by the
differential of (26),

$$(dx_\mu - \dot{a}_\mu\, d\alpha)(x^\mu - a^\mu) = 0,$$

or

$$d\alpha = (xa_\mu\, dx^\mu)/(\dot{a}_\nu xa^\nu). \tag{27}$$

¹² Here and below we use the following notation:
$x^1 = x_1$, $x^2 = x_2$, $x^3 = x_3$, the three space coordinates of a typical point
of evaluation of the field.
$x^4 = -x_4$, a quantity also having the dimensions of a
length, and given by the product of the
velocity of light and the time elapsed between
a certain zero hour and the moment of
observation.
$a^m$, similar space-time coordinates of a typical point
on the world line of the $a$th particle.
Successive points along the world line are designated by
the values of a parameter, $\alpha$, the proper cotime, which has
the dimensions of a length and is equal to the product of
the velocity of light and the proper time. The difference,
$d\alpha$, in proper cotime between two neighboring points has
the same sign as the difference $da^4$, and is given in magnitude
by the equation

$$(d\alpha)^2 = c^2(\text{time interval})^2 - (\text{space interval})^2 = -da_\mu\, da^\mu.$$

Derivatives with respect to $\alpha$ are denoted by dots. In
comparing formulae given in this notation with those given
elsewhere in the literature, it will be noted that some
authors go from contravariant to covariant representation
of a vector by reversing the sign of its space components
and leaving its time component unaltered; also that dots
are often used to indicate differentiation with respect to
proper time, rather than proper cotime. In our notation
the derivatives $\dot{a}^m$ are dimensionless quantities which
satisfy the relation $\dot{a}_\mu \dot{a}^\mu = -1$. We use $xa^m$ as an abbreviation
for the vector, $x^m - a^m$. The usual scalar potential of
the electromagnetic field is represented by the component
$A^4$ of a four-vector, of which the other three parts, $A^1$, $A^2$,
$A^3$, constitute the space components of the customary
vector potential. The typical component of the field is
given in the equation $F_{mn} = (\partial A_n/\partial x^m) - (\partial A_m/\partial x^n)$, where
we have for the electric field $E_x = F_{14} = -F_{41}$, etc., and for
the magnetic field $H_x = F_{23} = -F_{32}$, etc.
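The light-cone condition (26) and the retarded potential $A_m = -e\,\dot{a}_m/(\dot{a}_\mu xa^\mu)$ can be tried out numerically on the simplest moving source, a charge in uniform motion. The sketch below is an illustration only and not part of the paper. It assumes a world line through the origin with velocity $\beta c$ along the first axis, units in which cotime and distance coincide, and hypothetical helper names. It finds the retarded cotime by bisection and compares the resulting potentials with the familiar closed form for a uniformly moving charge, $\phi = e\gamma/[\gamma^2(x - \beta x^4)^2 + y^2 + z^2]^{1/2}$ with vector potential $\beta\phi$ along the motion.

```python
import math


def lower(v):
    """Lower an index with the signature (+, +, +, -) of the footnote."""
    return (v[0], v[1], v[2], -v[3])


def dot(u, v):
    """Invariant product u_mu v^mu."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2] - u[3] * v[3]


def world_line(alpha, beta):
    """Coordinates a^m and dimensionless derivative da^m/d(alpha)
    for uniform motion along the first axis (an assumed test case)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return (g * beta * alpha, 0.0, 0.0, g * alpha), (g * beta, 0.0, 0.0, g)


def separation(x, alpha, beta):
    """The vector xa^m = x^m - a^m(alpha)."""
    a, _ = world_line(alpha, beta)
    return tuple(xi - ai for xi, ai in zip(x, a))


def retarded_cotime(x, beta):
    """Solve xa_mu xa^mu = 0 on the retarded branch by bisection."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)

    def interval(alpha):
        xa = separation(x, alpha, beta)
        return dot(xa, xa)

    hi = x[3] / g            # cotime at which a^4 equals x^4 (interval >= 0)
    step = 1.0
    lo = hi - step
    while interval(lo) >= 0.0:   # walk back until the separation is timelike
        step *= 2.0
        lo = hi - step
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if interval(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def retarded_potential(x, beta, e=1.0):
    """Covariant components A_m = -e adot_m / (adot_mu xa^mu)."""
    alpha = retarded_cotime(x, beta)
    _, adot = world_line(alpha, beta)
    s = dot(adot, separation(x, alpha, beta))
    return tuple(-e * comp / s for comp in lower(adot))


def scalar_potential_closed_form(x, beta, e=1.0):
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return e * g / math.sqrt(g * g * (x[0] - beta * x[3]) ** 2
                             + x[1] ** 2 + x[2] ** 2)


point = (3.0, 4.0, 0.0, 2.0)      # (x, y, z, cotime) of the field point
A = retarded_potential(point, beta=0.6)
print("phi from A_m:", -A[3])     # scalar potential is A^4 = -A_4
print("phi closed form:", scalar_potential_closed_form(point, 0.6))
```

Bisection is used rather than the explicit quadratic root so that the same routine would carry over to a world line for which (26) has no closed-form solution; only `world_line` would need to change.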


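Thus the retarded field of $a$ is found to be given
by the expression

$$R_{mn} = e(\dot{a}_\mu xa^\mu)^{-2}(\ddot{a}_m xa_n - \ddot{a}_n xa_m) + e(1 + \ddot{a}_\mu xa^\mu)(\dot{a}_\nu xa^\nu)^{-3}(-\dot{a}_m xa_n + \dot{a}_n xa_m).$$

All terms in this expression fall off at large
distances inversely as the first power of the
separations $xa$, except for the terms arising from
the unity in the factor $(1 + \ddot{a}_\mu xa^\mu)$, which we may
henceforth omit. For the same reason in differentiating
the field with respect to $x_4$, we may
treat all differences $xa^m$ as constant. Thus we
find in the limit of large distances

$$\begin{aligned}
r \int (\partial R_{mn}/\partial x_4)(d\Omega/4\pi)
&= r \int (-xa_4/\dot{a}_\mu xa^\mu)(d/d\alpha) R_{mn}\,(d\Omega/4\pi) \\
&= e \int \bigl\{\, 3(\ddot{a}_\mu xa^\mu)(\dot{a}_\nu xa^\nu)^{-4}(xa_m \ddot{a}_n - xa_n \ddot{a}_m) \\
&\qquad - (\dot{a}_\nu xa^\nu)^{-3}(xa_m \dddot{a}_n - xa_n \dddot{a}_m) \\
&\qquad + (\dddot{a}_\mu xa^\mu)(\dot{a}_\nu xa^\nu)^{-4}(xa_m \dot{a}_n - xa_n \dot{a}_m) \\
&\qquad - 3(\ddot{a}_\mu xa^\mu)^2(\dot{a}_\nu xa^\nu)^{-5}(xa_m \dot{a}_n - xa_n \dot{a}_m) \bigr\}\,(r^2\, d\Omega/4\pi).
\end{aligned} \tag{28}$$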


As variables of integration it is convenient to
use a colatitude $\theta$ and azimuthal angle $\varphi$, taking
for polar axis the direction of the space component,
$(\dot{a}^1, \dot{a}^2, \dot{a}^3)$, of the four-vector, $\dot{a}^m$. With
this choice of variables the denominator of the
typical term in the preceding expression is a
power of the factor $(\dot{a}_\mu xa^\mu) = r(\dot{a}_4 + \dot{a}\cos\theta)$, where
$\dot{a}_4^2 - \dot{a}^2 = 1$. The absence of the azimuthal angle
from the denominator and the relatively simple
form of the numerator makes it easy to carry out
the integration over $\varphi$. The numerator of the
typical term then reduces to a polynomial in
$\cos\theta$. The integration over $\theta$ therefore leads only
to algebraic functions of $\cos\theta$ to be evaluated at
the two limits $\cos\theta = \pm 1$. The reduction of the
resulting expressions to simple form requires
rather long calculation. The final result for the
field of radiative reaction at the location of the
source is

$$F_{mn} = r \int (\partial R_{mn}/\partial x_4)(d\Omega/4\pi) = (2e/3)(\dot{a}_m \dddot{a}_n - \dot{a}_n \dddot{a}_m). \tag{29}$$

This expression for the field of the absorbing 
particles agrees with that given by Dirac for 
half the difference of retarded and advanced 
fields due to the source itself, provided account 
is taken of the difference between the present 
notation and his. 
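Since the reduction of the angular integral to the compact form (29) is described only as a rather long calculation, a brute-force numerical comparison may be reassuring. The sketch below is an illustration under stated assumptions and not the authors' method. It takes the four-term large-distance integrand of Eq. (28) (as read from the scan), sets $r = 1$ (the integrand times $r^2$ does not depend on $r$), builds a consistent set of derivatives $\dot{a}$, $\ddot{a}$, $\dddot{a}$ by boosting rest-frame values, and averages over the sphere with a midpoint rule. All names and sample numbers are hypothetical; indices run from 0 to 3, with index 3 the fourth (time) component.

```python
import math


def lower(v):
    """Lower an index with signature (+, +, +, -)."""
    return (v[0], v[1], v[2], -v[3])


def dot(u, v):
    """Invariant product u_mu v^mu."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2] - u[3] * v[3]


def boost_x(v, beta):
    """Lorentz boost of a contravariant vector along the first axis."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return (g * (v[0] + beta * v[3]), v[1], v[2], g * (v[3] + beta * v[0]))


def wedge(u, v, m, n):
    """u_m v_n - u_n v_m with covariant components."""
    ul, vl = lower(u), lower(v)
    return ul[m] * vl[n] - ul[n] * vl[m]


def integrand_28(xa, a1, a2, a3, m, n):
    """Curly bracket of Eq. (28); a1, a2, a3 are the first three
    cotime derivatives of the world line."""
    s = dot(a1, xa)
    p = dot(a2, xa)
    q = dot(a3, xa)
    return (3.0 * p * s ** -4 * wedge(xa, a2, m, n)
            - s ** -3 * wedge(xa, a3, m, n)
            + q * s ** -4 * wedge(xa, a1, m, n)
            - 3.0 * p * p * s ** -5 * wedge(xa, a1, m, n))


def absorber_field(a1, a2, a3, m, n, e=1.0, n_theta=200, n_phi=64):
    """Midpoint-rule average of the integrand over the unit sphere."""
    total = 0.0
    for i in range(n_theta):
        mu = -1.0 + (i + 0.5) * 2.0 / n_theta          # cos(theta)
        sin_t = math.sqrt(1.0 - mu * mu)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            # Null separation with xa^4 = r = 1 (retarded light cone).
            xa = (sin_t * math.cos(phi), sin_t * math.sin(phi), mu, 1.0)
            total += integrand_28(xa, a1, a2, a3, m, n)
    return e * total / (n_theta * n_phi)


def reaction_field_29(a1, a3, m, n, e=1.0):
    """Right-hand side of Eq. (29)."""
    return (2.0 * e / 3.0) * wedge(a1, a3, m, n)


# Rest-frame derivatives (assumed sample values), then boosted.
b = (0.3, -0.2, 0.5)                 # second derivative, space part
c3 = (0.1, 0.4, -0.6)                # third derivative, space part
b_sq = sum(x * x for x in b)         # fixes the time part of the third derivative
beta = 0.3
a1 = boost_x((0.0, 0.0, 0.0, 1.0), beta)
a2 = boost_x((b[0], b[1], b[2], 0.0), beta)
a3 = boost_x((c3[0], c3[1], c3[2], b_sq), beta)

for m, n in ((0, 3), (1, 3), (2, 3), (0, 1), (0, 2), (1, 2)):
    print((m, n), absorber_field(a1, a2, a3, m, n),
          reaction_field_29(a1, a3, m, n))
```

The midpoint rule is chosen only for brevity; the two printed columns are expected to agree to within the quadrature error, and a finer grid (`n_theta`, `n_phi`) tightens the agreement.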

If we define the force of radiative reaction
through its contribution to the product of the
mass of the particle by its acceleration, $mc^2\ddot{a}_m$,
then we have for this force the expression

$$eF_{mn}\dot{a}^n = (2e^2/3)(\dot{a}_m \dddot{a}_n - \dddot{a}_m \dot{a}_n)\,\dot{a}^n. \tag{30}$$

In the case of a slowly moving particle the first
space component of this force is readily evaluated
by noting that (1) $\dot{a}_m$ is of the order of the ratio
of the velocity of the corpuscle to the speed of
light and is therefore negligible; (2) the quantity
$-\dot{a}_n \dot{a}^n$ has the value unity; and (3) the derivative
$\dddot{a}_m$ represents $(1/c^3)$ times the time rate of
change of the given component of the acceleration,
$\mathfrak{A}$. Consequently, the expression (30) reduces
in the non-relativistic limit to the usual
formula, $(2e^2/3c^3)(d\mathfrak{A}/dt)$, for the damping force.

From the properties of the retarded field at 
large distances from an accelerated particle in 
motion at an arbitrary velocity, we have obtained 
an expression for the force of radiative reaction 
previously derived by Dirac on the assumption 
that this force arises from half the difference of 
the advanced and retarded fields of the particle 
itself. It is, therefore, of interest to see that 
this equivalence can be demonstrated without 
going through the rather long calculations which 
are required on either method of derivation 
to obtain explicit expressions for the force of 
radiative reaction. To bring out the relationship 
between the two derivations, we go back to that 
expression for the retarded field of the source
which contains a delta function, and arrange the
evaluation of Eq. (25) in such a way as always
to keep a delta function in evidence. Thus we
write the retarded field in the form

$$R_{mn} = 2e \int \bigl[-\dot{a}_m(\partial/\partial x^n) + \dot{a}_n(\partial/\partial x^m)\bigr]\,\delta(xa_\mu xa^\mu)\, d\alpha.$$


In order to postpone the differentiation of the
delta function, we adopt an expedient to transform
the variable of differentiation. We consider
in addition to the actual world line of the source,
$a^m(\alpha)$, a displaced world line, a particle moving
along which reaches at the proper cotime, $\alpha$, the
point $d^m(\alpha) = a^m(\alpha) + D^m$, where the $D^m$ are four
numbers independent of $\alpha$. We note that the
derivative with respect to $x^m$ of any function of
the differences $x^\mu - d^\mu$ is equal to the negative of
the derivative of the same function with respect
to $D^m$. Consequently we may write the expression
for the retarded field in the form

$$R_{mn} = 2e \int \bigl[\dot{a}_m(\partial/\partial D^n) - \dot{a}_n(\partial/\partial D^m)\bigr]\,\delta(xd_\mu xd^\mu)\, d\alpha, \tag{31}$$

where the result is to be evaluated in the limit
when the displacements $D^m$ go to zero.

We now insert expression (31) for the retarded
field into the integral for the field returned by
the absorber,

$$F_{mn} = r \int (\partial R_{mn}/\partial D^4)\,(d\Omega/4\pi),$$

and encounter the integral

$$F_{mn} = 2er\,(\partial/\partial D^4) \iint \bigl[\dot{a}_m(\partial/\partial D^n) - \dot{a}_n(\partial/\partial D^m)\bigr]\,\delta(xd_\mu xd^\mu)\, d\alpha\,(d\Omega/4\pi).$$


To bring out the meaning of this integral, we
note that we want the radiative reaction on the
source at a definite point, $a^m = a^m(\alpha^*)$, along its
world line; that this point is at the center of a
sphere of radius $r$; and that advanced disturbances
from the particles on the inner surface of
this sphere contribute to the force at this point
only if they start at a cotime, $x^4$, equal to
$r + a^4(\alpha^*)$. Consequently, $x^4$ has this fixed value
as the integration over the surface of the sphere
is carried out. Also during this integration we
keep fixed the variable $\alpha$ and consequently hold
constant $d = d(\alpha)$. Under these circumstances it
is convenient to adopt for variable of integration
the angle $\theta$ between the space directions $ad$ and
$ax$:

$$(dx)^2 = (ad)^2 + (ax)^2 - 2(ad)(ax)\cos\theta.$$

Then we have

$$\begin{aligned}
\int \delta(dx_\mu dx^\mu)\,(d\Omega/4\pi)
&= (1/2)\int \delta(dx_\mu dx^\mu)\, d\cos\theta \\
&= \int_{(dx) = (ax) - (ad)}^{(dx) = (ax) + (ad)} \delta(dx_\mu dx^\mu)\; d\bigl[(dx)^2 - (ad)^2 - (ax)^2\bigr]/4(ad)(ax) \\
&= \int \delta(dx_\mu dx^\mu)\; d(dx_\mu dx^\mu)/4(ad)r.
\end{aligned}$$


In this last expression the range of integration
includes the point, $dx_\mu dx^\mu = 0$, for which the delta
function gives a contribution, only if there are
some points on the surface of the sphere which
can be reached simultaneously by two retarded
waves which start out with $a^m(\alpha^*)$ and $d^m(\alpha)$ as
centers. This condition will be satisfied if and
only if $d(\alpha)$ lies between the forward and backward
light cones drawn with $a(\alpha^*)$ as origin.
Thus we have

$$F_{mn} = e\,(\partial/\partial D^4) \int d\alpha\, \bigl[\dot{a}_m(\partial/\partial D^n) - \dot{a}_n(\partial/\partial D^m)\bigr] \begin{cases} 1/2(ad) & \text{when } da_\mu da^\mu > 0 \\ 0 & \text{when } da_\mu da^\mu < 0. \end{cases}$$

The differentiation with respect to $D^4$ of the
discontinuous function in the last pair of brackets
gives a function which has the character of a
delta function except for a change in sign at one
of the singularities. Specifically, writing

$$\delta(da_\mu da^\mu) = \delta_+ + \delta_-,$$

where $\delta_+$ is different from zero only when a
retarded disturbance from $d(\alpha)$ can reach the
point $a(\alpha^*)$, and $\delta_-$ is different from zero only
when an advanced disturbance from $d(\alpha)$ can
reach the point $a(\alpha^*)$, we have

$$(\partial/\partial D^4) \begin{cases} 1/2(ad) & \text{when } da_\mu da^\mu > 0 \\ 0 & \text{when } da_\mu da^\mu < 0 \end{cases} = \delta_+ - \delta_-.$$

Then the field due to the absorber takes the form

$$\begin{aligned}
F_{mn} &= e \int \bigl[\dot{a}_m(\partial/\partial D^n) - \dot{a}_n(\partial/\partial D^m)\bigr](\delta_+ - \delta_-)\, d\alpha \\
&= e \int \bigl[\dot{a}_n(\partial/\partial a^m) - \dot{a}_m(\partial/\partial a^n)\bigr](\delta_+ - \delta_-)\, d\alpha.
\end{aligned} \tag{32}$$

In other words, the reactive field at the point
$a^m = a^m(\alpha^*)$ of the actual path is equal to half the
retarded minus half the advanced field due to an
equal charge moving on a world line of identical
shape, all points of which are displaced by the
amount $D^m$, this field evaluated in the limit
$D \to 0$. This result establishes the connection
between two different methods of evaluating the
force of radiative reaction, one based on the
properties of the retarded field of the source at
great distances, the other containing half the
difference of retarded and advanced fields at the
location of the source itself.


THE RADIATIVE REACTION: DERIVATION IV 


From the preceding applications of the ab- 
sorber theory of radiation, it has become clear 
that such properties of the absorber as refractive 
index and density have no bearing on the magni- 
tude of the force of radiative reaction. The only 
essential point is that the medium should be a 
complete absorber. We therefore expect that 
there should somehow be a means to take this 
point into account in a very general way. 

In physical terms, complete absorption implies 
that a test charge placed anywhere outside the 
absorbing medium will experience no disturbance. 


In mathematical terms, using $F^{(k)}_{\rm ret}$ and $F^{(k)}_{\rm adv}$ to
denote the retarded and advanced fields due to
the $k$th particle, we have

$$\sum_k \bigl(\tfrac{1}{2}F^{(k)}_{\rm ret} + \tfrac{1}{2}F^{(k)}_{\rm adv}\bigr) = 0 \quad (\text{outside the absorber}). \tag{33}$$

From the fact that this sum vanishes outside the
absorber everywhere and at all times, it follows
that each of the two sums also vanishes outside
the absorber:

$$\sum_k F^{(k)}_{\rm ret} = 0 \quad (\text{outside}) \tag{34}$$

and

$$\sum_k F^{(k)}_{\rm adv} = 0 \quad (\text{outside}). \tag{35}$$

Thus, the one sum, if it does not vanish, represents
at large distances an outgoing wave, and
the other represents a converging wave; but
complete destructive interference between two
such waves is impossible. Hence, if their sum
vanishes, so does each field individually. From
this conclusion it follows that the difference of
the fields vanishes outside the absorber at all
times:

$$\sum_k \bigl(\tfrac{1}{2}F^{(k)}_{\rm ret} - \tfrac{1}{2}F^{(k)}_{\rm adv}\bigr) = 0 \quad (\text{outside}). \tag{36}$$

The field (36), in contrast to the fields (33)-(35),
has no singularities within the absorber;
it is a solution of Maxwell's equations for free
space. Vanishing outside the absorber at all
times, it must therefore forever be zero inside.
The special property of a completely absorbing
medium is expressed by the equation

$$\sum_k \bigl(\tfrac{1}{2}F^{(k)}_{\rm ret} - \tfrac{1}{2}F^{(k)}_{\rm adv}\bigr) = 0 \quad (\text{everywhere}). \tag{37}$$

The consequences of Eq. (37) for the force on
a typical particle are easily deduced. On the $a$th
charge the entire field acting is given, according
to the theory of action at a distance, by the sum

$$\sum_{k \neq a} \bigl(\tfrac{1}{2}F^{(k)}_{\rm ret} + \tfrac{1}{2}F^{(k)}_{\rm adv}\bigr). \tag{38}$$

This expression can be broken down into three
parts:

$$\sum_{k \neq a} F^{(k)}_{\rm ret} + \bigl(\tfrac{1}{2}F^{(a)}_{\rm ret} - \tfrac{1}{2}F^{(a)}_{\rm adv}\bigr) - \sum_{\text{all } k} \bigl(\tfrac{1}{2}F^{(k)}_{\rm ret} - \tfrac{1}{2}F^{(k)}_{\rm adv}\bigr). \tag{39}$$
Of these terms the third has just been shown to
vanish for a complete absorber. The second gives
rise to the phenomenon of radiative damping.
In the case of non-relativistic velocities we have
the result

$$e_a\bigl(\tfrac{1}{2}E^{(a)}_{\rm ret} - \tfrac{1}{2}E^{(a)}_{\rm adv}\bigr) = (2e_a^2/3c^3)(d\mathfrak{A}_a/dt); \tag{40}$$

and in the case of swift particles we have for the
force on the $a$th charge

$$e_a\bigl(\tfrac{1}{2}F^{(a)}_{mn\,{\rm ret}} - \tfrac{1}{2}F^{(a)}_{mn\,{\rm adv}}\bigr)\dot{a}^n.$$

This expression reduces, according to Dirac, to
the form

$$(2e_a^2/3)(\dot{a}_m \dddot{a}_n - \dddot{a}_m \dot{a}_n)\,\dot{a}^n, \tag{41}$$

in agreement with the reaction of the absorber
as calculated in the preceding derivation. With
this reactive term and the first term of (39), we
arrive at the equation of motion of the typical
particle in a completely absorbing medium

$$m_a c^2 \ddot{a}_m = e_a \sum_{k \neq a} F^{(k)}_{mn\,{\rm ret}}\,\dot{a}^n + (2e_a^2/3)(\dot{a}_m \dddot{a}_n - \dddot{a}_m \dot{a}_n)\,\dot{a}^n. \tag{42}$$
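As a check on the bookkeeping (a worked step added here, using only the sums already defined in the text), the three parts of (39) can be recombined to recover the total field (38). The $k = a$ term of the sum over all particles cancels the second part, leaving:

```latex
\begin{aligned}
&\sum_{k\neq a} F^{(k)}_{\rm ret}
 + \Bigl(\tfrac12 F^{(a)}_{\rm ret} - \tfrac12 F^{(a)}_{\rm adv}\Bigr)
 - \sum_{\text{all } k}\Bigl(\tfrac12 F^{(k)}_{\rm ret} - \tfrac12 F^{(k)}_{\rm adv}\Bigr) \\
&\qquad= \sum_{k\neq a} F^{(k)}_{\rm ret}
 - \sum_{k\neq a}\Bigl(\tfrac12 F^{(k)}_{\rm ret} - \tfrac12 F^{(k)}_{\rm adv}\Bigr) \\
&\qquad= \sum_{k\neq a}\Bigl(\tfrac12 F^{(k)}_{\rm ret} + \tfrac12 F^{(k)}_{\rm adv}\Bigr).
\end{aligned}
```

The decomposition is therefore an identity; the physics enters only through the statement (37) that the third part vanishes for a complete absorber.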


In arriving at this equation we have shown that
the half-advanced, half-retarded fields of the
theory of action at a distance lead to a satisfactory
account of the mechanism of radiative
reaction and to a description of the action of one
particle on another in which no evidence of the
advanced fields is apparent. We find in the case
of an absorbing universe a complete equivalence
between the theory of Schwarzschild and Fokker
on the one hand and the usual formalism of
electrodynamics on the other. This is what was
to be proved.


THE IRREVERSIBILITY OF RADIATION 


An oscillating charge surrounded by an ab- 
sorbing medium loses energy. Why does radiation 
have this irreversible character even in a formu- 
lation of electrodynamics which is from the 
beginning symmetrical with respect to the inter- 
change of past and future? 

It might at first sight appear that the irre- 
versibility is connected with the property of 
complete absorption. This is not the case. The 
expression (37) of the condition of absorption is 
perfectly symmetrical between advanced and 
retarded fields. We have only to reverse the roles
of these two fields in the derivation following
(37) in order to arrive at an equation of motion
for the typical particle just as legitimate as
(42), and in complete harmony with that
equation:

$$m_a c^2 \ddot{a}_m = e_a \sum_{k \neq a} F^{(k)}_{mn\,{\rm adv}}\,\dot{a}^n - (2e_a^2/3)(\dot{a}_m \dddot{a}_n - \dddot{a}_m \dot{a}_n)\,\dot{a}^n. \tag{43}$$


In this equation, however, the force of radiative 
reaction appears with a sign just opposite to its 
usual one. Evidently the explanation of the one- 
sidedness of radiation is not purely a matter of 
electrodynamics. 

We have to conclude with Einstein¹¹ that the
irreversibility of the emission process is a phenomenon
of statistical mechanics connected with the
asymmetry of the initial conditions with respect
to time. In our example the particles of the
absorber were either at rest or in random motion
before the time at which the impulse was given
to the source. It follows that in the equation of
motion (42) the sum, $\sum_{k \neq a} F^{(k)}_{\rm ret}$, of the retarded
fields of the absorber particles had no particular
effect on the acceleration of the source. Consequently
the normal term of radiative damping
dominates the picture. In the reverse formulation
(43) of the equation of motion, the sum of the
advanced fields of the absorber particles is not
at all negligible, for they are put into motion by
the source at just the right time to contribute to
the sum $\sum_{k \neq a} F^{(k)}_{\rm adv}$. This contribution, apart
from the natural random effects of the charges
of the absorber, has twice the magnitude of the
usual damping term. The negative reactive force
of (43) is therefore cancelled out, and a force of
the expected sign and magnitude remains.
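The sign bookkeeping of this paragraph can be written out in one line. Here $f_{\rm damp}$ is a shorthand, introduced only for this illustration, for the usual force of radiative damping that appears as the last term of Eq. (42):

```latex
\underbrace{-\,f_{\rm damp}}_{\text{explicit reactive term of (43)}}
\;+\;
\underbrace{2\,f_{\rm damp}}_{\text{advanced fields of the disturbed absorber}}
\;=\; +\,f_{\rm damp}.
```

Both formulations thus give the same net force on the source; they differ only in how that force is divided between the explicit reactive term and the sum over the other particles.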

That it is solely the nature of the initial 
conditions which governs the direction of the 
radiation process can be seen by imagining a 
reversal of the direction of time in the preceding 
example. We have then a solution of the equa- 
tions of motion just as consistent as the original 
solution. However, our interpretation of the 
solution is different. As the result of chaotic 
motion going on in the absorber, we see each
one of the particles receiving at the proper
moment just the right impulse to generate a
disturbance which converges upon the source at
the precise instant when it is accelerated. The
source receives energy and the particles of the
absorber are left with diminished velocity. No
electrodynamic objection can be raised against
this solution of the equations of motion. Small
a priori probability of the given initial conditions
provides our only basis on which to exclude such 
phenomena. 

A comparison of radiation with heat conduc- 
tion is illuminating. Both processes convert 
ordered into disordered motion although every 
elementary interaction involved is microscopi- 
cally reversible. 

Consider for the moment the question of the 
irreversibility of heat conduction, later to be put 
into relation with the problem of the one-sided- 
ness of radiation. A portion of matter observed 
at the present moment to be warmer than its 
surroundings will cool off in the future with a
probability overwhelmingly greater than the 
chance for it to grow hotter. About the past of 
the same portion of matter Boltzmann’s H- 
theorem however also predicts an enormously 
greater likelihood that the body warmed up to 
its present state rather than cooled down to it. 
In other words, we are asked to understand the 
present temperature of the body as the result of 
a simple statistical fluctuation in the distribution 
of energy through the entire system. This deduction
is based on the premise that the system
was isolated before observation. However, common
experience tells us that the given portion of
matter probably acquired its abnormal temperature,
not via an internal statistical fluctuation,
but because it had earlier not been isolated from
the outside.

TABLE I. Decomposition of the symmetric fields of the
theory of action at a distance into the fields of the retarded
field theory.

Total field acting on $a$th particle in theory of action at a distance:

    $\sum_{k\neq a} \left(\tfrac{1}{2}F^{(k)}_{\mathrm{ret}} + \tfrac{1}{2}F^{(k)}_{\mathrm{adv}}\right)$,

here decomposed into:

(1) Retarded fields of usual formulation of electrodynamics:

    $\sum_{k\neq a} F^{(k)}_{\mathrm{ret}}$

(2) A field completely determined by the motion of the
particle itself; denoted as the “radiation field” by Dirac;
accounts for the normal force of radiative reaction:

    $\left(\tfrac{1}{2}F^{(a)}_{\mathrm{ret}} - \tfrac{1}{2}F^{(a)}_{\mathrm{adv}}\right)_{mn} = (2e_a/3)(\dot a_m \dddot a_n - \dddot a_m \dot a_n)$

(3) A residual field, with singularities at none of the
particles, but completely determined by the motion of
the particles; identified by us with Dirac’s “incident field”:

    $\sum_{\text{all particles}} \left(\tfrac{1}{2}F^{(k)}_{\mathrm{adv}} - \tfrac{1}{2}F^{(k)}_{\mathrm{ret}}\right) = F_{\mathrm{inc}}$

For the radiative analogy of this example of 
heat conduction, conceive a charged particle 
bound to a position of equilibrium by a quasi- 
elastic force. Furthermore suppose its energy at 
the moment of observation is large in comparison 
with the agitation of the surrounding absorber 
particles. There is then an overwhelming proba- 
bility that the oscillator will lose energy to the 
absorber at a rate in close accord with the law of 
radiative damping. What can be said of the 
particle prior to the moment of acceleration? In 
an ideal absorbing system completely free of 
special disturbances, there is an equally over- 
whelming chance that the energy of the charge 
was then increasing at a rate given approximately 
by the inverse of the law of radiative damping. 
In this case as in heat conduction the abnormally 
high energy of the object is to be interpreted as 
the result of a statistical fluctuation. However, 
that the sun at some past age acquired its energy 
by such a fluctuation no one now would seriously 
propose. Obviously the universe is a special 
system with respect to the origin of which 
probability considerations cannot freely be 
applied. 

We conclude that radiation and radiative
damping come under the head, not of pure
electrodynamics, but of statistical mechanics.
The conventional expression for the force of
radiative reaction, like those for frictional re- 
sistance and viscous drag, represents a statistical 
average only. Application of this concept is not 
required in such an instance as the case of 
complete thermodynamical equilibrium, where 
the relative fluctuations of the actual forces 
about the conventional values are substantial. 
The concept of radiative damping is of real 
value only when we deal with the conversion of 
organized into disorganized energy, as in wireless 
transmission or light production. 


COMPLETE AND INCOMPLETE ABSORPTION 


In the picture of radiation which we have 
built on the foundation of Tetrode’s suggestion, 
the absorber plays a role of hitherto unsuspected 
importance. On this account we should investi- 
gate not only how much the mechanism depends 
upon the completeness of the interception, but 
also the question what should be said of the 
absorption in the case of the actual universe. 

In discussing the case of incomplete intercep- 
tion, we require a convenient means to take into 
account the initial conditions which so clearly 
control the irreversibility of the force of radiative 
reaction. For this reason we shall break down 
the half-retarded, half-advanced fields of the 
theory of action at a distance into three parts as 
shown in Table I. With this decomposition of
the field, we arrive at a description of the 
behavior of a system of particles which is entirely 
equivalent to the theory of action at a distance 
but which in the equation of motion, 


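$$m_a \ddot a_m = e_a \sum_{k\neq a} F^{(k)}_{mn,\mathrm{ret}}\,\dot a^n + (2e_a^2/3)(\dot a_m \dddot a_n - \dddot a_m \dot a_n)\,\dot a^n + e_a F_{mn,\mathrm{inc}}\,\dot a^n, \qquad (44)$$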


conceals from view the existence of the advanced 
part of the fields of Schwarzschild and Fokker. 
We shall find it convenient to use for the field 
decomposition of Table I and the dynamical 
Eq. (44) the term “retarded field formulation of 
electrodynamics.” 
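
As an informal check on the bookkeeping of Table I, the following sketch verifies numerically that the three parts of the decomposition add up to the half-retarded, half-advanced total acting on particle a. It is only an illustration: each field is stood in for by an arbitrary number, and the function names `symmetric_total` and `retarded_decomposition` are assumptions introduced here, not notation of the theory.

```python
import random


def symmetric_total(ret, adv, a):
    """Half-retarded plus half-advanced fields of every particle k != a."""
    return sum(0.5 * ret[k] + 0.5 * adv[k] for k in range(len(ret)) if k != a)


def retarded_decomposition(ret, adv, a):
    """Return the three parts of Table I for particle a.

    ret[k] and adv[k] are plain numbers standing in for the retarded and
    advanced fields of particle k (an illustrative simplification).
    """
    # (1) retarded fields of all other particles
    retarded_part = sum(ret[k] for k in range(len(ret)) if k != a)
    # (2) "radiation field" of particle a itself
    radiation_part = 0.5 * ret[a] - 0.5 * adv[a]
    # (3) "incident field": sum over ALL particles, including a
    incident_part = sum(0.5 * adv[k] - 0.5 * ret[k] for k in range(len(ret)))
    return retarded_part, radiation_part, incident_part


rng = random.Random(0)
n = 6
ret = [rng.uniform(-1.0, 1.0) for _ in range(n)]
adv = [rng.uniform(-1.0, 1.0) for _ in range(n)]
for a in range(n):
    parts = retarded_decomposition(ret, adv, a)
    assert abs(symmetric_total(ret, adv, a) - sum(parts)) < 1e-12
```

The identity holds for any values of the stand-in fields, which is the sense in which the retarded formulation is equivalent to the theory of action at a distance: nothing has been added or removed, only regrouped.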

The field which enters the third term in the 
equation of motion (44) vanished in the case of 
a completely absorbing system. Its appearance 
in the present case has led us to give it the name
of “incident field,” which Dirac applied to a 
quantity having an identical role in the equation 
of motion. However, on the origin of this field 
we go beyond Dirac’s treatment in giving a 
prescription for its unique determination in terms 
of the movements of all the particles of the 
system. This prescription reveals that the field
in question contains the advanced effects of the 
theory of action at a distance. 

Some properties of the incident field may be 
noted before use is made of this concept in the 
analysis of special problems. The quantity $F_{\mathrm{inc}}$
has a singularity at the site of none of the charged 
particles. Consequently it satisfies Maxwell’s 
equations for free space. Although completely 
determined by the motion of the charges, it thus 
has the character of a disturbance produced by 
sources at infinity. Now we already have in
the retarded field, $\sum_{k} F^{(k)}_{\mathrm{ret}}$, a quantity
whose behavior at all distances is likewise 
uniquely fixed by the motions of all the charges. 
Consequently we can expect to be able to deduce 
the incident field everywhere from a knowledge 
of the retarded field at large distances from the 
system of particles. Thus, in the determination 
of the incident fields we can, if we wish, avoid 
explicit reference to the movements of the 
charges, and base our considerations on the 
asymptotic behavior of their retarded fields 
alone. This point will be clearer after a considera- 
tion of a few examples, and can then be formu- 
lated in a general mathematical form as a by- 
product of an investigation primarily aimed at 
examining the problem of complete and incom- 
plete absorption. 

The simplest example will be the idealized 
case of a single charged particle, alone in otherwise
charge-free space, which is accelerated either
by the gravitational attraction of a passing mass
or by some other non-electromagnetic force. For
the three electromagnetic forces of the equation 
of motion (44) we then have the following 
accounting: (1) There are no other particles, so 
the retarded field of the first term vanishes. 
(2) The second term is different from zero and 
represents the conventional force of radiative 
reaction. (3) The incident field of the third term 
is in the present case equal to half the advanced
field minus half the retarded field owing to the
particle itself. If we imagine the acceleration of
the charge to be limited to a short stretch of 
time, then the incident field represents a dis- 
turbance which, long before the moment in 
question, was converging upon the particle from 
great distances. It focuses upon the particle at 
the period of acceleration and subsequently ap- 
pears as a wave diverging from the charge. This 
disturbance, apparently produced by sources at 
infinity, exerts on the particle a force which is 
just sufficient in magnitude and in sign to cancel 
the normal force of radiative reaction. The de- 
scription just given is the rather involved trans- 
lation into the language of the retarded field 
theory of the conclusion immediately apparent 
from the theory of action at a distance with its 
half-advanced, half-retarded fields; an isolated 
charge neither experiences a force of radiative 
reaction nor radiates away electromagnetic
energy. 

The incident field of the preceding problem 
could have been determined equally well without 
knowledge of the motion of the particle itself, by 
reference to the retarded field, $F_{\mathrm{ret}}$, of the charge
at large distances. The latter quantity represents 
an electromagnetic disturbance which was negli- 
gible before the moment of acceleration, and 
which considerably later than that instant had 
the character of a diverging spherical wave. We 
can find a solution, S, of Maxwell’s equations 
for free space, the diverging wave in the asymp- 
totic expansion for which has exactly the same 
behavior as the field $-\tfrac{1}{2}F_{\mathrm{ret}}$. By this condition
the solution in question is furthermore uniquely 
determined. On this account it must be identical 
with another field which also satisfies Maxwell’s 
equations for free space and has for its diverging 
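wave at large distances the same form as $-\tfrac{1}{2}F_{\mathrm{ret}}$;
namely, the incident field, $F_{\mathrm{inc}} = \tfrac{1}{2}F_{\mathrm{adv}} - \tfrac{1}{2}F_{\mathrm{ret}}$.
Consequently we may write $F_{\mathrm{inc}} = S$, where $S$ is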
the solution of Maxwell’s equations defined as 
above. From this means of arriving at the value 
of the incident field we conclude that the incident 
field is that solution of the wave equation for 
free space which, when added to the known 
retarded field, $F_{\mathrm{ret}}$, will reduce by one-half the
strength of the diverging wave in the asymptotic
representation of $F_{\mathrm{ret}}$.
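
The bookkeeping behind this conclusion can be written out in two lines, using only the definition of the incident field given in this paragraph. The subscript "div", denoting the diverging part of a field in its asymptotic representation at late times, is a label introduced here for illustration, and the second line assumes, as in the example, an acceleration limited to a short stretch of time, so that the advanced field of the charge contributes no diverging wave.

```latex
\begin{align}
F_{\mathrm{ret}} + F_{\mathrm{inc}}
  &= F_{\mathrm{ret}} + \tfrac{1}{2}F_{\mathrm{adv}} - \tfrac{1}{2}F_{\mathrm{ret}}
   = \tfrac{1}{2}F_{\mathrm{ret}} + \tfrac{1}{2}F_{\mathrm{adv}}, \\
\bigl[F_{\mathrm{ret}} + F_{\mathrm{inc}}\bigr]_{\mathrm{div}}
  &= \bigl[F_{\mathrm{ret}}\bigr]_{\mathrm{div}} - \tfrac{1}{2}\bigl[F_{\mathrm{ret}}\bigr]_{\mathrm{div}}
   = \tfrac{1}{2}\bigl[F_{\mathrm{ret}}\bigr]_{\mathrm{div}}.
\end{align}
```

The first line recovers the half-advanced, half-retarded field of the theory of action at a distance; the second is the statement that the diverging wave is reduced by one-half.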

FIG. 1. Advanced effects in two examples of an
incompletely absorbing system.

As next idealized example of incomplete absorption
we consider a source at the center of a
blackbody with two opposed openings out into
charge-free space (Fig. 1). In those directions
not shielded by the absorber an incident field— 
to use the language of the retarded field theory 
—will enter, converge on the radiating particle, 
diverge and go out to infinity, as in the preceding 
example. This time, however, the wave incident 
from one side covers but the amount $\Omega$ of the
whole solid angle of $4\pi$. Consequently at the
instant when it is focused upon the source, it 
reduces the force of radiative reaction only 
fractionally below the conventional value of this 
force. The fraction in question is equal in the 
case of a pair of small opposed openings and a 
slowly moving source to the product of the 
following factors: 


$\Omega/4\pi$,
the fraction of the whole solid angle spanned by one
hole.

2,
factor allowing for the existence of the two openings.

$\tfrac{3}{2}\sin^2(\mathfrak{A}, \mathbf{r})$,
factor allowing for the orientation of the holes relative
to the direction of acceleration. Here $(\mathfrak{A}, \mathbf{r})$ is the
angle between the vectorial acceleration of the source
and the vector from this particle to one opening. The
given expression for the polarization factor assumes
that the quantities $d\mathfrak{A}/dt$ and $\mathfrak{A}$ are parallel, and has
to be replaced by a more complicated term when this
parallelism does not exist.


When the source is moving at the time of acceleration
with a speed comparable to that of light,
then the angular distribution of the radiation is
represented by an expression more complicated
than $\sin^2(\mathfrak{A}, \mathbf{r})$ but the general principle is the
same.
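
The product of the three factors is easily evaluated. The sketch below is a minimal illustration for a slowly moving source and a pair of small opposed openings; it assumes that the polarization factor is normalized as (3/2) sin² of the angle, so that its average over all directions is unity, and the function names are introduced here for convenience only.

```python
import math


def polarization_factor(angle):
    """(3/2) sin^2 of the angle between the acceleration and the direction of a hole."""
    return 1.5 * math.sin(angle) ** 2


def reaction_reduction_fraction(solid_angle, angle):
    """Fraction by which the radiative reaction falls below its conventional value.

    solid_angle: solid angle (steradians) spanned by ONE of the two small holes.
    angle: angle (radians) between the acceleration and the direction of one hole.
    """
    fraction_of_sphere = solid_angle / (4.0 * math.pi)
    two_openings = 2.0
    return fraction_of_sphere * two_openings * polarization_factor(angle)


def sphere_average_of_polarization(steps=2000):
    """Midpoint-rule average of the polarization factor over all directions."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        # element of solid angle divided by 4*pi is sin(theta) d(theta) / 2
        total += polarization_factor(theta) * math.sin(theta) * 0.5 * h
    return total


# Holes of 0.01 steradian each, at right angles to the acceleration.
print(reaction_reduction_fraction(0.01, math.pi / 2))
```

Holes lying along the line of the acceleration give no reduction at all, since the source sends nothing in that direction; the normalization of the angular factor to unit average is what allows the reduction to approach unity as the openings grow to cover the whole sphere.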

In the present case of an absorber complete 
except for two inversion-symmetric openings to 
charge-free space we conclude that there is a 
continuous transition, as the size of the apertures 
is increased, from the full conventional force of 
radiative reaction on a central source, to the 
case of no radiative reaction at all. Furthermore, 
we note that a test charge placed in one of the 
two openings will receive a disturbance some 
time before as well as some time after the 
moment when the source itself is given its 
acceleration. Generally we may say that the 
explicit appearance of advanced effects is un- 
avoidable in the case of a system which is an 
incomplete absorber. However, in neither of the 
examples so far examined do advanced fields of 
the source produce explicit advanced effects on 
any other particle than a test charge: in the first 
example, because there is no other charged 
particle; and in the second case, because the 
incident field is restricted to a region of space 
where there are no particles to be disturbed, 
except a possible test particle. 

ADVANCED EFFECTS ASSOCIATED WITH 
INCOMPLETE ABSORPTION 


Recognizable advanced effects appear for the 
first time in a third example, a source at the 
center of a cavity completely absorbing except 
for a single passage to charge-free space. (See 
Fig. 1.) For simplicity we consider the source to 
be a slowly moving particle. Also we shall denote 
as “antipassage” that portion of the absorber
which is marked out by the inversion, with 
respect to the source, of the passage itself. Apart 
from the fields which come from and go to the 
passage and antipassage, we have the usual 
solution of the problem of a completely absorbing 
system. In the language of the retarded formulation
of field theory, we can say that there is in
all other directions no incident field, and that 
in those directions the absorber experiences only 
a normal full strength retarded field as a result 
of the acceleration of the source. In the language 
of the theory of action at a distance the result 
is the same. The source gives out a half-advanced,
half-retarded field, the advanced part of which 
in the direction of the normal portions of the 
absorber is cancelled by a portion of the advanced 
fields generated in the absorber itself. The remaining
portion of this advanced field combines
with the half-strength retarded field of the source 
to give the full retarded disturbance demanded 
by experience. The advanced field of the absorber 
at the location of the source itself produces a 
force of radiative reaction which is below the 
conventional amount in proportion only to the 
small solid angle which we have so far left out 
of consideration. So far the results are quite as 
expected. 

We have now to consider the effect on the 
antipassage of what, in the theory of action at a
distance, is the half-advanced field of the source.
If the passage itself were filled with absorbent, 
this material would have generated a field, the 
advanced part of which would have compensated 
the effect we are now considering. As it is, we 
visualize two possible solutions of the problem 
of motion. In the first, the uncompensated half- 
advanced field of the source sets into motion 
ahead of time the particles on the inner face of 
the antipassage. In the second solution, this 
advanced field is compensated by a mechanism 
yet to be explained, and the particles in question 
are not disturbed before the moment of accelera- 
tion of the source. 

In the first solution, not illustrated in Fig. 1, 
the particles on the inner face of the antipassage 
spontaneously accelerate at a moment sufficiently 
early so that their retarded fields reach the source 
at just the moment when it is radiating. The 
retarded field of the charges of the antipassage, 
evaluated in the cavity, has in the present 
solution the following properties: (1) It vanishes 
except in the directions of the passage and anti- 
passage. (2) In those directions it has a value 
completely determined by the movement of the 
source. (3) It combines with the Schwarzschild- 
Fokker field of the source to cancel its outgoing 
component travelling in the direction of the 
passage and to build up its advanced component
on the side of the antipassage to a full strength 
advanced wave. The combined advanced wave 
has a magnitude exactly sufficient to account for 
the disturbance of particles of the antipassage 
ahead of time. (4) The field of these particles at 
the location of the source acts in the opposite 
sense from the conventional force of radiative 
reaction. The magnitude of that force is reduced 
by a fraction which, apart from a polarization 
factor, is equal to twice the solid angle subtended 
by the passage, divided by $4\pi$.

The particles of the antipassage, in addition 
to the anticipatory movements already discussed, 
undergo, after the moment of acceleration of the 
source, a disturbance similar to that experienced 
by the charges which neighbor them on the inner 
face of the cavity. In this way they are caused 
to generate fields, the advanced part of which 
(1) combines with the half-retarded field of the 
source in the given direction to produce a full 
retarded disturbance that accounts for the mo- 
tions in question and (2) cancels the advanced 
field of the source in the direction of the passage. 
Thus neither advanced nor retarded disturbance 
emerges from this passage to be detected by an 
external test charge. 

The self-consistent solution which we have 
just described in terms of the symmetrical theory 
of action at a distance is easily summarized in 
the language of the retarded field theory. The 
sources of radiation are the central particle and 
the charges on the inner face of the antipassage. 
Each is considered to experience the conventional 
force of radiative reaction and to produce only 
retarded fields. The fields from the disturbed 
charges on the inner face of the cavity focus on 
the central charge at the moment of its accelera- 
tion, thus (1) partially compensating the con- 
ventional force of radiative reaction and (2) 
cancelling that part of its retarded field which 
is travelling in the direction of the passage. No 
retarded field gets outside the system. The inci- 
dent field, determinable as we have seen before 
from the asymptotic behavior of the retarded 
field, consequently vanishes. Thus in the given 
illustration the equation of motion of each 
particle contains only the retarded fields of all 
the other particles, plus the full conventional 
force of radiative damping, a conclusion con- 
sistent with the solution which we have just 
given. To complete the picture, we have to
express in terms of these equations of motion, 
the explanation of the early motion of the 
particles on the inner face of the antipassage. 
This movement we attribute to the influence of 
the retarded fields coming from other portions 
of the wall of the cavity, and reinforcing at just 
this particular region of the surface. The same 
type of reasoning can be followed back step by 
step in the past, along a course which is very 
like the reversal in time of the mechanism by 
which a burst of radiative energy dissipates 
itself. Granted that the existence of incomplete 
absorption requires, in our example, the explicit 
occurrence of advanced effects, we have in the 
given solution one reasonable picture how these 
advanced effects may build up until the time of 
disturbance of the source and may then be 
followed by a succession of retarded effects of 
magnitudes diminishing as absorption and re- 
flection at the inner walls of the cavity have 
their effects. 

In this third example an equally consistent 
solution of the problem of advanced effects may 
be briefly outlined (Fig. 1). The fields are in this 
case such that a test charge placed within the 
cavity experiences only the full retarded field of 
experience. The particles on the wall of the 
cavity are set into motion only after the time 
when the source was struck and caused to 
radiate. These particles, by the now familiar 
mechanism of absorber response, generate fields, 
the advanced parts of which in the cavity 
cancel the advanced field of the source and bring 
its retarded field up to full strength. In particular, 
the advanced field of the source in the direction 
of the passage is compensated, so that an external 
test charge on that side of the absorber will 
experience no advanced disturbance. It will, 
however, on our picture undergo a full strength 
retarded disturbance. How the half-retarded 
disturbance of the source in this direction is built 
up to full strength, and how the half-advanced 
field in the direction of the antipassage is can- 
celled, is a question still to be cleared up. For 
explanation we cannot (1) call upon the advanced 
fields of the absorber in the direction of the 
passage, for there is no matter in this direction. 
Nor can we (2) call upon retarded fields of 
particles on the inner face of the antipassage for 
our purpose—although they lie in just the right 
direction thus to produce the field in question—
because they have been assumed not to have 
been set in motion prior to the time of accelera- 
tion of the source. To account for the directional 
field of so far unknown origin in the cavity, we 
are by (1) restricted to an explanation in terms 
of a retarded wave from some source yet to be 
found, lying in the direction of the antipassage 
and by (2) this source cannot consist of the 
particles on the inner face of that portion of the 
absorber. Consequently we must interpret the 
field in question as owing to particles set in 
motion ahead of time on the outer face of the 
antipassage. This conclusion, like many of the 
considerations to which we are led in the study 
of incompletely absorbing systems, appears para- 
doxical. Nevertheless it leads to a self-consistent 
solution of our problem. In the first place, the 
half-retarded field of the surface particles com- 
pensates within the antipassage as well as within 
the cavity the half-advanced field of the source. 
There is, therefore, no question of the propaga- 
tion of any disturbance through the thickness of 
the absorber. Secondly, the half-advanced field 
of the surface particles appears from the point 
of view of an external test particle to be a wave 
front of limited cross section which comes from 
outer space and which would converge upon the 
source if the antipassage did not block its path. 
This half-advanced field in the region exterior to 
the absorber adds to the half-advanced field 
from the central source to give a full strength 
disturbance convergent upon the surface particles 
in question. Thus an account is given for the 
force and for the acceleration which they experi- 
ence ahead of the time of acceleration of the 
source within the cavity. The energy which is 
absorbed at the outer surface is later paid out by 
the source. That accelerated particle experiences 
the full conventional force of radiative reaction. 

The solution just given is readily translated 
into the language of the retarded formulation of 
field theory, where the force on each particle is 
attributed to three sources: the conventional 
force of damping, the retarded fields of all other 
particles and an incident field. The incident field 
has the appearance of a disturbance convergent 
from outer space upon the external face of the 
antipassage. By it the particles are accelerated 
and caused to generate a full strength retarded 
field which at greater depths within the medium
cancels the incident field, a phenomenon which
is the normal mechanism of absorption. Thus 
there is within the cavity no disturbance con- 
vergent upon the source, and consequently only 
the usual retarded effects are observed, part of 
which also is propagated out through the passage 
into charge-free space. 

In the retarded formulation of field theory, 
there is no apparent reason for a correlation 
between the surface absorption and the radiation 
process within the cavity. The requirement for 
a connection between the two comes into evi- 
dence only in the condition which must be 
satisfied by the incident field, and which has been 
discussed above. That there must be such a field 
follows from the existence of a full strength 
retarded field, $R$, diverging outward from the
source through the passage in the absorber.
Therefore, denote the strength of the incident
field in this region of space and at this instant of
time by $\eta R$, where $\eta$ is a factor now to be found.
Being also divergent from the source, but free of 
singularities there, the incident field must in this 
neighborhood and in the given cone of directions 
be a multiple of the radiation field of the source. 
On being followed backwards in time to moments
previous to the acceleration of the particle, it
must, therefore, have in the direction of the
antipassage the magnitude $-\eta A$, where $A$ denotes
the advanced solution of Maxwell’s equation
for the accelerated source. Thus the field
incident from great distances upon the particles
on the outer surface of the antipassage must have
the magnitude $-\eta A$. These particles generate a
retarded field which within the absorber compensates
the incident field and therefore has the
magnitude $+\eta A$. This field, followed onward in
the direction of the original source, where it 
naturally has no singularity, at first converges 
and then diverges to give the appearance of a 
retarded wave from the source itself. In this 
neighborhood the retarded field of the surface 
particles behaves much as does the radiation 
field of the source. Consequently the strength of
the field in question, evaluated in the direction
of the passage, is $-\eta R$. Thus the sum of the
retarded fields of all the particles of the system,
evaluated outside the passage way, is $R$ (from
the source) $-\eta R$ (from the surface particles).
To determine the strength of the incident field,
we now apply the condition that the divergent
term, $\eta R$, in its asymptotic representation must
have a strength equal to $-\tfrac{1}{2}$ times that of the
divergent wave owing to the retarded fields of
all the particles of the medium:


$$\eta R = -\tfrac{1}{2}(R - \eta R). \qquad (45)$$


The solution of this implicit equation gives for
the magnitude of the incident wave in the direction
of the passage $\eta R = -R$, and consequently
for the strength of the incident wave converging
upon the other side of the system $+A$. Thus we
check the properties of this second solution of 
our problem as obtained previously by using 
the language of half-advanced, half-retarded 
fields, with no reference to the concept of 
incident field. 
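
The implicit condition of Eq. (45) is linear in the unknown factor and can be solved mechanically. The sketch below is only an illustration: it writes the unknown factor as `eta`, divides the condition through by R, and uses exact rational arithmetic; the function name `solve_incident_factor` is introduced here and is not part of the theory.

```python
from fractions import Fraction


def solve_incident_factor(weight=Fraction(-1, 2)):
    """Solve eta = weight * (1 - eta), i.e. Eq. (45) divided through by R.

    Rearranged: eta * (1 + weight) = weight.
    """
    if 1 + weight == 0:
        raise ZeroDivisionError("no unique solution when weight == -1")
    return weight / (1 + weight)


eta = solve_incident_factor()
retarded_sum = 1 - eta   # R from the source minus eta*R from the surface particles, in units of R
incident = eta           # incident field in the direction of the passage, in units of R

print(eta)                      # -1
print(retarded_sum + incident)  # 1
```

In units of R the retarded fields of all the particles thus sum to 2 outside the passage and the incident field contributes -1, leaving the single full strength retarded disturbance which an external test charge is expected to experience.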

Between the two self-consistent solutions of 
this third example of an incomplete absorber we 
make no attempt to choose. We have to accept 
the fact that the dynamical system in question 
possesses a number of degrees of freedom which 
is in direct proportion to the number of particles 
present. Once it is granted that advanced effects 
of some kind must be connected with the acceler- 
ation of the source, it does not follow uniquely 
upon which particles these advanced effects must 
act. The selection is a matter of initial conditions,
not of equations of motion. The two solutions so 
far described are only two relatively simple 
samples from an infinite number of possible 
solutions, distinguished from one another by the 
requirements put upon the initial state of the 
particles of the absorber. It is only in the case 
of a completely absorbing system that there is 
the possibility to find a set of initial conditions 
which is relatively well determined by statistical 
considerations. 

Self-consistency being the only requirement 
which has to be met by a solution of the problem 
of an incomplete absorber, and this requirement 
in the retarded field formulation being largely 
contained in a condition to be satisfied by the
incident field, it may be of interest to have our 
so far informal statement of this relation put 
into mathematical terms. The condition in ques- 
tion furnishes a connection between the incident 
field, here abbreviated as $I$, and the sum, $R$, of
the retarded fields of all the particles.

To derive the desired relation, we note that
the present formulation of electromagnetic
theory expresses the incident field as half the
difference between the advanced and retarded
fields owing to all the particles. Thus the advanced
field of the system is given by the
expression $R + 2I$. From (1) a knowledge of this
advanced field everywhere in space and (2) a
knowledge of the values taken on the points, $\xi$,
of a surface surrounding the system by an arbitrary
solution, $S$, of Maxwell's equations for the
same charge distribution, we can derive the
values of this arbitrary solution at all other
points in space, $x$, from the relation


$$S(x) = \bigl(R(x) + 2I(x)\bigr) + \frac{1}{2\pi}\Bigl\{ \iiint d\xi^2\, d\xi^3\, d\xi^4\,\bigl(\delta\,\partial S/\partial\xi^1 - S\,\partial\delta/\partial\xi^1\bigr)\Big|_{\xi^1_{\min}}^{\xi^1_{\max}} + \cdots - \iiint d\xi^1\, d\xi^2\, d\xi^3\,\bigl(\delta\,\partial S/\partial\xi^4 - S\,\partial\delta/\partial\xi^4\bigr)\Big|_{\xi^4_{\min}}^{\xi^4_{\max}} \Bigr\}. \qquad (46)$$


In Eq. (46) the symbol $\delta$ stands for the delta
function, $\delta\bigl((x^m - \xi^m)(x_m - \xi_m)\bigr)$. The integral is to be
taken only over the immediate neighborhood of
those points on the surface from which an advanced
wave can reach the point $x^1$, $x^2$, $x^3$ at the
time $x^4$. In the first integrand we have the difference
between the values of a certain quantity
calculated for the largest and smallest values of
$\xi^1$ consistent with given values of $\xi^2$, $\xi^3$, $\xi^4$; and
similarly for the other three integrals. We shall
write Eq. (46) symbolically in the form


$$S = (R + 2I) + \mathrm{Adv.}\,[S]. \qquad (47)$$


We now apply this general relation to the
special half-advanced, half-retarded solution of
Maxwell’s equations for the system of charges,
$S = R + I$. In this way we arrive at an implicit
equation by means of which to derive the incident
field from the retarded field:


$$0 = I + \mathrm{Adv.}\,[R + I]. \qquad (48)$$


This relation is the generalization of Eq. (45)
from which we determined the strength of the
incident field in the third example above. With
this generalization we end our study of the
behavior of idealized systems with incomplete
absorption and come to the wider question what
we should say about the absorbing properties of
the system with which we have to deal in nature.

8 The analogue of Eq. (46), for determination of the
arbitrary solution from a knowledge of the retarded
solution, has been given in a rather different form by W.
R. Morgans, Phil. Mag. 9, 148 (1930). The present form
is most easily derived by use of the relation

$$(\partial/\partial\xi^m)(\partial/\partial\xi_m)\,\delta\bigl((x^n - \xi^n)(x_n - \xi_n)\bigr) = -4\pi\,\delta(x^1 - \xi^1)\,\delta(x^2 - \xi^2)\,\delta(x^3 - \xi^3)\,\delta(x^4 - \xi^4),$$

and application of Green’s theorem in four dimensions.

There would be no problem in interpreting
the universe as a completely absorbing system
if it were an indefinitely extended Euclidean 
space. The existence of the electron-positron 
field gives a mechanism by which, even in a
vacuum, radiation of some frequencies can 
undergo absorption processes, and light of all 
wave-lengths can be scattered. These processes 
are sufficient ultimately to degrade all the 
radiation given out by an accelerated charge.

The universe is however now generally re- 
garded as a closed space, in harmony with the 
illuminating theory put forward by Einstein. In 
this space present observations suggest that the 
absorption of radiation is far from complete even 
at the greatest depths so far plumbed, of the 
order of one-tenth the calculated radius of the 
universe. If this conclusion is correct, then a 
complete electrodynamic description of the
mechanism of radiation would require us to take 
into account not only the curvature of space but 
also the phenomena stmmarized under the term 
“expanding universe.” At the present time we 
know too little about these matters to carry out 
such a complete description. Moreover, there is 
yet no compelling reason to attempt this de- 
scription. We know of course that electro- 
dynamics remains, in other respects as well, to 
be tied to gravitational phenomena. But we 
recognize that in this sense our present theory 
of electrodynamics, like the theories in all other 
parts of science, is an idealization. 

So long as we limit ourselves to the idealization 
based on the concept of a Euclidean space, we 
have to consider the question of complete and 
incomplete absorption on a purely empirical 
basis. In this connection we will obtain a satisfactory
account of experience, as we have seen,
on the assumption that the universe behaves as 
a completely absorbing system. 


PRE-ACCELERATION 


Is there in the case of a completely absorbing 
universe any consequence of the act of radiation 
which is so apparently paradoxical as the obvi- 
ously advanced effects encountered in the in- 
stance of an incompletely absorbing system? If 
so, what words can we reasonably use to assimi- 
late such a phenomenon into our experience? 

Any advanced effects to appear in the case of 
a completely absorbing system must be deducible 
from the conventional force of radiative reaction, 
for the only other electrodynamical effects ap- 
pearing in this case in the equation of motion 
(42) of the typical particle are the retarded fields 
of all other particles. That the damping term 
does lead to an advanced effect follows from an 
interesting example already considered by Dirac.® 
A source sends a sharp pulse of radiation towards 
a particle of charge e and mass m. At the instant 
of arrival the speed of the particle would be 
expected abruptly to increase if the force of 
damping were proportional to the first derivative 
of the displacement. Actually the radiative re- 
sistance is proportional to the third derivative 
of the displacement, and the nature of the solu- 
tion of the equation of motion is changed. The 
particle commences to move before the time of 
arrival of the pulse; and $e^2/mc^3$ seconds ahead of
time it attains a velocity comparable with its 
final speed. 
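The behavior described here can be checked with a minimal sketch. It assumes the non-relativistic equation of motion $m(\dot v - \tau_0 \ddot v) = F(t)$ with $\tau_0 = 2e^2/3mc^3$, an impulsive force $F = J\,\delta(t)$, and the solution that stays bounded after the pulse. The function name and the rounded c.g.s. constants are illustrative choices, not taken from the paper.

```python
import math

# Rounded c.g.s. values for the electron (assumed for illustration).
E_CHARGE = 4.803e-10   # esu
E_MASS = 9.109e-28     # g
C_LIGHT = 2.998e10     # cm/s

# Characteristic time of the radiative reaction term, 2e^2 / 3mc^3.
TAU0 = 2.0 * E_CHARGE**2 / (3.0 * E_MASS * C_LIGHT**3)


def velocity_fraction(t):
    """Velocity divided by its final value J/m, for a pulse arriving at t = 0.

    For t < 0 the bounded solution of m(v' - TAU0 v'') = J delta(t)
    grows as exp(t / TAU0); for t >= 0 it stays at the final value.
    """
    return math.exp(t / TAU0) if t < 0 else 1.0


if __name__ == "__main__":
    lead = E_CHARGE**2 / (E_MASS * C_LIGHT**3)   # e^2 / mc^3 = 1.5 * TAU0
    print(f"tau0 = {TAU0:.3e} s")
    print(f"fraction of final speed at t = -e^2/mc^3: {velocity_fraction(-lead):.3f}")
```

Under these assumptions the velocity at a time $e^2/mc^3$ before the pulse is $\exp(-3/2)$ of its final value, roughly one-fifth, which is the sense in which it is already comparable with the final speed.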

As a suitable way to speak of this most interesting
phenomenon of pre-acceleration brought
to light by him, Dirac suggests saying that “it is
possible for a signal to be transmitted faster
than light through the interior of an electron.
The finite size of the electron now reappears in
a new sense, the interior of the electron being a
region of failure, not of the field equations of
electromagnetic theory, but of some of the elementary
properties of space-time.” This choice
of language is perhaps suitable in certain respects
to describe the pre-acceleration of the single 
charge in the example considered by Dirac. It 
may also be of value in other special instances. 
However, the given mode of speaking suggests 
in the case of a medium of closely packed charges 
the possibility of transmission of signals with a
speed greater than that of light over microscopic
distances, a conclusion which appears to be
denied by a direct investigation of the point.
Also the idea that the properties of space-time
fail in a region of the order of $e^2/mc^2$ around a
charge appears to have possibilities of suggesting
misleading conclusions sufficiently great to call 
for a later search for a more suitable means of 
expression. 

We shall now attempt to test the idea suggested
by the term “speed greater than that of
light” that the phenomenon of pre-acceleration
might be cumulative when charges are spaced at
a distance from one another comparable to the
quantity $e^2/mc^2$. The method of analysis will be
very nearly that followed by Sommerfeld¹⁴ and
Brillouin¹⁵ in their classic resolution of the question
how it can be that the speed of propagation
of a disturbance in a dispersive medium never 
exceeds the velocity of light even when the phase 
velocity for certain frequencies is far above this 
upper limit. The only significant mathematical 
difference between the two cases is the change of 
the damping force from proportionality to the 
first power of the frequency to proportionality 
to the third power. 

The first step in the procedure of Sommerfeld 
and Brillouin is to determine the refractive index 
of the medium, $n$, as a function of frequency.
The charges of the material are assumed normally 
to be at rest. Consequently the magnetic per- 
meability is unity. According to the standard 
result of electromagnetic theory the square of 
the refractive index is in this case equal to the 
dielectric constant: 


$$n^2 = \frac{\text{(electric field in a thin slot cut normal to the field)}}{\text{(electric field in a thin cavity cut parallel to the field)}} = (E+4\pi P)/E = 1+(4\pi P/E). \tag{49}$$


Here P represents the polarization of the 
medium: 


$$P = (\text{number of charges per unit volume}) \cdot (\text{charge of each}) \cdot (\text{displacement from equilibrium}). \tag{50}$$


14 A. Sommerfeld, Ann. d. Physik 44, 177 (1914).
15 L. Brillouin, Ann. d. Physik 44, 203 (1914).

For the force which determines the displacement
of the charges in a homogeneous isotropic
medium it is reasonable, according to Lorentz
and Lorenz, to take the result valid for a cavity
of spherical form,


$$(\text{force}) = (\text{charge per particle})\,(E + (4\pi P/3)). \tag{51}$$


The displacement itself is related to the force by 
the equation of motion, 


$$(\text{force}) = m\,\frac{d^2}{dt^2}(\text{displacement}) + (\text{a constant}) \cdot (\text{displacement}) - (2e^2/3c^3)\,\frac{d^3}{dt^3}(\text{displacement}). \tag{52}$$


Here we have visualized in the second term of 
the right side the possibility that the particles 
are bound to equilibrium positions by elastic 
forces. Without such forces we should be led by 
Earnshaw’s theorem to expect that the medium 
would form a dynamically unstable system. We 
now follow Lorentz and Lorenz in selecting a 
function of refractive index which is easy to 
evaluate as a function of frequency: 


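$$\frac{3(n^2-1)}{n^2+2} = \frac{4\pi P}{E + (4\pi P/3)} = \frac{4\pi\,(\text{particle density})\,(\text{charge})^2\,(\text{displacement})}{m\,\dfrac{d^2}{dt^2}(\text{disp}) + (\text{const})\,(\text{disp}) - (2e^2/3c^3)\,\dfrac{d^3}{dt^3}(\text{disp})}. \tag{53}$$
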
We consider a monochromatic disturbance of
circular frequency $(3mc^3/2e^2)\omega$, and express the
elapsed time as $(2e^2/3mc^3)\tau$, where both $\tau$ and $\omega$
are dimensionless quantities. We assume that
the displacement of the typical particle varies
with time as $\exp(-i\omega\tau)$. Also we express the
number of particles per unit volume in the form
$(N/3\pi)(3mc^2/2e^2)^3$, where $N$ is also a magnitude
without dimensions. Then from (53) we obtain
the refractive index as a function of frequency in
the form:


$$n(\omega) = \left[1 + 2N/(\omega_0^2 - \omega^2 - i\omega^3)\right]^{1/2}. \tag{54}$$

Here we have introduced the abbreviation

$$\omega_0^2 = (2e^2/3mc^3)^2\,(\text{constant}/m) - (2N/3), \tag{55}$$


where $\omega_0$ is a measure of the extent to which the
assumed quasi-elastic force over-compensates the 
otherwise inherent electrical instability of the 
system. 
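As a consistency check on Eqs. (53) to (55), the following sketch evaluates the refractive index of Eq. (54) and confirms numerically that it satisfies the Lorentz-Lorenz relation (53) once the density is written as $(N/3\pi)(3mc^2/2e^2)^3$. The function names and the sample values of $N$ and $\omega_0^2$ are assumptions made for illustration only.

```python
import cmath


def refractive_index(w, n_density, w0_sq):
    """Eq. (54): n = [1 + 2N / (w0^2 - w^2 - i w^3)]^(1/2), dimensionless units."""
    return cmath.sqrt(1.0 + 2.0 * n_density / (w0_sq - w**2 - 1j * w**3))


def lorentz_lorenz(n):
    """Left side of Eq. (53): 3 (n^2 - 1) / (n^2 + 2)."""
    return 3.0 * (n**2 - 1.0) / (n**2 + 2.0)


def oscillator_response(w, n_density, w0_sq):
    """Right side of Eq. (53) in the same dimensionless units.

    By Eq. (55) the scaled binding term equals w0^2 + 2N/3, and the chosen
    form of the particle density turns the numerator into 2N.
    """
    binding = w0_sq + 2.0 * n_density / 3.0
    return 2.0 * n_density / (binding - w**2 - 1j * w**3)


if __name__ == "__main__":
    for w in (0.3, 1.0, 5.0):
        n = refractive_index(w, n_density=2.0, w0_sq=0.5)
        print(w, n, abs(lorentz_lorenz(n) - oscillator_response(w, 2.0, 0.5)))
```

The printed differences should be at the level of rounding error. The shift of $2N/3$ in Eq. (55) is what the local-field term $4\pi P/3$ of Eq. (51) contributes when Eq. (53) is solved for $n^2$.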

The propagation through a vacuum of an
electrical disturbance of circular frequency
$(3mc^3/2e^2)\omega$ is conveniently described by an electrical
field of the form

$$\exp(i\omega\xi - i\omega\tau), \tag{56}$$

when we use the quantity $(2e^2/3mc^2)\xi$ as a measure
of distance in the direction of propagation.
We suppose this disturbance to be incident on a
medium occupying the infinite half-space from
$\xi=0$ to $\xi=+\infty$. Then the transverse electric
field of the monochromatic wave will be represented
in the medium by the expression


$$2(n+1)^{-1}\exp(i\omega n\xi - i\omega\tau). \tag{57}$$


As a measure of the disturbance in the medium 
we shall take the displacement of the typical 
particle or, what is up to a constant the same 
thing, the polarization: 


$$P = (n^2-1)E/4\pi = (2\pi)^{-1}(n-1)\exp(i\omega n\xi - i\omega\tau). \tag{58}$$


We are interested in following the progress 
through the medium, not of a monochromatic 
wave, but of an initially well defined pulse. We 
shall idealize the incoming electric field as a
delta function, $\delta(\xi-\tau)$, with the property $\delta(u)=0$
when $u \neq 0$,

$$\int_{-\infty}^{+\infty} \delta(u)\,du = 1.$$


We recall the representation of the delta function 
as a superposition of monochromatic waves:

$$\delta(\xi-\tau) = (2\pi)^{-1}\int_{-\infty}^{+\infty} \exp(i\omega\xi - i\omega\tau)\,d\omega. \tag{59}$$


This expression will represent the electric field 
in the vacuum. In the medium the electric 
polarization will accordingly be given as a function
of position and time by the integral

$$P(\xi,\tau) = (2\pi)^{-2}\int_{-\infty}^{+\infty} (n(\omega)-1)\,\exp(i\omega n\xi - i\omega\tau)\,d\omega, \tag{60}$$


where the refractive index is obtained as a 
function of frequency from (54). 

Of the mathematical details of evaluating the 
polarization of the medium from Eq. (60) it is 
enough to say that it is convenient to displace 
the path of integration in the complex plane, and 
to apply the familiar saddle point method of 
approximation. This procedure is sufficiently 
accurate for our purpose when we accept the 
following reasonable conditions: 
(1) We consider a medium of macroscopic dimensions.
With the quantity $(2e^2/3mc^2)$ equal to $1.88\times10^{-13}$ cm, it
follows that the values of the quantities $\xi$ and $\tau$ are of the
order of magnitude of $10^{13}$.

(2) The number, $N$, of particles per volume element
$3\pi(2e^2/3mc^2)^3$ is of the order of, or greater than, unity.

(3) At the depth, $\xi(2e^2/3mc^2)$, in the medium, the disturbance,
if propagated with the speed of light in vacuum,
would arrive at that time, $\tau(2e^2/3mc^3)$, for which $\tau=\xi$; or
more briefly, we shall say that the value of the “light-instant,”
$\tau$, at the depth $\xi$ is given by the equation $\tau=\xi$.
We limit our interest to the disturbance at times ahead of
the light-instant by an amount which, expressed in the
dimensionless measure, $\xi-\tau$, is small in comparison with
$\xi\sim10^{13}$, although this difference may otherwise range all
the way from a value very small in comparison with unity
to a value as great as several orders of magnitude of 10.

(4) The dimensionless measure of natural frequency of
oscillation of the system, $\omega_0$, in order of magnitude is not
large in comparison with unity.
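The orders of magnitude quoted in these conditions can be reproduced with a short calculation. This is a sketch only: the rounded c.g.s. constants for the electron and the function names are assumptions introduced for illustration.

```python
import math

# Rounded c.g.s. values for the electron (assumed for illustration).
E_CHARGE = 4.803e-10   # esu
E_MASS = 9.109e-28     # g
C_LIGHT = 2.998e10     # cm/s

# Unit of length used for the depth: 2e^2 / 3mc^2, in cm.
UNIT_LENGTH = 2.0 * E_CHARGE**2 / (3.0 * E_MASS * C_LIGHT**2)


def dimensionless_depth(depth_cm):
    """Depth expressed as a multiple of 2e^2 / 3mc^2."""
    return depth_cm / UNIT_LENGTH


def particles_per_cm3(n_dimensionless):
    """Number density for N particles per volume element 3 pi (2e^2/3mc^2)^3."""
    return n_dimensionless / (3.0 * math.pi * UNIT_LENGTH**3)


if __name__ == "__main__":
    print(f"unit length = {UNIT_LENGTH:.3e} cm")
    print(f"1 cm of depth corresponds to xi = {dimensionless_depth(1.0):.2e}")
    print(f"N = 1 corresponds to {particles_per_cm3(1.0):.2e} particles per cm^3")
```

A depth of one centimeter corresponds to a dimensionless depth of several times $10^{12}$, and $N$ of order unity corresponds to a density far beyond that of ordinary matter, which is the sense in which the medium of condition (2) is closely packed.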


Under these conditions we obtain an approximate 
representation of the polarization of the medium 
in the form 


$$4\pi^2 P = (\pi N/\xi)^{1/2}\left[(1/2N\xi) - (\xi-\tau) + \cdots\right], \tag{61}$$

for values of $(\xi-\tau)$ in a range of order $1/(N\xi)^{1/2}$
on either side of the light-instant; and

$$4\pi^2 P = (4\pi N/3\xi)^{1/2}\,\{(\xi-\tau)/2N\xi\}^{1/3}\,\exp\!\left[-(3/4)\,(2N\xi/(\xi-\tau))^{1/3}(\xi-\tau)\right]\cos\!\left[(\pi/3) + (3^{3/2}/4)\,(2N\xi/(\xi-\tau))^{1/3}(\xi-\tau)\right], \tag{62}$$

for values of $\xi-\tau$ between the rough limits
$1/(N\xi)^{1/2}$ on the little side and some small fraction
of the quantity $N\xi$ on the big side.

We obtain from expressions (61) and (62) the
following picture of the displacement of the
charges of the medium at the depth $\xi(2e^2/3mc^2)$
before and at the light-instant:



(1) The typical particle receives a displacement before 
the light-instant, thus justifying the use of the descriptive 
term “pre-acceleration’’ even in the case of a medium 
containing many particles. 

(2) The displacement of the typical particle, instead of 
increasing with time according to the simple exponential 
law, $\exp(\tau-\xi)$, derived by Dirac for an isolated particle,
is here before the light-instant an oscillatory function of 
time of much more rapidly increasing amplitude. 

(3) The last full oscillation before the light-instant is in 
the negative sense, that is, opposite to the direction of the 
field in the original pulse. This oscillation is completed
only slightly before the light-instant, so at that time the
displacement of the typical particle is positive but small 
in comparison with its magnitude in the last few preceding 
vibrations. However, the velocity of the particle at a time 
about equal to the light-instant has reached the maximum 
value so far experienced. The condition of approximately
zero displacement and high velocity has a certain corre- 
spondence with the result which would be expected at the 
time of arrival of the disturbance in the absence of the 
phenomenon of pre-acceleration. 

(4) The characteristic time of pre-acceleration may 
reasonably be taken to be measured by the interval 
between the last two nodes of the oscillation, a quantity 
which has the order of magnitude $(N\xi)^{-1/2}(2e^2/3mc^2)$, a very
small fraction of the so-called classical radius of the
charged particle. Another estimate for the time of pre-acceleration
of the same order of magnitude is obtained
by studying the exponentially increasing envelope of the 
oscillatory motion described by Eq. (60). 
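The scaling stated in point (4) can be illustrated numerically. The sketch below takes the oscillatory expression (62) in the form $4\pi^2 P = (4\pi N/3\xi)^{1/2}\{s/2N\xi\}^{1/3}\exp[-(3/4)(2N\xi/s)^{1/3}s]\cos[\pi/3 + (3^{3/2}/4)(2N\xi/s)^{1/3}s]$ with $s = \xi-\tau$; that reading of the exponents is an assumption, as are the function names and sample values. Eq. (62) is only approximate very close to the light-instant, so the numbers indicate scaling rather than exact node positions.

```python
import math


def phase(s, n_density, xi):
    """Argument of the cosine in Eq. (62) at a time s = xi - tau before the light-instant."""
    return math.pi / 3.0 + (3.0**1.5 / 4.0) * (2.0 * n_density * xi / s) ** (1.0 / 3.0) * s


def log_envelope(s, n_density, xi):
    """Natural logarithm of the non-oscillatory factor of Eq. (62).

    Logarithms are used because the factor itself underflows for large s.
    """
    prefactor = 0.5 * math.log(4.0 * math.pi * n_density / (3.0 * xi))
    power = math.log(s / (2.0 * n_density * xi)) / 3.0
    decay = -0.75 * (2.0 * n_density * xi / s) ** (1.0 / 3.0) * s
    return prefactor + power + decay


def node_time(k, n_density, xi):
    """Value of s at the k-th zero of the cosine, counted back from the light-instant (k = 0 is nearest)."""
    target = math.pi / 2.0 + k * math.pi - math.pi / 3.0   # phase = pi/2 + k*pi
    return (4.0 * target / 3.0**1.5) ** 1.5 / math.sqrt(2.0 * n_density * xi)


if __name__ == "__main__":
    for n_density, xi in ((1.0, 1e13), (4.0, 1e12)):
        gap = node_time(1, n_density, xi) - node_time(0, n_density, xi)
        print(n_density, xi, gap, gap * math.sqrt(n_density * xi))
```

In this sketch the interval between the two nodes nearest the light-instant, multiplied by $(N\xi)^{1/2}$, comes out the same for both sample media, and the envelope grows as the light-instant is approached, in line with points (2) and (4).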


From the tentative conception that the classical 
radius of a charged particle defines a region 
within which disturbances are propagated with 
a speed faster than the velocity of light, it would 
have appeared reasonable to expect in a very 
dense medium a macroscopic velocity of propa- 
gation significantly greater than the normal 
limiting value. If this were the case, the interval 
of pre-acceleration, $\xi-\tau$, would have increased in
proportion to the depth, $\xi$, and would have been
appreciable in comparison with $\xi$. In contrast,
we have now found that the characteristic time 
of pre-acceleration not only decreases slowly with 
depth in a dense medium, but also is an exceedingly
small fraction of the value obtained by
Dirac for the case of a single particle. We con- 
clude that it is misleading to attribute the phe- 
nomenon of pre-acceleration to an abnormal 
velocity of light or to a failure of the usual 
conceptions of space-time in the immediate 
neighborhood of a charged particle. We are 
therefore obliged to look to other terms for a 
suitable way to describe the phenomenon. 


PRE-ACCELERATION AS WITNESS TO THE 
INTERACTION OF PAST AND FUTURE 


Pre-acceleration and the force of radiative re- 
action which calls it forth are both departures 
from that view of nature for which one once 
hoped, in which the movement of a particle at a
given instant would be completely determined 
by the motions of all other particles at earlier 
moments. All thought was excluded of a de- 
pendence of the force experienced by the particle 
upon the future behavior of either that charge 
itself or any other charges. The past was con- 
sidered to be completely independent of the 
future. This idealization is no longer valid when 
we have a particle commencing to move in
anticipation of the retarded fields which have yet 
to reach it from surrounding charges. Still less is 
it a good approximation to the truth in the case 
of an incompletely absorbing system, where we 
have in addition to the normal damping force an 
incident field seen above to depend explicitly 
upon the advanced fields of the individual 
particles, and where we encounter advanced 
effects even more striking than preacceleration. 

The mechanism by which the future affects 
the past is illuminated by considering a system of 
three or more charges in the light of the half- 
advanced, half-retarded fields of the theory of 
action at a distance. The retarded field produced 
by the acceleration of a affects b; the advanced 
field of b sets c in motion; and c generates a 
field, the advanced part of which affects a before 
the moment of its acceleration. By an extension 
of this line of reasoning it is apparent that the 
past and future of all particles are tied together 
by a maze of interconnections. The happenings 
in neither division of time can be considered to 
be independent of those in the other. Neverthe- 
less, in a system containing particles sufficient 
in number to provide effective absorption, an 
interference takes place between these forces. 
All the advanced effects are cancelled out except 
those which are comprised in the conventional 
force of radiative reaction; and these are limited 
in their influence to a time of the order of 
magnitude of the quantity $(e^2/mc^3)$. Therefore,
to the extent that the force of radiative reaction 
can be neglected, we have in the case of a 
completely absorbing system the possibility to 
make a cut between past and future; but the 
cleanness of this cut is limited to times of the 
order of $e^2/mc^3$ or greater. Those phenomena
which take place in times shorter than this figure 
require us to recognize the complete interde- 
pendence of past and future in nature, an inter- 
dependence due to an elementary law of inter- 
action between particles which is perfectly 
symmetrical between advanced and retarded 
fields. 

SUMMARY 


Use of action at a distance with field theory as 
equivalent and complementary tools for the 
description of nature has so far been prevented
by inability of the first point of view fully to 
account for the mechanism of radiation. Eluci- 
dation of this process in both theories comes 
from a 23-year old suggestion of Tetrode, that
the absorber may be an essential element of the 
act of emission. A quantitative formulation of 
this idea is given here on the basis of the following 
postulates: (1) An accelerated charge in other- 
wise charge-free space does not radiate energy. 
(2) The fields which act on a given particle arise 
only from other particles. (3) These fields are 
represented by one-half the retarded plus one- 
half the advanced Lienard-Wiechert solutions 
of Maxwell's equations. 

In a system containing particles sufficient in 
number ultimately to absorb all radiation, the 
absorber reacts upon an accelerated charge with 
a field, the advanced part of which, evaluated 
in the neighborhood of the source on the basis of 
these postulates, is found to have the following 
properties: (1) It is independent of the properties 
of the absorbing medium. (2) It is completely 
determined by the motion of the source. (3) It 
exerts on the source a force which is finite, is 
simultaneous with the moment of acceleration, 
and is just sufficient in magnitude and direction 
to take away from the source the energy which 
the act of radiation imparts to the surrounding 
particles. (4) It is equal in magnitude to one-half 
the retarded field minus one-half the advanced 
field of the accelerated charge itself, just the 
field postulated by Dirac as the source of the 
force of radiative reaction. (5) This field compen- 
sates the half-advanced field of the source and 
combines with its half-retarded field to produce 
the full retarded disturbance which is required by 
experience. Radiation is concluded to be a phe- 
nomenon as much of statistical mechanics as of 
pure electrodynamics. A complete correspondence 
is established between action at a distance and 
the usual formulation of field theory in the case 
of a completely absorbing system. In such a 
system the phenomenon of pre-acceleration ap- 
pears as the sole evidence of the advanced effects 
of the theory of action at a distance. Other 
advanced effects appear in the case of an incom- 
pletely absorbing system and are also discussed. 


REVIEWS OF MODERN PHYSICS, VOLUME 21, NUMBER 3, JULY, 1949


Classical Electrodynamics in Terms of Direct
Interparticle Action¹


JOHN ARCHIBALD WHEELER AND RICHARD PHILLIPS FEYNMAN²
Princeton University, Princeton, New Jersey 


“. . . the energy tensor can be regarded only as a provisional means of representing matter.
In reality, matter consists of electrically charged particles. . .”³


INTRODUCTION AND SUMMARY 


MANY of our present hopes to understand the
behavior of matter and energy rely upon the
notion of field. Consequently it may be appropriate to
re-examine critically the origin and use of this century- 
old concept. This idea developed in the study of classical 
electromagnetism at a time when it was considered 
appropriate to treat electric charge as a continuous 
substance. It is not obvious that general acceptance in 
the early 1800’s of the principle of the atomicity of 
electric charge would have led to the field concept in 
its present form. Is it after all essential in classical field 
theory to require that a particle act upon itself? Of 
quantum theories of fields and their possibilities we 
hardly know enough to demand on quantum grounds 
that such a direct self-interaction should exist. Quantum 
theory defines those possibilities of measurement which 
are consistent with the principle of complementarity, 
but the measuring devices themselves after all necessarily
make use of classical concepts to specify the quantity
measured.⁴ For this reason it is appropriate to begin
a re-analysis of the field concept by returning to classical 
electrodynamics. We therefore propose here to go back 
to the great basic problem of classical physics—the 
motion of a system of charged particles under the 
influence of electromagnetic forces—and to inquire 
what description of the interactions and motions is 
possible which is at the same time (1) well defined 
(2) economical in postulates and (3) in agreement with 
experience. 

We conclude that these requirements are satisfied by 
the theory of action at a distance of Schwarzschild,⁵
Tetrode,⁶ and Fokker.⁷ In this description of nature no
direct use is made of the notion of field. Each particle 
moves in compliance with the principle of stationary 


1 Part II of a critique of classical field theory of which another
part here referred to as III appeared in Rev. Mod. Phys. 17, 157
(1945). For related discussion see also R. P. Feynman, Phys. Rev.
74, 1430 (1948).

2 Now at Cornell University, Ithaca, N. Y.

3 A. Einstein, The Meaning of Relativity (Princeton University
Press, Princeton, New Jersey, 1945), second edition, p. 82.

4 See in this connection Niels Bohr, Atomic Theory and the
Description of Nature (Cambridge University Press, 1934) and
chapter by Bohr in Einstein, of the Living Philosophers Series
(Northwestern University, scheduled for 1949).

5 K. Schwarzschild, Göttinger Nachrichten 128, 132 (1903).

6 H. Tetrode, Zeits. f. Physik 10, 317 (1922).

7 A. D. Fokker, Zeits. f. Physik 58, 386 (1929); Physica 9, 33
(1929) and 12, 145 (1932).


action,⁸

$$J = -\sum_a m_a c \int (-da_\mu\, da^\mu)^{1/2} + \sum_{a<b} (e_a e_b/c) \iint \delta(ab_\mu ab^\mu)\,(da_\nu\, db^\nu) = \text{extremum}. \tag{1}$$


All of mechanics and electrodynamics is contained in 
this single variational principle. 
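The role of the delta function in Eq. (1) can be made concrete with a small sketch. It uses the sign convention of the text, in which the three space components enter the invariant $ab_\mu ab^\mu$ with a plus sign and the cotime $ct$ with a minus sign, and it treats the second particle as momentarily at rest. The function names, the sample points, and the rounded value of $c$ are assumptions made for illustration.

```python
import math

C_LIGHT = 2.998e10  # cm/s, rounded


def interval(event_a, event_b):
    """ab_mu ab^mu for two events given as (x, y, z, t) in cm and seconds.

    Space components enter with +, the cotime c*t with -, so the result
    vanishes when the two events can be connected by a light ray.
    """
    dx, dy, dz = (event_a[i] - event_b[i] for i in range(3))
    dct = C_LIGHT * (event_a[3] - event_b[3])
    return dx * dx + dy * dy + dz * dz - dct * dct


def interaction_times(point_a, t_a, point_b):
    """Times at point_b that are coupled to particle a at time t_a.

    Returns (earlier, later), the two roots of ab_mu ab^mu = 0; the
    delta function in the action picks out both on an equal footing.
    """
    r = math.dist(point_a, point_b)
    return t_a - r / C_LIGHT, t_a + r / C_LIGHT


if __name__ == "__main__":
    a, b = (0.0, 0.0, 0.0), (3.0, 4.0, 0.0)
    earlier, later = interaction_times(a, 0.0, b)
    print(earlier, later)
    print(interval((*a, 0.0), (*b, earlier)), interval((*a, 0.0), (*b, later)))
```

Both roots appear symmetrically, which is how a single action principle leads to interactions that are half retarded and half advanced.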

However unfamiliar this direct interparticle treat- 
ment compared to the electrodynamics of Maxwell and 
Lorentz, it deals with the same problems, talks about 
the same charges, considers the interaction of the same 
current elements, obtains the same capacities, predicts 
the same inductances and yields the same physical 
conclusions. Consequently action at a distance must 
have a close connection with field theory. But never 
does it consider the action of a charge on itself. The 
theory of direct interparticle action is equivalent, not 


8 Here the letters a, b, ··· denote the respective particles.
Particle a has in c.g.s. units a mass of $m_a$ grams, a charge of $e_a$
franklins (e.s.u.), and has at a given instant the coordinates

$a^1 = a_1$, $a^2 = a_2$, $a^3 = a_3$: the three space coordinates, measured in cm;

$a^4 = -a_4$: a quantity which has also the dimensions of a length,
and which represents the product of the time coordinate
by the velocity of light, $c$ ($ct$ = “cotime”).

(Note: In comparing formulas here with those in the literature,
note that not all authors use the same convention about signs of
covariant and contravariant components.)

The expression $ab^m$ is an abbreviation for the vector, $a^m - b^m$.
Greek indices indicate places where a summation is understood
to be carried out over the four values of a given label. The argument
$ab_\mu ab^\mu$ of the delta-function thus vanishes when and only
when the locations of the two particles in space-time can be
connected by a light ray. Here the delta-function $\delta(x)$ is the
usual symbolic operator defined by the conditions $\delta(x)=0$ when
$x \neq 0$ and $\int_{-\infty}^{+\infty}\delta(x)\,dx = 1$. In the evaluation of the action, $J$, from
(1), the world lines of the several particles are considered to be
known for all time; i.e., the coordinates $a^m$ are taken to be given
functions of a single parameter, $a$, which increases monotonically
along the world line of the first particle; likewise for b, c, etc.
An arbitrary assumed motion of the particles is not in general in
accord with the variation principle: a small change of the first
order, $\delta a^m(a)$, $\delta b^m(b)$, ··· in the world lines of the particles (this
change here being limited for simplicity to any finite interval of
time, and the length of this time interval later being increased
without limit) produces in general a non-zero variation of the
first order, $\delta J$, in $J$ itself. Only if all such first order variations
away from the originally assumed motion produce no first order
change in $J$ is that originally assumed motion considered to
satisfy the variational principle. It is such motions which are in
this article concluded to be in agreement with experience.
to the usual field theory, but to a modified or adjunct
field theory, in which 


(1) the motion of a given particle is determined by the sum of 
the fields produced by—or adjunct to—every particle other than 
the given particle. 

(2) the field adjunct to a given particle is uniquely determined 
by the motion of that particle, and is given by half the retarded 
plus half the advanced solution of the field equations of Maxwell 
for the point charge in question. 


This description of nature differs from that given by 
the usual field theory in three respects: 


(1) There is no such concept as “the” field, an independent 
entity with degrees of freedom of its own. 

(2) There is no action of an elementary charge upon itself and 
consequently no problem of an infinity in the energy of the 
electromagnetic field. 

(3) The symmetry between past and future in the prescription 
for the fields is not a mere logical possibility, as in the usual 
theory, but a postulational requirement. 


There is no circumstance of classical electrodynamics 
which compels us to accept the three excluded features 
of the usual field theory. Indeed, as regards the question 
of the action of a particle upon itself, there never was 
a consistent theory, but only the hope of a theory. It 
is therefore appropriate now and hereafter to formulate 
classical electrodynamics in terms of the adjunct field 
theory or the theory of direct interparticle action. The 
agreement of these two descriptions of nature with each 
other and with experience assures us that we arrive in 
this way at the natural and self-consistent generalization 
of Newtonian mechanics to the four-dimensional space of 
Lorentz and Einstein.

It is easy to see why no unified presentation of 
classical electrodynamics along these lines has yet been 
given, though the elements for such a description are 
all present in isolated form in the literature. The 
development of electromagnetic theory came before the 
era of relativity. Most minds were not prepared for the 
requirement that interactions should be propagated 
with a certain characteristic speed, still less for the 
possibility of both advanced and retarded interactions. 
Newtonian instantaneous action at a distance with its 
century and a half of successes seemed the natural 


[Figure 1: diagram with the labels “moving pellet due at 5.59 PM,” “(8 AM and 6 PM),” and “(1 PM).”]

FIG. 1. The paradox of advanced effects. Does the pellet strike
X at 6 p.m.? If so, the advanced field from A sets B in motion at
1 p.m., and B moves A at 8 a.m. Thereby the shutter TS is set
in motion and the path of the pellet is blocked, so it cannot
strike X at 6 p.m. If it does not strike X at 6 p.m., then its path
is not blocked at 5.59 p.m. via this chain of actions, and therefore
the pellet ought to strike X.
framework about which to construct a description of
electromagnetism. Attempt after attempt failed.⁹ And
unfortunately uncompleted was the work of Gauss, 
who wrote to Weber on the 19th of March, 1845: “I 
would doubtless have published my researches long 
since were it not that at the time I gave them up I had 
failed to find what I regarded as the keystone, Nil actum 
reputans si quid superesset agendum: namely, the deriva- 
tion of the additional forces—to be added to the 
interaction of electrical charges at rest, when they 
are both in motion—from an action which is propagated 
not instantaneously but in time as is the case with 
light.”¹⁰ These failures and the final success via the
apparently quite different concept of field were taken 
by physicists generally as convincing arguments against 
electromagnetic action at a distance. 

Field theory taught gradually and over seven decades 
difficult lessons about constancy of light velocity, about 
relativity of space and time, about advanced and 
retarded forces, and in the end made possible by this 
circuitous route the theory of direct interparticle 
interaction which Gauss had hoped to achieve in one 
leap. On this route and historically important was 
Liénard¹¹ and Wiechert’s¹² derivation from the equations
of Maxwell of an expression for the elementary field 
generated by a point charge in an arbitrary state of 
motion. With this expression as starting point Schwarzs- 
child arrived at a law of force between two point charges 
which made no reference to field quantities. Developed 
without benefit of the concept of relativity, and 
expressed in the inconvenient notation of the prerela- 
tivistic period, his equations of motion made no appeal 
to the physicists of the time. After the advent of 
relativity Schwarzschild’s results were rederived inde- 
pendently by Tetrode and Fokker. These results are 
most conveniently summarized in Fokker’s principle 
of stationary action of Eq. (1). 

To investigate the consistency of the Schwarzschild-
Tetrode-Fokker theory of direct interparticle inter- 
action and its relation to field theory, we have first to 


9 For a stimulating and instructive if not always objective
account of early researches on field theory and action at a distance
see A. O’Rahilly, Electromagnetics (Longmans, Green and Company,
New York, 1938). See also J. J. Thomson, Report of the
British Assn. for the Adv. of Science for 1885, p. 97; J. C. Maxwell,
Electricity and Magnetism (Oxford University Press, London,
1892), third edition, Chapter 23; R. Reif and A. Sommerfeld,
Encyclopädie der Math. Wiss. 5, Part 2, Section 12 (1902). A
recent very brief account has been given by H. J. Groenewold,
report on Puntladingen en stralingsveld, Ned. Nat. Ver., Amsterdam
(May 1947). M. Schönberg regards field and direct action
not as two equivalent representations of the same force, but as
two different parts of the total force: Phys. Rev. 74, 738 (1948);
Sum. Bras. Math. 1, Nos. 5 and 6 (1946); J. L. Lopes and M.
Schönberg, Phys. Rev. 67, 122 (1945).

10 C. F. Gauss, Werke 5, 629 (1867).

11 A. Liénard, L’Eclairage Electrique 16, pp. 5, 53, 106 (1898).

12 E. Wiechert, Archives Néerland. (2) 5, 549 (1900); Ann. d.
Physik 4, 676 (1901). Compare these derivations in prerelativistic
notation with that given for example by W. Heitler, The Quantum
Theory of Radiation (Oxford University Press, New York, 1944),
second edition, p. 19, or A. Sommerfeld, Ann. d. Physik 33, 668
(1910).
examine in the next section the paradox of advanced
interactions. In the following section is recalled the 
derivation of the equations of motion from the variation 
principle. Next these equations of motion are shown to 
satisfy the principle of action and reaction as generalized 
to the non-instantaneous forces of a relativistic theory 
of action at a distance. In a subsequent section the 
corresponding formulation of the laws of conservation 
of energy and momentum is given. Finally the con- 
nection is established between these conservation laws 
and the field-theoretic description of a stress-energy 
tensor defined throughout space and time. 


THE PARADOX OF ADVANCED ACTIONS 


The greatest conceptual difficulty presented by the
theory of direct interparticle interaction is the circumstance
that it associates with the retarded action of a
on b, for example, an advanced action of b on a. A
description employing retarded forces alone would
violate the law of action and reaction or, in mathematical
terms, could not be derived from a single
principle of stationary action.

Advanced actions appear to conflict both with 
experience and with elementary notions of causality. 
Experience refers not to the simple case of two charges, 
however, but to a universe containing a very large 
number of particles. In the limiting case of a universe 
in which all electromagnetic disturbances are ultimately 
absorbed it may be shown¹ that the advanced fields
combine in such a way as to make it appear—except for 
the phenomenon of radiative reaction—that each 
particle generates only the usual and well-verified 
retarded field. It is only necessary to make the natural 
postulate that we live in such a completely absorbing 
universe to escape the apparent contradiction between 
advanced potentials and observation. 

In a universe consisting of a limited number of 
charged particles advanced effects occur explicitly. It 
is no objection if the character of physics under such 
idealized conditions conflicts with our experience. It is 
only required that the description should be logically 
self-consistent. In particular in analyzing the behavior 
of an idealized universe containing only a few particles 
we cannot introduce the human element as we know it 
into the systems under study. To do so would be to 
assume tacitly the possibility of a clean cut separation 
between the effects of past and future. This possibility 
is denied in a description of nature in which both 
advanced and retarded effects occur explicitly. 

The apparent conflict with causality begins with the
thought: If the present motion of a is affected by the
future motion of b, then the observation of a attributes
a certain inevitability to the motion of b. Is not this
conclusion in direct conflict with our recognized ability
to influence the future motion of b?

All essential elements of the general paradox appear
in the following idealized example: Charged particles a
and b are located in otherwise charge-free space at a
distance of 5 light-hours. A clockwork mechanism is
set to accelerate a at 6 p.m. Thereby b will be affected,
not only at 11 p.m. via retarded effects, but also at
1 p.m. via advanced forces. This afternoon motion will
cause a to suffer a premonitory movement at 8 a.m.
Seeing this motion in the morning, we conclude the
clockwork will go off in the evening. We return to the
scene a few seconds before 6 p.m. and block the
clockwork from acting on a. But then why did a move
in the morning?
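
The clock times in this example follow from adding or subtracting the 5-hour light travel time at each step. A minimal sketch of that bookkeeping is given here; the function name, the dictionary keys and the 24-hour clock convention are illustrative assumptions, not part of the original text.

```python
LIGHT_TRAVEL_HOURS = 5  # separation of a and b in light-hours (from the example)


def paradox_timeline(kick_hour=18, delay=LIGHT_TRAVEL_HOURS):
    """Return the hour (24-hour clock) of each event in the idealized example."""
    b_retarded = kick_hour + delay      # retarded effect of a on b
    b_advanced = kick_hour - delay      # advanced effect of a on b
    a_premonition = b_advanced - delay  # advanced effect of b's afternoon motion on a
    return {
        "a accelerated": kick_hour,
        "b moved (retarded)": b_retarded,
        "b moved (advanced)": b_advanced,
        "a premonitory movement": a_premonition,
    }


timeline = paradox_timeline()
for event, hour in timeline.items():
    print(f"{event}: {hour:02d}:00")
```

With the default arguments the four events fall at 18:00, 23:00, 13:00 and 08:00, matching the 6 p.m., 11 p.m., 1 p.m. and 8 a.m. quoted above.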

To formulate the paradox acceptably, we have to
eliminate human intervention. We therefore introduce
a mechanism which saves charge a from a blow at
6 p.m. only if this particle performs the expected
movement at 8 a.m. (Fig. 1). Our dilemma now is this:
Is a hit in the evening or is it not? If it is, then it
suffered a premonitory displacement at 8 a.m. which
cut off the blow, so a is not struck at 6 p.m.! If it is
not bumped at 6 p.m. there is no morning movement
to cut off the blow and so in the evening a is jolted!


[Figure 2. Labels recoverable from the plot: "Speed of shutter
during day as function of impulse a received at 8 a.m. and hence
as dependent on position of shutter at 5.59 p.m."; "Displacement
of shutter at 5.59 p.m."; "Shutter open at 5.59 p.m."; "Displacement
of shutter proportional to its speed"; "Speed of moving shutter
during day."]

FIG. 2. Analysis and resolution of the paradox of advanced
effects. The action of the shutter on the pellet (the interaction
of past and future) is continuous (dashed line in diagram) and
the curves of action and reaction cross. See text for physical
description of solution.

To resolve, we divide the problem into two parts:
effect of past of a upon its future, and of future upon
past. The two corresponding curves in Fig. 2 do not
cross. We have no solution, because the action of the
shutter on the pellet, of the future on the past, has been
assumed discontinuous in character.

The paradox, and the case it presents against ad- 
vanced potentials, evidently depends on the postulate 
that discontinuous forces can exist in nature. From a 
physical point of view we are led to make just the 
contrary assumption, that the influence of the future 
upon the past depends in a continuous manner upon 
the future configuration. 

Our general assumption about continuity is explicitly 
verified in the present case. The action of shutter on 
pellet is not discontinuous. The pellet will strike the 
point S a glancing blow if the shutter lies only part way 
across its path (dashed curve in Fig. 2).

Of the problem of influence of future upon past, and
past upon future, we now have in Fig. 2 a self-consistent
solution: Charge a by late afternoon has moved a
very slight distance athwart the path of the pellet.
Thus one second before 6 p.m. it receives a glancing
blow in the counter-clockwise sense and at 6 p.m.
a stronger acceleration in the clockwise direction. The
accelerations received by a at these two moments
are by electromagnetic interaction transmitted in reduced
measure to b at 1 p.m. and back from b in yet
greater attenuation to a. Thus this particle receives
one second before 8 a.m. a certain counter-clockwise
impulse and at 8 a.m. an opposite impulse. The net
rotational momentum imparted to the lever is clockwise.
It carries the point S in the course of 10 hours
the necessary distance across the path of the pellet.
The chain of action and reaction is completed. The
paradox is resolved.
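
The role of continuity in this resolution can be illustrated with a toy fixed-point problem. The sketch below is only a caricature under stated assumptions: the shutter displacement x at 5.59 p.m. is measured in units where x = 1 blocks the pellet completely, the blow strength is a made-up function of x, and the morning impulse (hence the displacement accumulated during the day) is taken proportional to the blow through a hypothetical constant `gain`. None of these functional forms comes from the paper.

```python
def blow_continuous(x):
    """Blow strength (1 = full, 0 = none) falling off smoothly as the shutter closes."""
    return min(1.0, max(0.0, 1.0 - x))


def blow_discontinuous(x):
    """All-or-nothing blow: full strength until the shutter is completely closed."""
    return 1.0 if x < 1.0 else 0.0


def self_consistent_displacement(blow, gain, tol=1e-9):
    """Bisect x - gain*blow(x) on [0, max(gain, 1)]; return the root, or None if
    the bisection ends on a jump instead of a genuine solution."""
    def mismatch(x):
        return x - gain * blow(x)

    lo, hi = 0.0, max(gain, 1.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x if abs(mismatch(x)) < tol else None


x_star = self_consistent_displacement(blow_continuous, gain=2.0)
no_solution = self_consistent_displacement(blow_discontinuous, gain=2.0)
print(x_star, no_solution)
```

In this toy model the continuous response has a self-consistent displacement (the two curves cross at x = gain/(1 + gain)), while the all-or-nothing response has none, which mirrors the distinction drawn in the text.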

Generalizing, we conclude advanced and retarded 
interactions give a description of nature logically as 
acceptable and physically as completely deterministic 
as the Newtonian scheme of mechanics. In both forms 
of dynamics the distinction between cause and effect is 
pointless. With deterministic equations to describe the 
event, one can say: the stone hits the ground because 
it was dropped from a height; equally well: the stone 
fell from a height because it was going to hit the ground. 

The distinction between Newtonian and relativistic 
mechanics is one of detail—instantaneous interactions 
versus forces unconfined to a single plane in space time. 
The interrelations between the world lines are more 
complicated than those of Newtonian mechanics, but 
just as definite. There a well-defined division of past 
and present was possible; here these divisions of time 
are inextricably mixed. 


EQUATIONS OF MOTION 


Advanced and retarded forces being accepted on
equal footing in the description of nature, we now
reproduce the derivation from Fokker's action principle
of equations of motion which contain them both. Let
the world line of a typical particle a be altered from
$a^m(\alpha)$ to $a^m(\alpha)+\delta a^m(\alpha)$. Let the abbreviation be introduced,

$$A_m^{(b)}(x) = e_b \int \delta(xb_\mu xb^\mu)\,db_m(\beta) \tag{2}$$

(vector potential of particle b at point x). Also denote
by $a^{m\prime}(\alpha)$ the derivative $da^m(\alpha)/d\alpha$. Then the change
in action produced by the alteration of the world line
of a is

$$\delta J = m_a c \int \{a_\nu'(\delta a^\nu)'/(-a_\mu' a^{\mu\prime})^{1/2}\}\,d\alpha
 + \delta \sum_{b\neq a} (e_a/c) \int A_\nu^{(b)}(a)\,a^{\nu\prime}(\alpha)\,d\alpha,$$

or, by partial integration, and dropping terms at the
limits where the variations $\delta a^m$ vanish,

$$\delta J = \int d\alpha\,\delta a^m(\alpha)\Big\{-m_a c\,(d/d\alpha)[a_m'/(-a_\nu' a^{\nu\prime})^{1/2}]
 + (e_a/c)\sum_{b\neq a}[(\partial A_\mu^{(b)}/\partial a^m) - (\partial A_m^{(b)}/\partial a^\mu)]\,a^{\mu\prime}\Big\}. \tag{3}$$

The condition that $\delta J$ be zero to the first order for
arbitrary $\delta a^m$ is the vanishing of the curly bracket in
(3) for all four values of m, whence result the four
components of the equation of motion for particle a.
Instead of expressing the motion in terms of the arbitrary
parameter $\alpha$, introduce a new parameter,
the "proper cotime," defined in terms of $\alpha$ up to an
unimportant additive constant by the equation
$d(\text{proper cotime})/d\alpha = (-a_\mu' a^{\mu\prime})^{1/2}$, and denote by dots derivatives with
respect to the proper cotime. Introduce also the
abbreviation

$$F_{mn}^{(b)}(x) = \partial A_n^{(b)}(x)/\partial x^m - \partial A_m^{(b)}(x)/\partial x^n \tag{4}$$

(field at point x due to b).¹³ Then the four-vector
equation of motion takes a form,

$$m_a c^2\,\ddot a_m = e_a \sum_{b\neq a} F_{m\mu}^{(b)}(a)\,\dot a^\mu, \tag{5}$$

identical with that of Lorentz, with the following
exceptions: self-actions are explicitly excluded; no fields
act except those adjunct to the other particles; each
such adjunct field is uniquely determined by the
prescription of Eqs. (2) and (4).
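
As a worked illustration of the statement that Eq. (5) is "identical with that of Lorentz," the m = 1 component can be written out for the field of a single particle b. The sketch assumes the identification of field components given in the paper's footnote on $F_{14}$ and $F_{23}$, extended by cyclic permutation ($F_{31} = H_y$, $F_{12} = H_z$), and uses $\gamma = (1-v^2/c^2)^{-1/2}$ as shorthand; neither the cyclic extension nor the symbol $\gamma$ appears explicitly in the original.

```latex
% m = 1 component of Eq. (5); F_{11} = 0 by antisymmetry
m_a c^2\,\ddot a_1
  = e_a\left(F_{12}\,\dot a^2 + F_{13}\,\dot a^3 + F_{14}\,\dot a^4\right)
  = e_a\left(H_z\,\dot a^2 - H_y\,\dot a^3 + E_x\,\dot a^4\right).
% Dots are derivatives with respect to proper cotime, so
% \dot a^4 = \gamma and \dot a^i = \gamma v_i / c  (i = 1, 2, 3):
m_a c^2\,\ddot a_1
  = e_a\,\gamma\left[E_x + \frac{1}{c}\,(\mathbf{v}\times\mathbf{H})_x\right].
```

The bracket is the x-component of the usual Lorentz force per unit charge; the factor $\gamma$ converts a rate per unit proper cotime into a rate per unit cotime.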

Now we come to the well known proof that each
adjunct field satisfies Maxwell's equations when for
charge and current are introduced the appropriate
expressions for the given particles. We employ Dirac's
identity¹⁴

$$(\partial^2/\partial x_\mu \partial x^\mu)\,\delta(xb_\nu xb^\nu) = -4\pi\,\delta(x_1-b_1)\,\delta(x_2-b_2)\,\delta(x_3-b_3)\,\delta(x_4-b_4), \tag{6}$$

multiply both sides by $db_m(\beta) = \dot b_m(\beta)\,d\beta$, integrate with
respect to $\beta$ from $-\infty$ to $+\infty$, and conclude that
$A_m^{(b)}(x)$ satisfies the equation

$$(\partial^2/\partial x_\mu \partial x^\mu)\,A_m^{(b)}(x) = -4\pi\,j_m^{(b)}(x). \tag{7}$$

Here

$$j_m^{(b)}(x) = e_b \int_{-\infty}^{+\infty} \delta(x_1-b_1)\,\delta(x_2-b_2)\,\delta(x_3-b_3)\,\delta(x_4-b_4)\,\dot b_m(\beta)\,d\beta \tag{8}$$

is an abbreviation for the density-current four-vector
at point x due to particle b, an obviously singular
quantity, obeying certain evident conservation relations.
The vector potential (2), in addition to satisfying
the inhomogeneous wave equation (7), has a four-dimensional
divergence which vanishes:

$$(\partial/\partial x_\mu)A_\mu^{(b)}(x) = e_b \int_{-\infty}^{+\infty} \delta'(xb_\nu xb^\nu)\,2\,xb^\mu \dot b_\mu\,d\beta
 = -e_b\,\delta(xb_\nu xb^\nu)\Big|_{\beta=-\infty}^{\beta=+\infty} = 0. \tag{9}$$

We differentiate this zero divergence with respect to $x^m$
and subtract from it (7), obtaining the field equations

$$\partial F_{m\mu}^{(b)}(x)/\partial x_\mu = 4\pi\,j_m^{(b)}(x), \tag{10}$$

equivalent to the usual relations $\operatorname{div}\mathbf{E} = 4\pi\rho$ and
$\operatorname{curl}\mathbf{H} = \dot{\mathbf{E}}/c + 4\pi\mathbf{J}/c$. The other pair of Maxwell's equations
follow identically from the definition (4) of the
F's in terms of the A's.


13 The electric field $E_x$ is $F_{14} = -F_{41}$ and the magnetic field $H_x$
is $F_{23} = -F_{32}$; the vector potential $A_x$ is $A^1 = A_1$ and the scalar
potential is $A^4 = -A_4$. Likewise in Eq. (8) $j^4 = -j_4$ represents the
charge density in franklins (e.s.u.)/cm³ and $j^1 = j_1$ gives (1/c)
times (x-component of the charge flux in franklins/cm² sec.).

14 P. A. M. Dirac, Proc. Roy. Soc. London A167, 148 (1938).
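
The equivalence with the usual Maxwell relations can be made explicit for one component. The following short check takes m = 4 in Eq. (10) and assumes, beyond what the footnote states for the x-components, that $F_{24} = E_y$ and $F_{34} = E_z$ by the same convention; the symbol $\rho$ for the charge density $j^4$ is likewise shorthand.

```latex
% m = 4 component of Eq. (10); F_{44} = 0 and x_i = x^i for i = 1, 2, 3
\frac{\partial F_{4\mu}}{\partial x_\mu}
  = \frac{\partial F_{41}}{\partial x^1}
  + \frac{\partial F_{42}}{\partial x^2}
  + \frac{\partial F_{43}}{\partial x^3}
  = -\operatorname{div}\mathbf{E},
\qquad
4\pi j_4 = -4\pi j^4 = -4\pi\rho
\;\;\Longrightarrow\;\;
\operatorname{div}\mathbf{E} = 4\pi\rho .
```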

The fields (2) are distinguished from all other solutions
of Maxwell's equations by being half the sum of
the advanced and retarded Liénard-Wiechert potentials
of particle b:

$$A_m^{(b)}(x) = e_b \int \frac{db_m/d\beta}{d(xb_\nu xb^\nu)/d\beta}\,\delta(xb_\mu xb^\mu)\,d(xb_\mu xb^\mu)
 = (1/2)R_m^{(b)}(x) + (1/2)S_m^{(b)}(x). \tag{11}$$

Here, for example, R represents the retarded potential

$$R_m^{(b)}(x) = e_b\,\dot b_m/\dot b^\mu bx_\mu, \tag{12}$$

evaluated at that point on the world line of b which
intersects the light cone drawn from the point of
observation into the past:

$$xb_\mu xb^\mu = 0; \qquad x^4 > b^4, \tag{13}$$

and S similarly represents the advanced potential.
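
The half-retarded, half-advanced split can be checked numerically in the simplest case of a charge at rest, where the integral in Eq. (2) reduces to $e_b\int\delta(r^2-\tau^2)\,d\tau$ with $\tau$ the cotime difference between observation point and source point. The sketch below replaces the delta function by a narrow Gaussian; the smearing width `eps`, the step size and the function names are assumptions made for illustration only, and the accuracy quoted applies to distances r of order unity.

```python
import math


def smeared_delta(s, eps):
    """Narrow Gaussian standing in for the Dirac delta function."""
    return math.exp(-0.5 * (s / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))


def static_potential_halves(e_b, r, eps=0.01, step=1e-4):
    """Return (retarded part, advanced part) of e_b * integral delta(r^2 - tau^2) dtau.

    tau > 0 means the source point lies in the past of the observation point
    (retarded contribution); tau < 0 is the advanced contribution.
    """
    n = int(round(2.0 * r / step))  # integrate |tau| from 0 to 2r
    retarded = 0.0
    advanced = 0.0
    for k in range(n):
        tau = (k + 0.5) * step
        retarded += smeared_delta(r * r - tau * tau, eps) * step
        advanced += smeared_delta(r * r - (-tau) ** 2, eps) * step
    return e_b * retarded, e_b * advanced


ret_part, adv_part = static_potential_halves(e_b=1.0, r=2.0)
print(ret_part, adv_part, ret_part + adv_part)
```

Each part comes out close to $e_b/2r$, so that their sum reproduces the Coulomb value $e_b/r$, in line with Eq. (11) when the retarded and advanced potentials coincide.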

By way of illustration of these results in familiar
cases consider first the case of a point charge, b, at rest
at the origin. Then retarded and advanced fields are
identical, all components of the four-potential vanish
except the last, $\dot b^4 = d(\text{cotime})/d(\text{proper cotime}) = 1$,
$bx_4 = b_4 - x_4 = x^4 - b^4$ = elapsed cotime = distance to point
of observation = r, and the scalar potential has the
familiar value $e_b/r$. Next, in the case of a slowly moving
point charge, it similarly follows that
$A^m = e_b(\dot b^m/2r)_{\rm ret} + e_b(\dot b^m/2r)_{\rm adv}$. If this point charge is at the same
time being accelerated, then the derived electric field
has at large distances the value
$\mathbf{E} = -e_b(\ddot{\mathbf{b}}_\perp/2r)_{\rm ret} - e_b(\ddot{\mathbf{b}}_\perp/2r)_{\rm adv}$,
where $\ddot{\mathbf{b}}_\perp$ is the component of the three-vector
$\ddot{\mathbf{b}}$ perpendicular to the line r. This result refers
only to the field of the particle in question. In the
idealized case of a universe containing charged particles
sufficient in number to absorb all electromagnetic
disturbances, the advanced fields of the particles of the
absorber will combine with the given field to produce
the full retarded field of experience, $-e_b(\ddot{\mathbf{b}}_\perp/r)_{\rm ret}$, as
shown in III. As a final example consider a fixed linear
conductor past any point of which flow per second i/e
particles of charge e. The interval of cotime between
the kth and the (k+1)st particle is ce/i. The coordinates
of the kth particle are

$$k^m(\gamma) = s^m(\gamma) \qquad (m = 1, 2, 3)$$
$$k^4(\gamma) = s^4(\gamma) + kce/i \qquad (k = -\infty, \cdots, -1, 0, 1, \cdots) \tag{14}$$

where $s^m(\gamma)$ is the parametric representation of the
curve of the wire. The four-potential at a point of
observation an appreciable distance from the wire is
obtained by summing over all the particles or equivalently,
because of the close spacing of the charges, by
integrating over k:

$$A^m(x) = e \int\!\!\int \delta[r^2(\gamma) - (ct - s^4(\gamma) - kce/i)^2]\,dk\,(ds^m(\gamma)/d\gamma)\,d\gamma
 = \begin{cases} (i/c)\int ds^m(\gamma)/r(\gamma) & \text{for } m = 1, 2, 3 \\ e\int dk/r & \text{for } m = 4. \end{cases} \tag{15}$$

Here $r(\gamma)$ is the magnitude of the vector $x(\gamma), y(\gamma), z(\gamma)$
which runs from the point $\gamma$ of the curve to the point
of observation. The scalar potential of Eq. (15) will
normally be compensated wholly or in part by contributions
from opposite charges at rest and need not be
considered here. From the vector potential follows an
expression for the magnetic field

$$\mathbf{H} = \operatorname{curl}\mathbf{A} = (i/c)\int d\mathbf{s}\times\mathbf{r}/r^3, \tag{16}$$

identical with that due to Ampère.
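
Equation (16) can be checked against the textbook field of a long straight wire, $H = 2i/c\rho$ at perpendicular distance $\rho$ in Gaussian units. The sketch below evaluates the line integral with a midpoint rule; the geometry (wire along the z axis, observation point opposite its midpoint), the function name and the numerical parameters are assumptions chosen for illustration.

```python
import math


def field_of_straight_wire(current, c, distance, half_length, n_steps=200000):
    """Magnitude of H = (i/c) * integral of ds x r / r^3 for a straight wire.

    The wire runs along z from -half_length to +half_length; the field is
    evaluated at perpendicular distance `distance` from the wire's midpoint.
    """
    dz = 2.0 * half_length / n_steps
    total = 0.0
    for k in range(n_steps):
        z = -half_length + (k + 0.5) * dz
        r = math.hypot(distance, z)
        # |ds x r| = dz * distance, since only the perpendicular part of r counts
        total += distance * dz / r ** 3
    return current * total / c


H = field_of_straight_wire(current=1.0, c=1.0, distance=1.0, half_length=1000.0)
print(H)
```

For a wire much longer than the observation distance the result approaches $2i/c\rho$, the familiar Biot-Savart value.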

To go further in deriving well known results would be
pointless. Adequate textbooks exist. They treat well
defined problems of electromagnetism, where there is
no compelling reason to consider a particle to act on
itself. Thus all their analyses are immediately translatable
into terms of the present modified or adjunct
field theory. However, this point of view is mathematically
identical with that of action at a distance. Consequently
the theory of direct interparticle action, far
from attempting to replace field theory, joins with field
theory to provide the science of electromagnetism with
additional techniques of mathematical analysis and to
facilitate deeper physical insight. The rest of this
article may illustrate how the two points of view join
hands to elucidate in four-dimensional mechanics the
principle of action and reaction and the laws of conservation
of momentum and energy.


ACTION AND REACTION 


Laws of conservation of angular momentum, energy
and linear momentum are well known to exist in any
theory for which the equations of motion are derivable
from an action principle which is invariant with respect
to rotation, translation, or displacement of the time
coordinate.¹⁵ Thus Fokker¹⁶ has derived an energy-momentum
conservation principle for an idealized
situation in which there are only two particles, of which
a acts on b via purely retarded forces. The present
treatment is the natural generalization of Fokker's
analysis to the case of a theory which is symmetric
between every pair of particles and which is based on
the action principle (1). It will be sufficient to prove
the conservation law for a single pair of particles in
order to see the corresponding result for a system of
particles.

For the typical particle a let the four-vector of energy
and comomentum be denoted by

$$mc\,\mathbf{v}\,(1-v^2/c^2)^{-1/2} = \{G^1 = G_1,\; G^2 = G_2,\; G^3 = G_3\},$$
three space components of the kinetic comomentum
(velocity of light times kinetic momentum: expressible
in energy units);

$$mc^2(1-v^2/c^2)^{-1/2} = G^4 = -G_4,$$
kinetic energy plus rest-mass energy.
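
A small numerical illustration of this four-vector: with the metric $g_{11} = g_{22} = g_{33} = 1 = -g_{44}$ used in this paper, the invariant $G_\mu G^\mu$ should equal $-(mc^2)^2$ for any velocity. The function names and the choice of units (m = c = 1 in the example call) are assumptions for the sketch.

```python
import math


def kinetic_comomentum_and_energy(mass, velocity, c):
    """Return (G1, G2, G3, G4): c times the kinetic momentum, then the energy."""
    v_squared = sum(v * v for v in velocity)
    gamma = 1.0 / math.sqrt(1.0 - v_squared / (c * c))
    comomentum = tuple(mass * c * gamma * v for v in velocity)
    energy = mass * c * c * gamma
    return comomentum + (energy,)


def invariant_square(G):
    """G_mu G^mu with g11 = g22 = g33 = 1 = -g44."""
    return G[0] ** 2 + G[1] ** 2 + G[2] ** 2 - G[3] ** 2


G = kinetic_comomentum_and_energy(mass=1.0, velocity=(0.6, 0.0, 0.0), c=1.0)
print(G, invariant_square(G))
```

All four components carry energy units, which is the reason for working with comomentum (c times momentum) instead of momentum itself.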


Then the change in kinetic comomentum and energy
in the interval of proper cotime, $d\alpha$, on account of the
action of particle b follows directly from the equations
of motion (5) and the expressions (4) and (2) for the
force coefficients:

$$dG_m^{(a)}(\alpha) = m_a c^2\,\ddot a_m\,d\alpha = e_a e_b\,da^\mu
 \Big[(\partial/\partial a^m)\int db_\mu - (\partial/\partial a^\mu)\int db_m\Big]\,\delta(ab_\nu ab^\nu). \tag{17}$$

We carry out the differentiations with respect to the
coordinates a and add to the result the following zero
quantity

$$e_a e_b\,da_m \int_{-\infty}^{+\infty} (d/d\beta)\,\delta(ab_\nu ab^\nu)\,d\beta, \tag{18}$$

thus finding for the impulse

$$dG_m^{(a)}(\alpha) = 2e_a e_b \int_{\beta=-\infty}^{\beta=+\infty} \delta'(ab_\nu ab^\nu)\,
 (ab_m\,da^\mu db_\mu - db_m\,da^\mu ab_\mu - da_m\,db^\mu ab_\mu). \tag{19}$$

In this expression the integrand is changed in sign but
unaltered in magnitude by an interchange of the roles of
particles a and b.
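
The antisymmetry of the bracket in Eq. (19) under interchange of a and b is easy to confirm numerically. The sketch below assumes the convention $ab^m = a^m - b^m$ and the metric $g_{11} = g_{22} = g_{33} = 1 = -g_{44}$; the helper names and the random test vectors are illustrative only. The factor $\delta'(ab_\nu ab^\nu)$ is left out because $ab_\nu ab^\nu$ is itself unchanged by the interchange.

```python
import random

METRIC = (1.0, 1.0, 1.0, -1.0)  # g11 = g22 = g33 = 1 = -g44


def dot(u, v):
    """Four-dimensional scalar product u_mu v^mu."""
    return sum(g * ui * vi for g, ui, vi in zip(METRIC, u, v))


def lower(u):
    """Covariant components u_m from contravariant components u^m."""
    return tuple(g * ui for g, ui in zip(METRIC, u))


def bracket(a, b, da, db):
    """(ab_m da.db - db_m da.ab - da_m db.ab) for m = 1..4, with ab = a - b."""
    ab = tuple(x - y for x, y in zip(a, b))
    ab_low, da_low, db_low = lower(ab), lower(da), lower(db)
    return tuple(
        ab_low[m] * dot(da, db) - db_low[m] * dot(da, ab) - da_low[m] * dot(db, ab)
        for m in range(4)
    )


random.seed(0)
a, b, da, db = (tuple(random.uniform(-1, 1) for _ in range(4)) for _ in range(4))
forward = bracket(a, b, da, db)
swapped = bracket(b, a, db, da)
print(forward)
print(swapped)
```

The two printed four-vectors are equal and opposite, which is the algebraic content of the equality of action and reaction stated next.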

To the result just obtained we give the following 
obvious interpretation: 


(1) The right hand side of (19), after removal of the integral
sign, represents in terms of the symbolic delta-function the
transfer of impulse or energy to a during the stretch of cotime
$d\alpha$ from effects which originate at b in the cotime interval $d\beta$.

(2) There is no energy or impulse transfer except when the
stretch $d\beta$ of the world line of b is intersected by either the forward
or backward light cone drawn from a: i.e., b acts on a through
both retarded and advanced forces.

(3) The impulse communicated to a over the portion $d\alpha$ of its
world line via retarded forces, for example, from the stretch $d\beta$
of the world line of b is equal in magnitude and opposite in sign
to the impulse transfer from a to b via advanced forces over the
same world line intervals (equality of action and reaction).


15 E. Noether, Göttinger Nachrichten, Math. Phys. Klasse, 235
(1918); E. Bessel-Hagen, Math. Ann. 84, 258 (1921).
16 A. D. Fokker, Zeits. f. Physik 58, 386 (1929).


The relativistic generalization of the Newtonian
principle of action and reaction as just stated is obviously
not identical with the non-relativistic formulation.
In no Lorentz frame of reference are action and reaction
simultaneous. For the instant at which a experiences a
force from b there is not one corresponding time at
which b gets a back reaction, but two instants.¹⁷ Thus
for a given point on the world line of a we can make
two statements about the transfer of energy (or
impulse) from b. Each statement refers to a single one
of the two parts of the total transfer. It is evidently
reasonable that the law of action and reaction should
have this Jacob's ladder character in 4-dimensional
space-time.


ENERGY AND MOMENTUM OF INTERACTION 


Considering two isolated particles a and b, we
immediately conclude from the law of action and
reaction as just stated the constancy in time of the
total energy and comomentum four-vector

$$G_m(\alpha,\beta) = m_a c^2\,\dot a_m(\alpha) + m_b c^2\,\dot b_m(\beta)
 + 2e_a e_b\Big(\int_{\alpha}^{+\infty}\!\!\int_{-\infty}^{\beta} - \int_{-\infty}^{\alpha}\!\!\int_{\beta}^{+\infty}\Big)\,\delta'(ab_\nu ab^\nu)\,
 (ab_m\,da^\mu db_\mu - db_m\,da^\mu ab_\mu - da_m\,db^\mu ab_\mu) = \text{(constant)}. \tag{20}$$

In the case of more particles we have a corresponding
expression with a kinetic term for each individual
particle and an interaction term for each pair of charges.
Thus $G_m$ becomes a function of as many parameters
$\alpha, \beta, \gamma, \cdots$ as there are particles. To prove constancy
with respect to a given parameter, such as $\alpha$, we have
only to differentiate (20) and insert for $m_a c^2\,\ddot a_m(\alpha)$ the
quotient $dG_m^{(a)}(\alpha)/d\alpha$ obtained from (19).

Evidently we have in (20) what may be called a
many-time formulation of the conservation laws, derived
of course from the equations of motion, but from
which conversely the equations of motion are derivable
with equal ease.

The interpretation of the double integral in (20) as
an interaction energy is obvious in the case of two
stationary charges separated by a distance R. Thus
by integration we find for $G^4$ the familiar result
$m_a c^2 + m_b c^2 + e_a e_b/R$.
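
The integration behind this static result can be sketched explicitly. The steps below assume the conventions used elsewhere in the paper ($ab^m = a^m - b^m$, $g_{44} = -1$, proper cotime equal to cotime for a charge at rest) and evaluate the constant $G_4$ at $\alpha = \beta = 0$; primes on $\alpha'$, $\beta'$ mark integration variables and $\tau = \alpha' - \beta'$ is shorthand introduced here.

```latex
% Charges at rest a distance R apart: a^4 = \alpha', b^4 = \beta'
ab_\nu ab^\nu = R^2 - \tau^2, \qquad
ab_4 = -\tau, \quad da_4 = -d\alpha', \quad db_4 = -d\beta' .
% m = 4 component of the bracket in Eq. (20)
ab_4\,da^\mu db_\mu - db_4\,da^\mu ab_\mu - da_4\,db^\mu ab_\mu
  = (\tau - \tau - \tau)\,d\alpha'\,d\beta' = -\tau\,d\alpha'\,d\beta' ,
\qquad
-\tau\,\delta'(R^2-\tau^2) = \tfrac12\,\frac{d}{d\tau}\,\delta(R^2-\tau^2) .
% First region (\alpha' > 0, \beta' < 0): a strip of width \tau at each \tau > 0
\int_0^{\infty}\! d\alpha' \int_{-\infty}^{0}\! d\beta'\;
  \tfrac12\,\frac{d}{d\tau}\,\delta(R^2-\tau^2)
  = \tfrac12\int_0^\infty \tau\,\frac{d}{d\tau}\,\delta(R^2-\tau^2)\,d\tau
  = -\tfrac12\int_0^\infty \delta(R^2-\tau^2)\,d\tau
  = -\frac{1}{4R} .
% Second region (\alpha' < 0, \beta' > 0): same magnitude, opposite sign (+1/4R),
% and it enters Eq. (20) with a minus sign, so the bracketed difference is -1/2R:
G_4 = -m_a c^2 - m_b c^2 + 2e_a e_b\left(-\frac{1}{2R}\right),
\qquad
G^4 = -G_4 = m_a c^2 + m_b c^2 + \frac{e_a e_b}{R} .
```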

17 L. Page, Am. J. Phys. 13, 141 (1945), has reviewed the complications
which come from comparing action and reaction at the
same time.

In the case of individual moving charges it is sometimes
convenient to add to the idea of kinetic comomentum
and energy $G_m^{(a)}$ the notion of potential
comomentum and energy

$$U_m^{(a)} = e_a \sum_{b\neq a} A_m^{(b)}(a(\alpha)), \tag{21}$$

and total comomentum and energy,

$$P_m^{(a)} = G_m^{(a)} + U_m^{(a)}. \tag{22}$$

In terms of these expressions, the four-vector of energy
and comomentum of the whole system takes the form

$$G_m(\alpha,\beta,\cdots) = \sum_a P_m^{(a)}(\alpha) + \sum_{a<b} 2e_a e_b
 \Big(\int_{\alpha}^{+\infty}\!\!\int_{-\infty}^{\beta} - \int_{-\infty}^{\alpha}\!\!\int_{\beta}^{+\infty}\Big)\,
 \delta'(ab_\nu ab^\nu)\,ab_m\,da^\mu db_\mu. \tag{23}$$

The summation of the potential energies so to speak
counts twice the interaction between each pair of
particles. The double integrals in (23) correct for this
overcount.

From either Eq. (20) or Eq. (23) for the energy of
the system it is clear (see Fig. 3) that the electromagnetic
energy of a finite number of particles is definable
from a knowledge of only a finite stretch of their world
lines. It is also evident that particles which come
together in otherwise charge free space, interact, and
then separate in a regular way, will in the end experience
no net loss of energy to outer space. Both features
of the four-vector $G_m$ are reasonable in the mathematical
description of a physically closed system.


RELATION OF INTERACTION ENERGY TO 
FIELD ENERGY 


In field theory it is customary to attempt to define
throughout space a symmetrical stress energy tensor¹⁸
$T_{mn}(x)$ with the following properties:


(1) The divergence $\partial T_{m\mu}/\partial x_\mu$ vanishes at every place where
there is no particle.

(2) At the location of a typical charge a this divergence becomes
singular in such a way that its integral over a small volume
element containing the charge gives the value of the electromagnetic
force acting on that charge:

$$-da_4 \int\!\!\int\!\!\int_{\text{neighborhood of } a} (\partial T_{m\mu}/\partial x_\mu)\,dx^1 dx^2 dx^3 = m_a c^2\,\ddot a_m\,d\alpha \tag{24}$$

when the integration extends over a region of constant time which
contains a. When the integration proceeds over an arbitrary
space-like region or "surface," $\sigma$, such that no pair of points in
the surface can be connected by a light ray, then the corresponding
statement is

$$m_a c^2\,\ddot a_m\,d\alpha + da_\mu \int\!\!\int\!\!\int_{\text{neighborhood of } a} (\partial T_{m\nu}/\partial x_\nu)\,d\sigma^\mu = 0. \tag{25}$$

Here, if the surface is defined by a parametric representation in
terms of three quantities u, v, w, then
$$d\sigma^4 = (\partial(x^1, x^2, x^3)/\partial(u, v, w))\,du\,dv\,dw$$
with corresponding expressions for the other three components
of $d\sigma$.

(3) For every space-like surface $\sigma$ there is defined a four-vector of
energy and comomentum

$$G_m(\sigma) = \sum_a m_a c^2\,\dot a_m(\alpha) + \int\!\!\int\!\!\int T_{m\mu}\,d\sigma^\mu, \tag{26}$$

which is conserved in the sense that its value is completely
independent of the choice of $\sigma$. Thus consider a change $\delta\sigma$ in the
surface $\sigma$ (i.e., an alteration from $x^m(u, v, w)$ to $x^m + \delta x^m(u, v, w)$)
and the associated alterations $d\alpha, d\beta, \cdots$ in the points where
the respective world lines intersect this surface. Then the change
in $G_m$ is expressible via the theorem of Gauss in terms of an
integral over the volume, $\omega$, comprised between the two surfaces:

$$\delta G_m = \sum_a m_a c^2\,\ddot a_m\,d\alpha + \int\!\!\int\!\!\int\!\!\int (\partial T_{m\mu}/\partial x_\mu)\,d\omega. \tag{27}$$

But the integrand vanishes everywhere except in the immediate
neighborhood of the typical particle, a, and there (writing
$d\omega = da_\mu\,d\sigma^\mu$, and using (25)) we conclude that the contribution
from the integral just cancels out the first term in $\delta G_m$.


18 Typical components are

$T_{11}$, force in positive x-direction across unit area in yz plane
exerted upon medium on negative side of plane by medium
on positive side (equal in the Maxwell theory to
$(8\pi)^{-1}(H_x^2 - H_y^2 - H_z^2 + E_x^2 - E_y^2 - E_z^2)$).

$T_{14}$, velocity of light times energy flux in x direction per cm² of
yz plane and per sec. (Maxwell value $(4\pi)^{-1}(E_y H_z - E_z H_y)$).

$T_{44}$, negative of the energy density (usual expression
$-(8\pi)^{-1}(E^2 + H^2)$).

FIG. 3. Interactions considered in formulating the law of
conservation of momentum and energy. Note that the stretches
of world line from $\alpha_{\rm adv}$ to $\alpha_{\rm ret}$ and from $\beta_{\rm adv}$ to $\beta_{\rm ret}$ completely
determine the value of the energy-momentum four vector $G_m(\alpha, \beta)$.
It is also natural to specify these two world-line segments as
initial conditions in dealing with the two-particle problem.


Is there any choice of the tensor $T_{mn}$ in the adjunct
field theory which will yield for the energy-comomentum
vector $G_m(\sigma)$ of (26) a value identical with the corresponding
vector $G_m(\alpha, \beta, \cdots)$ of the theory of direct
interparticle action? The appropriate tensor may be
constructed when one recalls that the field of a given
particle is to produce changes only in the motions of
the other particles, and that the principle of action and
reaction connects the retarded effects exerted for
example by a on b via the retarded field $(1/2)R_{mn}^{(a)}$
with the advanced effects exerted by b on a via the
advanced field $(1/2)S_{mn}^{(b)}$:

$$T_{mn}(x) = \sum_{a\neq b} (R^{(a)}(x)\ \&\ S^{(b)}(x))_{mn}. \tag{28}$$

Here R and S denote the retarded and advanced
Liénard-Wiechert fields, so that $F_{mn}^{(a)} = (1/2)R_{mn}^{(a)} + (1/2)S_{mn}^{(a)}$.
For a convenient abbreviation we have
adopted the notation

$$(R\ \&\ S)_{mn} = (R_{m\mu}S^{\mu}{}_{n} + S_{m\mu}R^{\mu}{}_{n} + \tfrac12 g_{mn}R^{\mu\nu}S_{\mu\nu})/8\pi \tag{29}$$

with $g_{mn} = 0$ for $m \neq n$ and $g_{11} = g_{22} = g_{33} = 1 = -g_{44}$.
That the tensor $T_{mn}$ of (28) does lead to the energy-comomentum
four-vector (20) of the theory of action
at a distance is proven in the appendix. Here we shall
only establish that the stress energy tensor satisfies the
conditions (1) and (2) (and hence (3)). Thus, we
evaluate the divergence of the typical term in the
tensor of Eq. (28), finding

$$\partial T_{m\mu}/\partial x_\mu = \sum_{a\neq b} \{(S^{(b)\mu\nu}/16\pi)(\partial R^{(a)}_{m\mu}/\partial x^\nu
 + \partial R^{(a)}_{\mu\nu}/\partial x^m + \partial R^{(a)}_{\nu m}/\partial x^\mu)
 + (S^{(b)}_{m\mu}/8\pi)(\partial R^{(a)\mu\nu}/\partial x^\nu)
 + \text{similar term with } S^{(b)} \text{ and } R^{(a)} \text{ interchanged}\}. \tag{30}$$

Here the first three cyclically related terms cancel, as
seen for example from the antisymmetrical representation
of the fields via potentials; and the divergence of
R gives the same charge and current distribution (8)
which appeared in the time-symmetric case. Using
this circumstance, and combining terms, we have

$$\partial T_{m\mu}/\partial x_\mu = \sum_{a\neq b} F^{(b)}_{m\mu}(x)\,j^{(a)\mu}(x)
 = \sum_{a\neq b} F^{(b)}_{m\mu}(x)\,e_a \int \delta(x^1-a^1)\,\delta(x^2-a^2)\,\delta(x^3-a^3)\,\delta(x^4-a^4)\,\dot a^\mu(\alpha)\,d\alpha
 = \sum_a \int \delta(x^1-a^1)\,\delta(x^2-a^2)\,\delta(x^3-a^3)\,\delta(x^4-a^4)\,m_a c^2\,\ddot a_m\,d\alpha, \tag{31}$$

in complete satisfaction of requirements (1) and (2).


TABLE I. Correspondence of principal alternative expressions
for interaction energy in adjunct field theory and in theory of
direct interparticle interaction.

Basic type of field coupling envisaged
    Canonical form: Those partial fields which are reciprocally
        responsible for equality of action and reaction
    Frenkel form: Total (time-symmetric) field adjunct to each
        of the coupled particles
Typical term in stress-energy tensor
    Canonical form: $R^{(a)}\ \&\ S^{(b)}$
    Frenkel form: $F^{(a)}\ \&\ F^{(b)}$
Expression for interaction energy
    Canonical form: Eq. (20)
    Frenkel form: Eq. (20) plus expression (40)
Depends upon
    Canonical form: Finite stretch of the two world lines
    Frenkel form: Shape of the two world lines from
        $t = -\infty$ to $+\infty$
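
The & product of Eq. (29) can be exercised numerically. The sketch below builds the covariant field tensor from three-vectors E and H, forms $(F\ \&\ F)_{mn}$, and compares three components with the Maxwell values quoted in the paper's footnote on typical components of the stress-energy tensor. The function names, the cyclic extension $F_{31} = H_y$, $F_{12} = H_z$ of the stated convention, and the sample field values are assumptions for illustration.

```python
import math

METRIC = (1.0, 1.0, 1.0, -1.0)  # g11 = g22 = g33 = 1 = -g44; list index 3 is "4"


def field_tensor(E, H):
    """Covariant F_mn with F_14 = E_x, F_23 = H_x and cyclic permutations."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[i][3] = E[i]
        F[3][i] = -E[i]
    F[1][2], F[2][1] = H[0], -H[0]
    F[2][0], F[0][2] = H[1], -H[1]
    F[0][1], F[1][0] = H[2], -H[2]
    return F


def amp(R, S):
    """(R & S)_mn of Eq. (29) for covariant tensors R and S."""
    # The metric is diagonal with entries +1 or -1, so raising an index
    # multiplies the component by METRIC[index].
    scalar = sum(
        METRIC[u] * METRIC[v] * R[u][v] * S[u][v] for u in range(4) for v in range(4)
    )
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            mixed = sum(
                METRIC[u] * (R[m][u] * S[u][n] + S[m][u] * R[u][n]) for u in range(4)
            )
            diagonal = 0.5 * METRIC[m] * scalar if m == n else 0.0
            T[m][n] = (mixed + diagonal) / (8.0 * math.pi)
    return T


E = (1.0, 2.0, 3.0)
H = (-2.0, 0.5, 4.0)
F = field_tensor(E, H)
T = amp(F, F)
E2 = sum(x * x for x in E)
H2 = sum(x * x for x in H)
print(T[3][3], -(E2 + H2) / (8.0 * math.pi))
```

With R = S = F the product reduces to the single-field Maxwell tensor, which is why the canonical and Frenkel forms agree whenever retarded and advanced fields coincide, as for charges at rest.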


An alternative choice for the stress energy tensor
which also has the properties (1), (2) and (3) is that
proposed by Frenkel,¹⁹ who was among the first to
stress the notion of fields as always adjunct to specific
particles:

$$T_{mn}{}^{*}(x) = \sum_{a\neq b} (F^{(a)}(x)\ \&\ F^{(b)}(x))_{mn}. \tag{32}$$

Thus the difference between Frenkel's tensor and the
canonical tensor (28) is a quantity

$$T_{mn}{}^{*} - T_{mn} = \sum_{a\neq b} (\tfrac12 R^{(a)} - \tfrac12 S^{(a)})\ \&\ (\tfrac12 R^{(b)} - \tfrac12 S^{(b)}) \tag{33}$$

which has everywhere a zero divergence.

The possibility of more than one expression for the
stress-energy tensor with the same divergence is well
known in the usual single-field formulation of electrodynamics,²⁰
and is not surprising here. However, the
expressions for field energy also turn out to differ
(Table I).

The energy-comomentum four-vector $G_m$ defined by
(26) and (28), and the alternative four-vector $G_m{}^{*}$
defined by (26) and (32), are both ordinarily finite for
a system of point charges. In illustration, note that
near a typical particle a the corresponding field varies
as $1/r^2$, the field of any other particle b is finite, the
volume element is proportional to $4\pi r^2 dr$ and the
integral of (26) converges, yielding for example the
interaction energy $e_a e_b/r_{ab}$ for two stationary point
charges separated by the distance $r_{ab}$. The density of
field energy, while finite, is not positive definite, even
for two particles of the same charge. Also the flow of
energy and momentum may have finite values at a
point in space where the total field, $F^{(a)} + F^{(b)} + \cdots$,
actually vanishes. This result, unexpected from the
point of view of the usual field theory, nevertheless
presents no logical difficulties.


ENERGY OF RADIATION 


The canonical and the Frenkel tensors, which give
the same interaction energy in the case of two charges
which are at rest, give different results for the case of a
single accelerated particle in a completely absorbing
universe. There we have in the neighborhood of the
radiating source $F^{(a)} = (1/2)R^{(a)} + (1/2)S^{(a)}$ and
$F^{(b)} + F^{(c)} + \cdots$ = (sum of advanced fields of absorber particles)
$= (1/2)R^{(a)} - (1/2)S^{(a)}$. For the parts of these
fields which are proportional²¹ to the acceleration of the
charge, and which vary at large distance as 1/r, we
have for $R^{(a)}$ and $S^{(a)}$ respectively a zero value except
for an instant r/c seconds after or before the moment
of acceleration. The corresponding energy flux (Table
II) satisfies in both the Frenkel and the canonical
formulations the law of conservation of energy, but
agrees only in the canonical case with customary ideas
of energy localization. From the standpoint of pure
electrodynamics it is not possible to choose between
the two tensors. The difference is of course significant
for the general theory of relativity, where energy has
associated with it a gravitational mass. So far we have
not attempted to discriminate between the two possibilities
by way of this higher standard.


TABLE II. Energy flux at distance r from accelerated charge for
adjunct field theory in completely absorbing universe.

Time of observation          Form of stress-energy tensor
relative to moment
of acceleration              Canonical        Frenkel                     Maxwell

r/c seconds earlier          no flux          E²/8π towards the source    no flux
r/c seconds later            E²/4π outward    E²/8π outward               E²/4π outward
at other times               no flux          no flux                     no flux


19 J. Frenkel, Zeits. f. Physik 32, 518 (1925). See also J. L.
Synge, Trans. Roy. Soc. Canada 34, 1 (1940) and Proc. Roy. Soc.
London A177, 118 (1940) as well as the discussion of Synge's
treatment in III.

20 See in particular M. H. L. Pryce, Proc. Roy. Soc. London
A168, 398 (1938).


CONCLUSION 


We conclude that the theory of direct interparticle 
action, and the equivalent adjunct field theory, provide 
a physically reasonable and experimentally satisfactory 
account of the classical mechanical behavior of a system 
of point charges in electromagnetic interaction with one 
another, free of the ambiguities associated with the 
idea of a particle acting upon itself. 


APPENDIX 


To compute the integral of the field energy which appears in
(26), we express each field as a superposition of elementary fields
from each infinitesimal range of path $d\alpha$, and the tensor $T_{mn}$ or
$T_{mn}{}^{*}$ as the superposition of parts due to stretches $d\alpha$ of the world
line of a and $d\beta$ of b. We use the notation $T^{\dagger}d\alpha\,d\beta$, $F^{\dagger}d\alpha$ to indicate
each such elementary contribution to T, F, etc. Thus the four-potential
$R_m^{(a)\dagger}d\alpha$ arises from a charge which appears for an instant
at $a(\alpha)$ and disappears at $a(\alpha + d\alpha)$.

The lack of conservation of the charge which generates the 
elementary potential causes the four-divergence of (Tf to equal a 
non-zero scalar, r, 


5(ax,ax") for xi>a‘| 

0 for xi<atf 
o(ax ax") for x*>a! 

0 for coup (34) 
whose integral however satisfies the conservation condition 
S2°r(x, a)\da=0. This circumstance permits some latitude in 
the definition of the elementary field in terms of the potential. 
It will prove useful to adopt the definition 


Ran ®t =9Rn® /ax™—IRm® /dx"— (r)/2)gmn. (35) 


The elementary field is not antisymmetrical in the indices m and 
n, but the normal field Rin =/Rma@tda changes sign of course 
on this interchange of labels. 

The elementary component of the stress-energy tensor is not 
symmetric in its two indices, but its divergence is found by direct 


IRwOt Bry = (8/Axy) Deas, 


=x 7) (x, a) = featyzal 


1 See part III for fuller discussion, 
[Figure 4. Labels: space-like surface considered in evaluation of field energy; region considered in 4-fold integral; time axis.]

FIG. 4. Contribution to canonical expression for field energy
which arises from coupling of retarded field of $a$ and advanced
field of $b$.


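algebra to have the simple value
$$\frac{\partial}{\partial x_p}\left(R^{(a)\dagger}\,\&\,S^{(b)\dagger}\right)_{mp}=\frac{1}{8\pi}\left(-\frac{\partial^2 R^{(a)\dagger\,p}}{\partial x^q\,\partial x_q}\right)\left(S^{(b)\dagger}_{mp}-\frac{s^{(b)}}{2}\,g_{mp}\right)+\frac{1}{8\pi}\left(-\frac{\partial^2 S^{(b)\dagger\,p}}{\partial x^q\,\partial x_q}\right)\left(R^{(a)\dagger}_{mp}-\frac{r^{(a)}}{2}\,g_{mp}\right), \tag{36}$$
where the typical field d'Alembertian has the value
$$-\frac{\partial^2 R_m^{(a)\dagger}}{\partial x^q\,\partial x_q}=4\pi e_a\,\delta(x^1-a^1)\,\delta(x^2-a^2)\,\delta(x^3-a^3)\,\delta(x^4-a^4)\,\dot a_m. \tag{37}$$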


We integrate (36) over a four-dimensional region of the form 
shown in Fig. 4. Of the terms on the right the second vanishes 
throughout this region, and the first gives 


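$$\frac{e_a}{2}\left(S^{(b)\dagger}_{mp}(a)-\frac{g_{mp}}{2}\,s^{(b)}(a)\right)\dot a^p=\frac{e_a}{2}\left(\frac{\partial S_p^{(b)\dagger}}{\partial a^m}-\frac{\partial S_m^{(b)\dagger}}{\partial a^p}-g_{mp}\frac{\partial S_q^{(b)\dagger}}{\partial a_q}\right)\dot a^p=2e_ae_b\left(ab_m\,\dot a_p\dot b^p-\dot b_m\,\dot a_p\,ab^p-\dot a_m\,ab_p\dot b^p\right)\delta'(ab_\mu ab^\mu) \tag{38}$$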


when $b^4>a^4$, and zero otherwise. The four-integral on the left
hand side may be expressed via the theorem of Gauss in the
form
$$-\iiint\left(R^{(a)\dagger}\,\&\,S^{(b)\dagger}\right)_{mp}\,n_p\,d\sigma. \tag{39}$$


Here the integral, which goes over the whole of the three-dimensional
region or "surface" in the figure, contributes only over the
upper region because of the vanishing elsewhere of at least one
of the fields in question. The elementary contributions just
computed we now sum over the world line of $a$ from $-\infty$ to $\alpha$
and over the world line of $b$ from $\beta$ to $\infty$, where $\alpha$ and $\beta$ determine
the points where the world lines of $a$ and $b$ intersect the space-like
surface $\sigma$. We have then only to erase the daggers in (39). The
converse expression, with $R^{(b)}\,\&\,S^{(a)}$, we obtain by interchanging
the roles of $b$ and $a$ in (38) and in the limits of integration. In
this way follows at once the identity of expression (20) for the
energy in the theory of direct interparticle interaction and the
canonical expression (26-28) of the adjunct field theory.

When instead the Frenkel expression (32) is used for the
stress-energy tensor, then there results an increment in the
energy-momentum four-vector given by the expression


Gn*(a, 8) —Gm(ar, 8) = ete ff (bmi 


j , A 8’(ab,ab’) for 
- tbo dnad,bh){ Steer) for oS pe dad 


=aef f +68 for at<bt 


Aa 
=8' for bt>at [ebm 
a covariant which is independent of $\alpha$ and $\beta$ and which has an
interesting relation to the two world lines in question.


at<bi 


da,db“, (40) 
Ordinarily it is assumed that interaction between charges occurs along light cones, that is,
only where the four-dimensional interval $s^2=t^2-r^2$ is exactly zero. We discuss the modifications
produced if, as in the theory of F. Bopp, substantial interaction is assumed to occur over a 
narrow range of s? around zero. This has no practical effect on the interaction of charges which 
are distant from one another by several electron radii. The action of a charge on itself is finite 
and behaves as electromagnetic mass for accelerations which are not excessive. There also 
results a classical representation of the phenomena of pair production in sufficiently strong
fields.


QUANTUM electrodynamics is built from a
classical counterpart that already contains 
many difficulties which remain upon quantiza- 
tion. It has been hoped that if a classical electro- 
dynamics could be devised which would not 
contain the difficulty of infinite self-energy, and 
this theory could be quantized, then the problem 
of a self-consistent quantum electrodynamics 
would be solved. For this reason many successful 
attempts have been made to produce such a 
classical theory. The field equations can be 
made non-linear,¹ the fields produced by or
acting on an electron can be redefined,²,³ or one
may resort to some averaging of the fields over
space or time.⁴ These theories have, however,
met with considerable difficulties when an at- 
tempt has been made to quantize them. In this 
paper a consistent classical theory is described 
which the author believes can be quantized. 
Some preliminary results of the quantization of 
this theory will be discussed in a future paper. 
Some of the physical ideas of the classical form
of the theory are sufficiently interesting in themselves
to warrant their discussion first in a
separate paper.

¹ M. Born and L. Infeld, Proc. Roy. Soc. London A144,
425 (1935).

² P. A. M. Dirac, Proc. Roy. Soc. London A167, 148
(1938). An excellent discussion of these matters is given
by C. J. Eliezer, Rev. Mod. Phys. 19, 147 (1947).

³ J. A. Wheeler and R. P. Feynman, Rev. Mod. Phys.
17, 157 (1945).

⁴ There are many theories of this nature. The author's
theory is essentially that of F. Bopp, Ann. d. Physik 42,
573 (1942). R. Peierls and H. McManus have developed
a theory in which the electron is pictured as a rigid
distribution of charge in both space and time. The theory can
be shown to be exactly equivalent to the present one, at
least for a class of $f$ functions. Their physical ideas may
offer advantages over the present one in which the function
$f$ is not so directly interpretable. I thank Dr. McManus
for a copy of his thesis. For a summary of another theory
of this type see B. Podolsky and P. Schwed, Rev. Mod.
Phys. 20, 40 (1948). A somewhat different type is that of
N. Rosen, Phys. Rev. 72, 298 (1947).

The potential at a point in space at a given
time depends on the charge at a distance $r$ from
the point at a time previous by $t=r$ (taking the
speed of light as unity). Speaking relativistically,
interaction occurs between events whose
four-dimensional interval, $s$, defined by $s^2=t^2-r^2$,
vanishes. There results, however, an infinite
action of a point electron on itself. The present
theory modifies this idea by assuming that 
substantial interaction exists as long as the 
interval s is time-like and less than some small 
length, $a$, of order of the electron radius. When $t$
is large, since $\Delta(s^2)=2t\,\Delta t$, this means a spread in
the time of arrival of a signal of amount of
order $a^2/2t$. For charges separated by many
electron radii there is, therefore, essentially no 
effect of the modification. For the action of an 
electron on itself, however, there is a considerable 
modification. The result is to reduce the infinite 
self-energy to a finite value. For accelerations 
which are not extreme, the action of an electron 
on itself appears simply as an electromagnetic 
mass. If desired in the classical theory, all the 
mass of an electron may be represented as electro- 
magnetic. (In the quantum theory this cannot 
be done in a reasonable way as the electromagnetic
mass comes out quite small under reasonable
assumptions for $a$.) We have, therefore, a
consistent classical theory which does not dis- 
agree with classical experience. 
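
The estimate of the arrival-time spread can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes units with the speed of light equal to one and an arbitrary cutoff length `a = 1`, and the function name `arrival_spread` is introduced here, not taken from the paper. It solves $t^2-r^2=a^2$ exactly and compares the delay $t-r$ with $a^2/2r$.

```python
import math

def arrival_spread(r, a):
    """Exact delay t - r at which the interval s reaches the cutoff a.

    Solves t**2 - r**2 = a**2 for t > 0 (speed of light taken as unity).
    """
    return math.sqrt(r * r + a * a) - r

a = 1.0  # cutoff length; the unit is an arbitrary choice for this sketch
for r in (1.0, 10.0, 100.0, 1000.0):
    exact = arrival_spread(r, a)
    approx = a * a / (2.0 * r)  # the estimate a^2 / 2t, with t close to r
    print(f"r = {r:7.1f}   exact = {exact:.6e}   a^2/2r = {approx:.6e}")
```

The two columns agree ever more closely as `r` grows, which is the sense in which charges many cutoff lengths apart see essentially unmodified electrodynamics.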

In the remainder of the paper we formulate 
this idea mathematically, and draw one or two 
simple consequences. We then discuss a curious 
feature of this theory. It can give a classical
representation of the phenomena of pair pro- 
duction in sufficiently strong fields. This is of 
interest because the physical ideas may possibly 
be carried over to give a clearer understanding 
of the hole theory of positrons. 

The main result which is to be carried over to 
quantum problems is this: In any process in 
which there is no permanent emission of quanta 
one must assume the field quanta to have a
"density" $g(k_4,K)$ in frequency and wave number
space. This replaces the usual assumption
that the frequency $k_4$ equals the magnitude of
the wave number, $K$, and that the density in
wave numbers $K$ is uniform (corresponding to
$g(k_4,K)=\delta(k_4^2-K^2)$). The properties $g(k_4,K)$
ought to have are discussed more fully below.


MATHEMATICAL FORMULATION 


It is most convenient (but not necessary) to 
formulate these ideas in the language of action 
at a distance. Hence a brief summary of that 
point of view is given here. We start with 
Fokker’s action principle that the action 


$$S=\sum_a m_a\int(da_\mu\,da_\mu)^{1/2}+{\sum_{a,b}}'\,e_ae_b\iint\delta(s_{ab}^2)\,da_\mu\,db_\mu, \tag{1}$$


is an extremum. Here $a_\mu$ represents, for $\mu=1$ to
4, the three space coordinates and the time
coordinate of a particle $a$ of mass $m_a$, charge $e_a$.
We shall later consider them as functions of a
parameter $\alpha$, say. The $b_\mu$ are corresponding
quantities for a particle $b$, etc. The symbol
$x_\mu y_\mu$ means $x_4y_4-x_1y_1-x_2y_2-x_3y_3$ and
$s_{ab}^2=(a_\mu-b_\mu)(a_\mu-b_\mu)$. The $\delta$ is Dirac's delta function.
The integrals are taken over the trajectories
of the particles. The $\sum'$ means the sum over
all pairs $a$, $b$ with $a\neq b$. We consider varying
the path $a_\mu(\alpha)$ of particle $a$. Defining
$$A_\mu^{(b)}(x)=e_b\int\delta(s_{xb}^2)\,db_\mu \tag{2}$$
where $x$ stands for $x_\mu$, a point in space time,
we can write (1) as
$$S=\sum_a m_a\int(da_\mu\,da_\mu)^{1/2}+\sum_{b\neq a}e_a\int A_\mu^{(b)}(a)\,da_\mu.$$
The result of seeking an extremum of this is to
lead in the well-known way to the equations of
motion,
$$m_a\frac{d}{d\tau_a}\left(\frac{da_\mu}{d\tau_a}\right)=e_a\frac{da_\nu}{d\tau_a}\sum_{b\neq a}F^{(b)}_{\mu\nu}(a), \tag{3}$$


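where we can call $F^{(b)}_{\mu\nu}(x)$ the field at $x$ caused
by particle $b$. It is given by
$$F^{(b)}_{\mu\nu}(x)=\partial A^{(b)}_\nu(x)/\partial x_\mu-\partial A^{(b)}_\mu(x)/\partial x_\nu.$$
We have written $d\tau_a=(da_\mu\,da_\mu)^{1/2}$ for the proper
time along the path of $a$.

Since $\Box^2\delta(s_{xb}^2)=4\pi\,\delta(x_1-b_1)\,\delta(x_2-b_2)\,\delta(x_3-b_3)\,\delta(x_4-b_4)$,
where $\Box^2=(\partial/\partial x_\mu)(\partial/\partial x_\mu)$, Eq. (2)
gives
$$\Box^2A^{(b)}_\mu(x)=4\pi e_b\int\delta(x_1-b_1)\,\delta(x_2-b_2)\,\delta(x_3-b_3)\,\delta(x_4-b_4)\,db_\mu, \tag{4}$$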


which is $4\pi$ times the current four-vector of a
point charge $e_b$. Thus $F^{(b)}_{\mu\nu}(x)$ satisfies Maxwell's
equations. But the special solution (2) is not the
usual retarded solution but is rather half the
retarded plus half the advanced solution of
Lienard and Wiechert⁵ (since $\delta(t^2-r^2)=(1/2r)(\delta(t+r)+\delta(t-r))$).
Thus we may write (dots
representing derivatives with respect to $\tau_a$, and
the fields being calculated at the point $x_\mu=a_\mu$),
$$m_a\ddot a_\mu=e_a\dot a_\nu\sum_{b\neq a}\left(\tfrac12F^{(b)}_{\mu\nu\,\mathrm{ret}}+\tfrac12F^{(b)}_{\mu\nu\,\mathrm{adv}}\right). \tag{5}$$


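This can be compared to the usual theory which
just uses retarded effects by writing it in the form
$$m_a\ddot a_\mu=e_a\dot a_\nu\left[\sum_{b\neq a}F^{(b)}_{\mu\nu\,\mathrm{ret}}+\tfrac12\sum_{\mathrm{all}\ b}\left(F^{(b)}_{\mu\nu\,\mathrm{adv}}-F^{(b)}_{\mu\nu\,\mathrm{ret}}\right)-\tfrac12\left(F^{(a)}_{\mu\nu\,\mathrm{adv}}-F^{(a)}_{\mu\nu\,\mathrm{ret}}\right)\right], \tag{6}$$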


as in the paper³ by Wheeler and Feynman. As in
that paper the first term is the retarded field of
other charges, the second term vanishes in a
world where all emitted light is eventually absorbed,⁶
and the third term, depending only on
the motion of $a$, is the force of radiative damping.
Thus (1) is equivalent to (6) and thus satisfactorily
describes the known laws of classical
electrodynamics. There is no self-energy.

⁵ This use of advanced and retarded potentials is really
unnecessary for an understanding of the modifications of
electrodynamics which is the main point of the paper. It
results from the author's desire to start with a principle
of least action, for it is in this form that the transition to
quantum theory can be made.

According to the above, a particle does not
act upon itself, as the term with $a=b$ in the
sum $\sum'_{a,b}e_ae_b\iint\cdots$ in the action has been
omitted. (Radiation resistance is pictured as
an indirect effect of source on absorber and
absorber on source.) The field of each particle
must be kept separate in order to exclude, when
asking for the force on a particle, the field of
the particle itself.

There is no need to do so, but it is an interesting
question to try to reinstate the idea of a
universal field. This requires that a particle be
allowed to act on itself and the term $a=b$ included
in the action sum. This leads immediately
to an infinite self force. This difficulty can be
eliminated if the $\delta(s_{ab}^2)$ is replaced, as Bopp⁴ has
suggested, by some other function $f(s_{ab}^2)$ of the
invariant $s_{ab}^2$, which behaves like $\delta(s_{ab}^2)$ for large
dimensions but differs for small. (We shall discuss
the properties of this function later, but as
an example to keep in mind, consider
$f(s^2)=(1/2a^2)\exp(-|s|/a)$ for $s^2>0$, and $f(s^2)=0$ for
$s^2<0$, with $a$ of order of the electron radius
$e^2/mc^2$.)

We study the consequences of replacing (1)
by the law that $S$ is extremum if
$$S=\sum_a m_a\int(da_\mu\,da_\mu)^{1/2}+\tfrac12\sum_a\sum_b e_ae_b\iint f(s_{ab}^2)\,da_\mu\,db_\mu. \tag{7}$$
The term with $a=b$ may be written
$$\tfrac12e_a^2\iint f(s_{aa'}^2)\,da_\mu\,da_\mu', \tag{8}$$
where $a$ and $a'$ are two points on the world-line
of $a$. The variation problem clearly leads to
$$m_a\ddot a_\mu=e_a\dot a_\nu\left[\sum_{b\neq a}\bar F^{(b)}_{\mu\nu}(a)+\bar F^{(a)}_{\mu\nu}(a)\right], \tag{9}$$
where
$$\bar F^{(b)}_{\mu\nu}(x)=\partial\bar A^{(b)}_\nu(x)/\partial x_\mu-\partial\bar A^{(b)}_\mu(x)/\partial x_\nu \tag{10}$$
and the bar over the field quantities indicates
that they are calculated from the $f$ function
rather than the $\delta$ function. That is,
$$\bar A^{(b)}_\mu(x)=e_b\int f(s_{xb}^2)\,db_\mu. \tag{11}$$

⁶ That the second term vanishes in these circumstances
may be seen as follows. If a source radiates for a time, at
a very long time afterwards the total retarded field vanishes,
for all the light is absorbed. But also the total advanced
field vanishes at this time (for charges are no
longer accelerating and the advanced field exists only at
times previous to their motion). Hence, the difference
vanishes everywhere at this time and, since it is a solution
of Maxwell's homogeneous equations, at all times.

This theory differs from the usual in two respects:
A. There is an extra force $h_\mu=e_a\dot a_\nu\bar F^{(a)}_{\mu\nu}(a)$
on particle $a$ depending only on the motion of $a$.
This we shall study in a moment and show that
it represents inertia. B. The fields of other
particles are given by the curl of a potential but
the potential (11) no longer solves the Maxwell
equations (4). However, since $f(s^2)$ is close to
$\delta(s^2)$ this means that except for particles very
close together nothing is changed very much.
Thus $f(t^2-r^2)$ is large only when $t=r$ is nearly
satisfied, but for large $t$ near $+r$, say, $f(t^2-r^2)=f(2t(t-r))$
so that the function which has
width $a^2$ in its argument $s^2$ has width $a^2/2t$ in
$t-r$. Thus for increasing distances from the
source the potentials satisfy Maxwell's equations
ever more accurately.
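The analog of Eq. (6) becomes
$$m_a\ddot a_\mu=e_a\dot a_\nu\left[\sum_{b\neq a}(\bar F)^{(b)}_{\mu\nu\,\mathrm{ret}}+\tfrac12\sum_{\mathrm{all}\ b}\left(F^{(b)}_{\mu\nu\,\mathrm{adv}}-F^{(b)}_{\mu\nu\,\mathrm{ret}}\right)-\tfrac12\left(F^{(a)}_{\mu\nu\,\mathrm{adv}}-F^{(a)}_{\mu\nu\,\mathrm{ret}}\right)+\bar F^{(a)}_{\mu\nu}\right], \tag{12}$$
where we define $(\bar F)_{\mathrm{ret}}=\bar F+\tfrac12F_{\mathrm{ret}}-\tfrac12F_{\mathrm{adv}}$. Thus
only the $\delta$ part, so to speak, of the fields becomes
retarded. It would not do to replace $F$ by $\bar F$
throughout in (6) for then we could not deduce
that the second term is zero at the source because
it was zero at infinity for it would not
then be a solution of Maxwell's equation in
empty space. The damping term is unaltered. It
plus the self-force can be written $(\bar F)^{(a)}_{\mu\nu\,\mathrm{ret}}$ (see
footnote 5), so in practice one can write simply
$$m_a\ddot a_\mu=e_a\dot a_\nu\sum_{\mathrm{all}\ b}(\bar F)^{(b)}_{\mu\nu\,\mathrm{ret}}.$$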
The effect of the modification in the theory
using retarded fields is therefore to change,
slightly, the field of one particle on another when
they are very close, and to add a self-force $h_\mu$.

We now turn to a study of the self-force $h_\mu$.
This can be calculated directly from the formulae
(10), (11) but a simpler way is from the
action term (8). This term in the action can be
re-expressed approximately if we assume that
the accelerations are not too great. Only values
of $a'$ near $a$ are important. Let us define a
parameter along the path and say $a$ corresponds
to the value $\alpha$ of this parameter, $a'$ to the value
$\alpha'=\alpha+\epsilon$. Assuming $a_\mu'$ not to vary too rapidly
with $\epsilon$ we can approximate $s_{aa'}^2=(a_\mu-a_\mu')(a_\mu-a_\mu')$
by $\epsilon^2(da_\mu/d\alpha)(da_\mu/d\alpha)=\epsilon^2(d\tau_a/d\alpha)^2$.
Likewise $da_\mu\,da_\mu'$ is to sufficient accuracy
$(da_\mu/d\alpha)(da_\mu/d\alpha)\,d\alpha\,d\epsilon$. Thus the self-action term
is approximately $\tfrac12e_a^2\iint f(\epsilon^2(d\tau_a/d\alpha)^2)\cdot(d\tau_a/d\alpha)^2\,d\epsilon\,d\alpha$.
Then calling $\eta=\epsilon(d\tau_a/d\alpha)$ we can write
this as
$$\mu\int(d\tau_a/d\alpha)\,d\alpha=\mu\int(da_\mu\,da_\mu)^{1/2}, \tag{13}$$
where we have set
$$\mu=\tfrac12e_a^2\int_{-\infty}^{\infty}f(\eta^2)\,d\eta. \tag{14}$$

That is, the self-action term to this approximation
represents pure electrodynamic mass. The
term readily combines with $m_a\int d\tau_a$ for the
mass is correctly invariant. We can go further
and assume that originally $m_a$ is zero and all
mass of electrons is electrodynamic, but for
protons this would then not be so.
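
The reduction of the self-action to a mass term can be checked numerically for the simplest motion. The sketch below is only an illustration under stated assumptions: it takes the example function $f(s^2)=(1/2a^2)\exp(-|s|/a)$, sets $e=a=1$, and considers a charge moving uniformly with velocity $v$ (so the small-acceleration condition holds exactly). The helper names are introduced here. For a straight world line, $s^2=(1-v^2)\epsilon^2$ and $da_\mu\,da_\mu'=(1-v^2)\,dt\,dt'$, so the self-action per unit time should equal $\mu(1-v^2)^{1/2}$ with $\mu=e^2/2a$ from (14).

```python
import math

def f(s2, a=1.0):
    """Example cutoff f(s^2) = exp(-|s|/a) / (2 a^2), zero for space-like s^2."""
    if s2 < 0.0:
        return 0.0
    return math.exp(-math.sqrt(s2) / a) / (2.0 * a * a)

def self_action_per_unit_time(v, e=1.0, a=1.0, eps_max=60.0, n=60000):
    """(1/2) e^2 * integral over eps of f(s^2) * (1 - v^2) for uniform velocity v.

    Two points on the line x = v t separated by eps in time have
    s^2 = (1 - v^2) eps^2 and da_mu da'_mu = (1 - v^2) dt dt'.
    """
    g2 = 1.0 - v * v
    h = 2.0 * eps_max / n
    total = 0.0
    for i in range(n):  # midpoint rule over eps in (-eps_max, eps_max)
        eps = -eps_max + (i + 0.5) * h
        total += f(g2 * eps * eps, a) * g2
    return 0.5 * e * e * total * h

mu = 0.5  # Eq. (14) for this f with e = a = 1: mu = e^2 / (2 a)
for v in (0.0, 0.3, 0.6, 0.9):
    print(v, self_action_per_unit_time(v), mu * math.sqrt(1.0 - v * v))
```

The two printed columns should agree, showing the self-action behaving like the action of a free particle of mass $\mu$.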

The function $f(s^2)$ is to be normalized such that
$$\int f(s^2)\,d(s^2)=1. \tag{15}$$
The condition (14) says the range in $\eta$ of $f(\eta^2)$ is
of order $e^2/\mu$, or if $\mu$ is the electron mass, of order
of the electron radius. The function $f(s^2)$ is
chosen so that it is symmetrical near past and
future light cones since any asymmetry drops
out in the form (7) of the action. Other than
these conditions, there are strictly no further
conditions on $f(s^2)$. It is convenient to assume
$f(s^2)$ to be zero if $s^2$ is negative (space-like). It
is also very desirable to have $f(s^2)$ fall rapidly
away from the light cone, rather than oscillate
indefinitely, and to have $f(s^2)$ finite everywhere.
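
As a concrete check of the normalization (15) and the mass formula (14), the following sketch evaluates both integrals for the example function $f(s^2)=(1/2a^2)\exp(-|s|/a)$. The choice $e=a=1$ and the helper names are assumptions made for illustration. The normalization integral is taken in the variable $s$, using $d(s^2)=2s\,ds$.

```python
import math

def f(s2, a):
    """Example cutoff f(s^2) = exp(-|s|/a) / (2 a^2), zero for space-like s^2."""
    if s2 < 0.0:
        return 0.0
    return math.exp(-math.sqrt(s2) / a) / (2.0 * a * a)

def midpoint(func, lo, hi, n=100000):
    """Plain midpoint-rule quadrature of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))

a = 1.0
e = 1.0

# Eq. (15): integral of f over s^2, written as integral of f(s^2) * 2 s ds.
norm = midpoint(lambda s: f(s * s, a) * 2.0 * s, 0.0, 40.0 * a)

# Eq. (14): mu = (1/2) e^2 * integral of f(eta^2) d(eta) over all eta.
mu = 0.5 * e * e * midpoint(lambda eta: f(eta * eta, a), -40.0 * a, 40.0 * a)

print("normalization:", norm)
print("mu:", mu, " expected e^2/(2a):", e * e / (2.0 * a))
```

For this $f$ the mass comes out as $e^2/2a$, so $a$ is of the order of the electron radius $e^2/\mu$, as the text states.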

By taking the Fourier transform of (11), one
can represent the field as a superposition of the
effects of harmonic oscillators in the usual way.
However, the oscillators corresponding to waves
of wave number $k_1$, $k_2$, $k_3$ need not have a
frequency $k_4$ equal to the magnitude of the wave
number. Instead we can take the density of the
oscillators to be $k_4$ times $g(k_\mu k_\mu)\,dk_1dk_2dk_3dk_4$,
where $g$ is defined for positive $k_4$ only, and is
$$g(k_\mu k_\mu)=(1/4\pi^2)\int f(s_{xy}^2)\cos(k_\mu(x_\mu-y_\mu))\,dx_1dx_2dx_3dx_4.$$
It is a function of the invariant $k_\mu k_\mu$ only. The
ordinary case, $f(s^2)=\delta(s^2)$, corresponds to $g(k_\mu k_\mu)=\delta(k_\mu k_\mu)$.
The condition that $f(s^2)$ be finite on
the light cone implies that $g(k_\mu k_\mu)$ can be written
in the form
$$g(k_\mu k_\mu)=\int_0^\infty\left(\delta(k_\mu k_\mu)-\delta(k_\mu k_\mu-\lambda^2)\right)G(\lambda)\,d\lambda. \tag{16}$$
Here $G(\lambda)$ is normalized such that $\int_0^\infty G(\lambda)\,d\lambda=1$,
in view of (15). It is otherwise arbitrary, as
$f(s^2)$ is. The $\lambda$ values for which $G$ must exist
must be large, going up to order $\mu/e^2$.

If $G$ is chosen as $\delta(\lambda-\lambda_0)$ the resulting $f(s^2)$
is (for $s^2\geq0$) the Bessel function, $\lambda_0J_1(\lambda_0s)/s$.
For large $t$, if $s=(t^2-r^2)^{1/2}$, this does not die off
fast with $t-r$, but oscillates with phase varying
as $\lambda_0(t^2-r^2)^{1/2}$. That is, it oscillates with frequency
$\lambda_0(1-r^2/t^2)^{-1/2}$ at a time corresponding to arrival
of signals with velocity $r/t$ and thus in quantum
mechanics would represent arrival of radiated
"particles" of mass $\hbar\lambda_0$. The free emission of
such "particles" is removed in classical theory
by interference among the various values of $\lambda$
if a smooth distribution, $G(\lambda)$, of $\lambda$ is used. This
is required if $f$ is to represent say a function
decaying rather than oscillating (see appendix).
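
The statement about the local oscillation frequency follows from differentiating the phase $\lambda_0(t^2-r^2)^{1/2}$ with respect to $t$. A small numerical sketch makes this explicit; the values `lam0 = 2.0` and `r = 30.0` and the function names are arbitrary choices made here for illustration. It compares a finite-difference derivative of the phase with $\lambda_0(1-v^2)^{-1/2}$ for $v=r/t$, the form familiar as the energy of a particle of rest mass $\lambda_0$ (in units with $\hbar=c=1$) moving with velocity $v$.

```python
import math

def phase(t, r, lam0):
    """Phase lam0 * sqrt(t^2 - r^2) of the oscillating kernel, for t > r."""
    return lam0 * math.sqrt(t * t - r * r)

def local_frequency(t, r, lam0, dt=1e-5):
    """Central-difference estimate of d(phase)/dt at fixed r."""
    return (phase(t + dt, r, lam0) - phase(t - dt, r, lam0)) / (2.0 * dt)

lam0 = 2.0  # illustrative value of lambda_0
r = 30.0    # illustrative distance from the source
for t in (40.0, 60.0, 120.0):
    v = r / t  # velocity of a signal arriving at time t
    print(t, local_frequency(t, r, lam0), lam0 / math.sqrt(1.0 - v * v))
```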

It appears that the quantum mechanical result
is simply this: For processes without permanent
radiation the oscillator density $g$ is to
replace $\delta(k_\mu k_\mu)$. The negative sign in (16) proves
embarrassing (see appendix) if quanta of mass
$\lambda_0$ can be freely radiated, so a wide distribution
in $\lambda$ corresponding to a monotonic $f(s^2)$ is
preferable. As an example, for $f(s^2)=(1/2a^2)\exp(-|s|/a)$
find $G(\lambda)=(3a^2\lambda)(1+a^2\lambda^2)^{-5/2}$.
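
The example pair $f$, $G$ can be tested numerically. The sketch below is an illustration under explicit assumptions, not the paper's own calculation. It assumes that a single value $\lambda$ contributes the kernel $\lambda J_1(\lambda s)/2s$ to $f(s^2)$; the factor $1/2$ is an assumption made here because it is what makes each single-$\lambda$ kernel integrate to one as (15) requires. It evaluates $J_1$ from Bessel's integral so that only the standard library is needed, and it sets $a=1$. All function names are introduced here.

```python
import math

def bessel_j1(x, n=120):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(theta - x sin(theta)) dtheta.

    Midpoint rule with n points; adequate for the range |x| <= 60 used below.
    """
    h = math.pi / n
    return sum(math.cos((k + 0.5) * h - x * math.sin((k + 0.5) * h))
               for k in range(n)) / n

def G(lam, a):
    """Example weight G(lambda) = 3 a^2 lambda (1 + a^2 lambda^2)^(-5/2)."""
    return 3.0 * a * a * lam / (1.0 + a * a * lam * lam) ** 2.5

def f_from_G(s, a, lam_max=60.0, n=6000):
    """(1 / 2s) * int_0^lam_max G(lam) * lam * J_1(lam * s) dlam (assumed kernel)."""
    h = lam_max / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * h
        total += G(lam, a) * lam * bessel_j1(lam * s)
    return total * h / (2.0 * s)

a = 1.0
h = 60.0 / 6000
norm_G = h * sum(G((i + 0.5) * h, a) for i in range(6000))
f_half = f_from_G(0.5, a)
f_one = f_from_G(1.0, a)

print("integral of G:", norm_G)
print("s = 0.5:", f_half, " exp(-s/a)/(2a^2):", math.exp(-0.5) / 2.0)
print("s = 1.0:", f_one, " exp(-s/a)/(2a^2):", math.exp(-1.0) / 2.0)
```

Under these assumptions the superposition reproduces the decaying exponential, which illustrates how a smooth $G(\lambda)$ removes the oscillating tail of a single Bessel kernel.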
The electrostatic potential at a distance $r$
from a stationary charge is, according to (11),
$$A_4=e\int f(t^2-r^2)\,dt. \tag{17}$$
For large $r$, in view of (15), this is readily seen
to be $e/r$. At the origin $r=0$, however, it is finite,
being $e\int_{-\infty}^{\infty}f(t^2)\,dt$ or $2\mu/e$. This has a simple
interpretation if all mass is electromagnetic.
The energy released in bringing a positron and
electron charge together and so canceling out all
external fields is just $2\mu$, the rest mass these
particles have in virtue of their fields. Or put
otherwise, the rest mass particles have is simply
the work done in separating them against their
mutual attraction after they are created. No
energy is needed to create a pair of particles at
the same place. (These ideas do not have direct
quantum counterparts since in quantum theory
all mass does not appear to be electromagnetic
self energy, at least in the same simple way.)
There may be a maximum field of attraction
between two unlike charges at some separation
since, for some functions $f$, the force arising from
(17) vanishes at the origin, and of course again
at infinity.
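
The two limits of (17) can be illustrated for the example function $f(s^2)=(1/2a^2)\exp(-|s|/a)$. The sketch assumes $e=a=1$ and uses a substitution chosen here for convenience: with $s=(t^2-r^2)^{1/2}$, the two branches $t>r$ and $t<-r$ combine to $A_4=(e/a^2)\int_0^\infty e^{-s/a}\,s\,(s^2+r^2)^{-1/2}\,ds$. The function name `potential` is introduced for the example.

```python
import math

def potential(r, e=1.0, a=1.0, s_max=60.0, n=60000):
    """A_4(r) = e * int f(t^2 - r^2) dt for f = exp(-|s|/a) / (2 a^2).

    Evaluated as (e / a^2) * int_0^inf exp(-s/a) * s / sqrt(s^2 + r^2) ds
    with the midpoint rule, truncated at s = s_max * a.
    """
    h = s_max * a / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += math.exp(-s / a) * s / math.sqrt(s * s + r * r)
    return e * total * h / (a * a)

for r in (0.0, 0.5, 2.0, 10.0, 50.0):
    coulomb = float("inf") if r == 0.0 else 1.0 / r
    print(f"r = {r:5.1f}   A_4 = {potential(r):.6f}   e/r = {coulomb:.6f}")
```

With these units the value at the origin is $e/a=2\mu/e$, and the potential approaches $e/r$ once $r$ is many times $a$.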

There remains to discuss a curious point about
the solutions of the least action principle (7)
with the mechanical mass term $m_a$ absent. First
let us study the simple problem of an ordinary
single particle of mass $\mu$ in a potential $A_4$ (caused
by other charges) which depends only on one
coordinate $x$. Call the time $t$, and use this for
the parameter $\alpha$. The action is


$$S=\mu\int(dt^2-dx^2)^{1/2}+\int A_4\,dt. \tag{18}$$
Now suppose the potential $A_4$ is zero outside a
small band in $x$, say $|x|<b/2$ (potential barrier),
and that it is large, positive, and constant within
the region. Consider, in Fig. 1, the paths from a
point 1 to a point 2 which make $S$ a local maximum.
A typical solution is the solid line which
is kinked out of the straight line so as to increase
the time integral of $A_4$. This represents a particle
moving from 1 rapidly toward the barrier,
entering the region of high potential, losing
energy and thus going slower in this region. The
high velocity is regained on passing out of the
region to 2. Slow particles cannot penetrate
the barrier.

FIG. 1. If two points 1, 2 are separated by a high potential
barrier, there are two paths which make action an
extremum. One (solid line) represents passage of a fast
electron. The other (dotted line) has a section reversed in
time and is interpreted as the effective penetration of the
barrier by a slow electron by means of a pair production
at Q and annihilation at P, section PQ representing the
motion of the positron.

But there may be another local maximum.
Consider the path 1PQ2. In the interval PQ the
proper time integral must be taken positively
as can be verified from a study of the derivation
of (13). Now moving the point P upward by $\Delta t$
might be expected to increase the action by over
$2\mu\Delta t$ because the lengths of P1 and PQ are both
increased. On the other hand, the integral $\int A_4\,dt$
is now negative and if $A_4$ exceeds $2\mu$ such a
curve may be a local maximum. Thus for $A_4$
greater than $2\mu$ there is a new way that slow
particles can penetrate the barrier. This is a
classical analog of the Klein Paradox.

How would such a path appear to someone 
whose future gradually becomes past through a 
moving present? He would first see a single 
particle at 1, then at Q two new particles would 
suddenly appear, one moving into the potential 
to the left, the other out to the right. Later at 
P the one moving to the left combines with the 
original particle at 1 and they both disappear, 
leaving the right moving member of the original 
pair to arrive at 2. We therefore have a classical 
description of pair production and annihilation. 
The particle whose trajectory has its proper
time opposed in sign to the true time $t$ (section
PQ) would behave as a particle of opposite
sign, for changing the sign of $db_\mu$ in (7) is equivalent
to changing the sign of $e_b$. This idea that
positrons might be electrons with the proper 
time reversed was suggested to me by Professor 
J. A. Wheeler in 1941. 

The field at $x=\pm b/2$ is infinite. If it is finite
the action (18) does not show such a local
maximum, the sharp corner at P becoming a
cusp which can go indefinitely into the future.
On the other hand, if the correct self-force from
(7) is used instead of the approximation (13), a
path reversal again becomes a possibility. It
is only necessary that the field exceed a critical
value, namely, that maximum value of attraction
of two unlike particles mentioned above. This
field represents a potential of $2\mu$ in a distance of
the order of an electron radius and must be as
great as this to get the pair of newly created
particles apart over the potential barrier of their
mutual attraction. (The actual field required to
produce pairs in quantum mechanics is 137
times weaker. One might ascribe this to a quantum
mechanical penetration of the potential
barrier over a Compton wave-length.)

There are many interesting problems pre- 
sented by these ideas. For example, will pairs 
be produced ad infinitum by the field, or only 
to that extent that we can guarantee that the 
positrons will be annihilated by electrons in 
the future? Again, in a weak field can a large 
number of pairs be created which separate 
slightly in the field (which is insufficient to tear 
the two apart) and thus produce a large polariza- 
tion of that field? It is hoped that an application 
of these ideas to a study of positron hole theory 
will appear in a future paper. 

I should like to thank Professor J. A. Wheeler 
for inoculating me with many ideas without 
which this work would not have been done. 


APPENDIX 


The difficulties in a theory such as the one 
presented here have been discussed by many 
authors. A very brief discussion of them will be 
given in this appendix. 

The first point is that the action S defined in 
(1) or (7) is infinite and meaningless because of 
the infinite extent of the integrals along the path 
of the particles. The principle of extreme action 
which we mean to apply can be more rigorously
defined as follows. Consider any given variation
in paths $\delta a_\mu$ such that $\delta a_\mu\to0$ as $\alpha\to\pm\infty$. Then
define $\delta S$ as the limit of the quantity $\delta S$ calculated
from (7) with this path variation, the limit
being taken as the range of integration passes to
infinity. Then the law of motion shall be $\delta S=0$
for all variations which satisfy $\delta a_\mu\to0$ as $\alpha\to\pm\infty$.
The equations of motion (9) are then consequences
of the action principle, of course, but
not all solutions of these equations satisfy the
principle of least action as defined here. There
are certain runaway solutions of the equations
of motion, such as those discussed by Dirac² in
the case of Eq. (6), in which the kinetic energy
and momentum of a particle increase exponen- 
tially with time. These are excluded for they 
do not satisfy the principle of least action. 

Bopp⁴ has studied in great detail the consequences
of equations of motion. However, he
assumes that the function $f$ acts only at retarded
times. He finds that the radiation resistance of
an oscillator decreases below the normal value
with increasing frequency of the oscillator. However,
if an oscillator is enclosed in a large,
light-tight box the fields at the walls of the box are
effectively unchanged by the use of $f$ rather than
$\delta$. (We are assuming that $f$ decays and does not
oscillate. Below, we discuss the situation if $f$
oscillates.) Hence the energy absorbed by the
walls will not, apparently, decrease with the
increase in oscillator frequency and the radiation
resistance will not keep up with it. In the
modification described in this paper, in such a box,
only the $\delta$-part of $f$ is to be retarded. The radiation
resistance has its normal value at all frequencies,
and the energy lost will all be found
eventually in the walls of the box.

The conservation of quantities like energy 
(and momentum) can be demonstrated directly 
if a theory starts from a principle of least action, 
the form of which action is invariant under a 
change of origin of the time (and space). For the 
action (7) consider the quantity 


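$$g_\mu=\sum_a\left[m_a\dot a_\mu+e_a\sum_b\bar A^{(b)}_\mu(a)\right]_{a_\mu(\alpha_0)}-\sum_a\sum_b e_ae_b\int_{-\infty}^{\alpha_0}\!\!\int_{\beta_0}^{\infty}2(a_\mu-b_\mu)\,f'(s_{ab}^2)\,da_\nu\,db_\nu \tag{19}$$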
where $f'(x)=df(x)/dx$. The points $\alpha_0$, $\beta_0$, ... are an
arbitrary set of points, one chosen on the world-line
of each particle. In virtue of the equations
of motion (9) the derivative of $g_\mu$ with respect to
any of the $\alpha_0$'s is zero. This is a generalization
of the usual conservation of energy. Ordinarily,
we would choose all the $\alpha_0$ such that all $a_4(\alpha_0)$
are equal, and find $g_\mu$ is independent of this
common value of $a_4$. The energy is seen to
consist of a self-energy, an energy caused by the
presence of the potential on each particle, and
(since this last would count each interaction
twice) a further correction to interaction energy.
This is described as a line integral over the paths
of the particles, but since one point is in the
future and the other in the past, the actual
range of integration does not extend beyond the
time during which $b$ could interact with $a$ at $\alpha_0$
and that $a$ could interact with $b$ at $\beta_0$. This is
the way in which energy which is usually spoken
of as being in the field is represented in a theory
of action at a distance between particles. Since
it is an integral only over a limited range, the
energy of motion of the particles is conserved in
the long run. (It is easy to generalize (19) to the
case that paths may reverse themselves in time.)

We now consider the situation in which the
function $f$ oscillates rather than decays. If, as
was discussed by Bopp⁴ and others (e.g.,
B. Podolsky and P. Schwed⁴), $f$ is replaced by a
Bessel function $\lambda_0J_1(\lambda_0s)/s$, the theory corresponds
as we have seen to that of interaction
through ordinary "quanta" minus those corresponding
to a mass $\hbar\lambda_0$. The $f$ function does not
appear as a pure $\delta$-function at large distances,
but another component appears if the frequency
of the source exceeds $\lambda_0$. Thus, a source at high
frequency $\omega$ emits waves of two wave-lengths,
light of wave number $K=\omega$ and "$\lambda_0$-quanta" of
wave number $K=(\omega^2-\lambda_0^2)^{1/2}$. Again Bopp's equations
(using retarded potentials only) show that
the radiative resistance force on a dipole oscillator
of amplitude $x$, frequency $\omega$, is constant at
$2e^2\omega^3x/3c^3$ for $\omega<\lambda_0$ and falls off as $\omega$ exceeds $\lambda_0$,
as $(2e^2x/3c^3)\,[\omega^3-(\omega^2+\tfrac12\lambda_0^2)(\omega^2-\lambda_0^2)^{1/2}]$, remaining
positive, however, for all frequencies. The
decrease at higher frequencies must correspond
to a negative contribution to radiation resistance
accompanying the emission of "$\lambda_0$-quanta." That
is, the $\lambda_0$-quanta behave as though they had
negative energy. That this is so and that it
results in fundamental difficulties may be seen
from a few examples. If through interference the
rate of emission of "$\lambda_0$-quanta" can be enhanced
relative to the rate for the ordinary light quanta,
a net negative radiation resistance will result.
For example, a set of two vertical dipoles oscillating
in phase (at frequency $\omega=2\lambda_0/\sqrt3$), separated
horizontally by one-half the wave-length
of light, and one-fourth the wave-length of the
$\lambda_0$-quanta of the same frequency, shows a negative
net radiation resistance. It would oscillate
with ever-increasing amplitude, the large emission
of negative energy $\lambda_0$-quanta supplying the
increase in energy of the oscillators and the
energy of the light quanta emitted. Again a beam
of $\lambda_0$-quanta passing through a medium containing
damped (energy-absorbing) oscillators
increases in amplitude. The wave of $\lambda_0$-quanta
scattered by the oscillators in the forward direction
which ordinarily interferes destructively
with the incident wave, in this case has a reversed
sign and enhances the incident wave.
(The light scattered forward has the incorrect
wave-length to make an appreciable effect by
interference.) A beam of $\lambda_0$-quanta can be separated
from light of the same frequency by having
the radiation from a source of a given frequency
impinge on a diffraction grating of scattering
centers. The $\lambda_0$-quanta and light quanta will
then be scattered in different directions as they
have different wave-lengths.
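
The frequency dependence of the retarded-only resistance and the geometry of the two-dipole example can be checked with simple arithmetic. The sketch below is only an illustration: it sets $\lambda_0=1$, drops the common factor $2e^2x/3c^3$, and uses function names introduced here. It evaluates the bracket $\omega^3-(\omega^2+\tfrac12\lambda_0^2)(\omega^2-\lambda_0^2)^{1/2}$ and confirms that at $\omega=2\lambda_0/\sqrt3$ a separation of half a light wave-length is a quarter wave-length of the $\lambda_0$-quanta.

```python
import math

def resistance_factor(omega, lam0):
    """Bracket multiplying 2 e^2 x / 3 c^3 in the retarded-only resistance force."""
    if omega <= lam0:
        return omega ** 3
    return omega ** 3 - (omega ** 2 + 0.5 * lam0 ** 2) * math.sqrt(omega ** 2 - lam0 ** 2)

def quanta_wave_number(omega, lam0):
    """Wave number K = (omega^2 - lam0^2)^(1/2) of the lambda_0-quanta."""
    return math.sqrt(omega ** 2 - lam0 ** 2)

lam0 = 1.0  # illustrative unit for lambda_0
for omega in (0.5, 1.0, 1.5, 3.0, 10.0):
    print(omega, resistance_factor(omega, lam0), omega ** 3)

omega = 2.0 * lam0 / math.sqrt(3.0)
k_light = omega
k_quanta = quanta_wave_number(omega, lam0)
separation = 0.5 * (2.0 * math.pi / k_light)              # half a light wave-length
fraction = separation / (2.0 * math.pi / k_quanta)        # in lambda_0-quanta wave-lengths
print("K_quanta / K_light =", k_quanta / k_light)
print("separation in lambda_0-quanta wave-lengths =", fraction)
```

The bracket stays positive but falls well below $\omega^3$ above threshold, and the last two lines print one-half and one-quarter, matching the spacing quoted in the example.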

What results if instead of using only retarded 
waves for \o-quanta, we start from the least 
action principle and analyze the situation of a 
source enclosed in a box? Then the derivation of 
Eq. (12) is still incomplete as (F),, still contains 
both advanced and retarded components (of 
$\lambda_0$-quanta) at large distances. We could now split
off the advanced parts for $\lambda_0$-quanta just as we
did for light. The resulting equation is just
that used by Bopp, namely all retarded interactions
but negative contribution of $\lambda_0$-quanta
to radiation resistance, and therefore leads this 
time to conservation of energy but to diverging 
solutions. Such diverging solutions are, as indi- 
cated above, excluded by the least action prin- 
ciple so this form of the equation is not con- 
venient. Non-divergent motions do exist. To see 
this it is better to split off the retarded part of 
the $\lambda_0$-quanta. What results is that light goes
by retarded waves, $\lambda_0$-quanta by advanced
waves,’ and the radiation resistance of both 
contribute positively. Thus an accelerating 
charge will emit light, but it is predestined that 
negative energy $\lambda_0$-quanta were coming toward
it to be absorbed, still further increasing the 
radiation resistance. This avoids the divergent 
solutions only to predict observable advanced 
effects. 

For these reasons it is better to restrict one- 


™This may be understood in that, as indicated above, 
the energy-absorbing walls of the box absorb retarded 
light waves, but cannot be presumed to absorb retarded 
$\lambda_0$-quanta. Instead, in fact, they spontaneously emit such
waves (warming up in the process) and non-divergent
solutions result only if they emit just exactly the $\lambda_0$-quanta
which can later be absorbed by the accelerating
charge at the center. 
self to the case of a decaying f-function (distribution
of $\lambda$) for which a consistent theory
can be made. Then the modifications of classical 
electrodynamics will only appear at very small 
distances from a charge. On the other hand, 
these distances are well within the Compton 
wave-length so that modifications caused by 
quantum mechanics would in any case appear 
before the ones here discussed. There is, therefore,
little reason to believe that the ideas used
here to solve the divergences of classical electrodynamics
will prove fruitful for quantum electrodynamics.
Nevertheless, the corresponding
modifications were attempted with quantum 
electrodynamics and appear to solve some of 
the divergence difficulties of that theory. This 
will be discussed in a future paper. 
II.C Quantum Electrodynamics 


Because Feynman’s unconventional approach to QED did not use quantum field theory, he 
found it difficult to convince his peers of its validity. When he outlined his views to them 
at a small elite gathering, the Pocono Conference of March 30-April 1, 1948, as he later 
recalled, his lecture was “a hopeless presentation.” He was treated as an alien: “I had too 
much stuff. My machines came from too far away.”¹ He thus thought that it would be best 
first to describe his important physical results, and to show how they could be obtained by 
more standard methods. In this way he hoped to motivate his colleagues to make the effort 
to follow his more powerful new methods, which he would present later. 

In this section we group the papers containing physical results in QED, some already 
published by others, but some obtained correctly for the first time. The papers that detail 
Feynman’s new mathematical tools and derivations will be given in Part III below. 

Feynman began this acclimatization process by publishing paper [9], which includes a 
quantum-mechanical version of the relativistic high-frequency cutoff that he had introduced 
for classical electrodynamics in paper [8]. This method, an example of what would later 
be called “regularization,” made the divergent integrals of QED finite — except for the so- 
called vacuum polarization integrals, which would require a stronger form of regularization.² 
Using the “old-fashioned” perturbation theory of Dirac, Feynman showed that his cutoff 
procedure led to the same results for the electromagnetic shift of energy levels of hydrogen 
(Lamb shift) and the radiative corrections to potential scattering as had been published 
earlier by others, including Bethe, Schwinger, and Weisskopf. However, Feynman’s results 
avoided the subtraction of infinite integrals, using only finite ones that he showed were very 
insensitive to the value chosen for the cutoff. 

Paper [12] treats the “motion of electrons and positrons in given external potentials.” It 
introduces the overall space-time view, emphasizing the solutions of the Dirac relativistic 
electron equation in the integral form, the Green’s function or “propagator,” which is the 
probability amplitude for the electron to pass from one point in space-time to another. 
The allowed paths can propagate forward or backward in time, the time-backward paths 
being interpreted as positrons.³ Thus here are introduced the famous trademark “Feynman 
diagrams.” In an appendix, Feynman derives his formulation from the (misnamed) “second 
quantization” theory of the Dirac electron—positron field. 

Paper [13], submitted a month later than paper [12], appeared in the same issue of the 
Physical Review. A continuation of the first paper, it uses the same notation and diagrams, 
but now it attacks the general problem of QED, involving, besides the external potentials and 
photons, also the fields of the charged particles themselves, involving both real and virtual 
photons. This paper constitutes a textbook of Feynman’s powerful methods for solving real 
problems in QED, and describes his propagators and their formulation in momentum space. 
He discusses the electron self-energy problem, the convergence of processes with virtual 
quanta, the radiationless scattering problem, and the Lamb shift. His footnote (numbered 
13 — appropriately, as he remarked) on the history of the calculation of the relativistic Lamb 


¹Feynman interview with S.S. Schweber, 1984. See Schweber’s QED, p. 436. 

²W. Pauli and F. Villars, “On the invariant regularization in relativistic quantum theory,” Rev. Mod. Phys. 
21, 434 (1949). 

³This provocative notion is a major technical advance, but it has engendered as much (mostly meaningless) 
speculation about its “meaning” among the uninformed as has Einstein’s “relativity.” 
shift apologizes for delays caused by the disagreement of his slightly wrong result of early 
1948 with other (correct) calculations. He also calculates the vacuum polarization part of 
the Lamb shift, using a regularization method that he credits to Pauli and Bethe (the Pauli— 
Villars paper had not yet appeared). In footnote 18 he writes: “It would be very interesting 
to calculate the Lamb shift accurately enough to be sure the 20 megacycles expected from 
vacuum polarization are actually present.” This shows that he had not yet fully accepted 
the need for a quantum field theory, as opposed to delayed action-at-a-distance! 

Paper [18] is a calculation of the radiation corrections to the Klein—Nishina formula for 
Compton scattering. Even when making full use of the Feynman calculational tricks, it 
was a long and tedious calculation. However, the scattering of a photon by an electron 
can be regarded as the fundamental interaction described by QED, and unlike the other 
processes mentioned (Lamb shift, electron anomalous magnetic moment), it is not a single 
quantity but a complete differential cross-section, and it does not make use of an approximate 
nonrelativistic potential, like the radiationless scattering and the Lamb shift. Thus it is an 
especially appropriate testing ground for QED, as well as being an important observable 
effect at higher electron energies, which would be employed to analyze experiments in the 
decades that followed. 

As stated in the introduction, paper [45] is a review of QED as of 1961, included here for 
its originality and stimulating language. Feynman concludes it by referring to his “long-held 
strong prejudice that [QED] must fail significantly (other than being simply incomplete) at 
around 1 GeV virtual energy.”* Furthermore, he wrote, “I still hold this belief, and do not 
subscribe to the philosophy of renormalization.” 
A relativistic cut-off of high frequency quanta, similar to that suggested by Bopp, is shown 
to produce a finite invariant self-energy for a free electron. The electromagnetic line shift for a 
bound electron comes out as given by Bethe and Weisskopf’s wave packet prescription. The 
scattering of an electron in a potential, without radiation, is discussed. The cross section 
remains finite. The problem of polarization of the vacuum is not solved. Otherwise, the results 
will in general agree essentially with those calculated by the prescription of Schwinger. An 
alternative cut-off procedure analogous to one proposed by Wataghin, which eliminates high 
frequency intermediate states, is shown to do the same things but to offer to solve vacuum 
polarization problems as well. 


THE main problems of quantum electrodynamics have been essentially solved by
the observations of Bethe¹ and of Weisskopf² that
the divergent terms in the line shift problem can
be thought to be contained in a renormalization
of the mass of a free electron. That this principle
applies as well to other problems was demonstrated
by Lewis³ in analyzing the radiationless
scattering of an electron in a potential. Ambiguities
which remained in the subtraction
procedures are removed by Schwinger.⁴ He
formulated, in a general way, which terms are to 
be identified in a future correct theory with rest 
mass, and hence should be omitted from a cal- 
culation which does not renormalize the mass. 
These results are remarkable because they solve 
the problem without the addition of any new 
fundamental lengths or dimensions. 

The solution given by Schwinger does, how- 
ever, assume that in some future theory the 
divergent self-energy terms will be finite. There- 
fore, it is of interest to point out that there is a 
model, a modification of ordinary electrody- 
namics, for which all quantities automatically do 
come out finite. With this model the ideas of 
Bethe, Oppenheimer, and Lewis and Schwinger 
can be directly confirmed. 

The model results from the quantization of a 
classical theory described in a previous paper.⁵ 


¹H. A. Bethe, Phys. Rev. 72, 339 (1947); 73, 1271A (1948).

²J. Schwinger and V. Weisskopf, Phys. Rev. 73, 1272A (1948).

³H. W. Lewis, Phys. Rev. 73, 173 (1948).

⁴J. Schwinger, Phys. Rev. 73, 415A (1948).

⁵R. P. Feynman, Phys. Rev. 74, 939 (1948).


In this paper we describe only the results for 
processes in which only virtual quanta are 
emitted and absorbed. The problems of per- 
manent emission and the position of positron 
theory must be more completely studied. It is 
hoped that a complete physical theory may be 
published in the near future. Lacking such a 
complete picture, the present paper may be looked 
upon merely as presenting an arbitrary rule to 
cut off at high frequencies in a relativistically 
invariant manner, the otherwise divergent in- 
tegrals appearing in quantum field theories. For 
electrodynamics the rule is to consider the (positive) frequency $\omega$ and wave number
$\mathbf{k}$ of the field oscillators as independent and to integrate them over the density
function $g(\omega^2-k^2)\,d\omega\,d\mathbf{k}$ where

$$g(\omega^2-k^2)=\int_0^\infty\left[\delta(\omega^2-k^2)-\delta(\omega^2-k^2-\lambda^2)\right]G(\lambda)\,d\lambda. \tag{1}$$

Here $\delta(x)$ is Dirac’s delta function and $G(\lambda)$ is some smooth function such that
$\int_0^\infty G(\lambda)\,d\lambda=1$ and for which the mean values of $\lambda$ which are
important are of order of the frequency $137\,mc^2/\hbar$, or higher. Ordinary quantum
electrodynamics replaces the function $g(\omega^2-k^2)$ by $\delta(\omega^2-k^2)$. According
to (1), the density $g$ is not everywhere positive.* Therefore, the model is essentially that
due to Bopp.⁶

The model therefore contains an arbitrary function and the numerical results depend on the
form of $G(\lambda)$. However, the only term that depends seriously (logarithmically) on the
cut-off frequency is the self-energy, which can be used to renormalize the electron mass.
After this is done, the remaining terms are nearly independent of the function $G(\lambda)$.

⁶F. Bopp, Ann. d. Physik 42, 573 (1942).

We shall illustrate these points by studying 
the particular examples of self-energy and radi- 
ationless scattering. We shall then discuss an 
alternative cut-off procedure in which the density 
of electron states is cut off rather than that of 
the quanta. This promises to solve problems of 
vacuum polarization which are not touched by 
the former procedure. 


SELF-ENERGY 


The transverse self-energy of a free electron, of mechanical mass $\mu$, in state of
momentum $\mathbf{P}_0$, energy $E_0=(\mu^2+P_0^2)^{1/2}$ is given to the first order
in $e^2$ by the second-order perturbation theory,
using the one-electron theory of Dirac, by


$$\Delta E=\frac{e^2}{4\pi^2}\int\frac{d\mathbf{k}}{k}\sum_i\left[\sum_{+}\frac{(0|\alpha_i|f)(f|\alpha_i|0)}{E_0-E_f-k}+\sum_{-}\frac{(0|\alpha_i|f)(f|\alpha_i|0)}{E_0+E_f-k}\right]. \tag{2}$$


Here the intermediate state $f$ arises from the initial state through emission of a quantum
of momentum $\mathbf{k}$ and of energy $k=|\mathbf{k}|$ (the velocity of light is taken as
unity, as is Planck’s constant). Thus in the intermediate state the electron has momentum
$\mathbf{P}_f=\mathbf{P}_0-\mathbf{k}$ and an energy of magnitude
$E_f=+(\mu^2+P_f^2)^{1/2}$ but which may be either plus or minus in sign. The sums indicate
the sum over all such intermediate states (actually just two) for each sign of the energy.
The terms for positive and negative energy have been separated and the sums are written
$\sum_+$ and $\sum_-$ for these two cases. The $(f|\alpha_i|0)$ are the matrix elements of
Dirac’s $\alpha$-matrices, the sum on $i$ being over the two directions of polarization of
the quanta. We shall henceforth write the integral $d\mathbf{k}/k$ over $\mathbf{k}$ space by
its equivalent $2\int d\omega\,d\mathbf{k}\,\delta(\omega^2-k^2)$, the integral being over all
positive $\omega$, and all wave numbers $\mathbf{k}$. We shall also write $\omega$ for $k$ in
the energy denominators as we shall later wish to distinguish the energy of a quantum and the
magnitude of the momentum change that its recoil represents. We may further simplify the
expression by the use of the well-known projection operators:

$$\Lambda_f^{\pm}=(E_f\pm H_f)/2E_f=(E_f\pm\boldsymbol{\alpha}\cdot\mathbf{P}_f\pm\beta\mu)/2E_f.$$
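
As a small numerical illustration of these projection operators, the sketch below builds the Dirac $\alpha$ and $\beta$ matrices in the standard (Dirac) representation, which is an assumption of this example since the text does not fix a representation, and forms $\Lambda^{\pm}$ for an arbitrary momentum. The helper names (`dirac_hamiltonian`, `projectors`) and the sample momentum are hypothetical.

```python
import numpy as np

# Pauli matrices and the Dirac alpha, beta matrices (standard representation, assumed).
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])

def dirac_hamiltonian(p, mu):
    """H = alpha . p + beta * mu (units with c = hbar = 1)."""
    return sum(p_i * a_i for p_i, a_i in zip(p, alpha)) + mu * beta

def projectors(p, mu):
    """Return (Lambda_plus, Lambda_minus, H, E) for momentum p and mass mu."""
    H = dirac_hamiltonian(p, mu)
    E = np.sqrt(mu**2 + np.dot(p, p))
    I4 = np.eye(4)
    return (E * I4 + H) / (2 * E), (E * I4 - H) / (2 * E), H, E

P_f = np.array([0.3, -1.2, 0.7])   # arbitrary illustrative momentum
mu = 1.0
lam_plus, lam_minus, H_f, E_f = projectors(P_f, mu)
print(np.allclose(lam_plus @ lam_plus, lam_plus))   # idempotent
print(np.allclose(lam_plus @ lam_minus, 0))         # mutually orthogonal
```

The two operators are idempotent, mutually orthogonal, sum to unity, and select the eigenvalues $\pm E_f$ of $H_f$, which is what allows the sums over intermediate spin states to be replaced by matrix products.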


According to the theory of holes, the last 
term, the transition to negative energy states, is 
to be left out; such transitions are prevented 
because the negative levels are already occupied. 
On the other hand, in the vacuum, electrons in
state of energy $-E_f$ could make virtual transitions
to positive energy state $E_0$. This is now
prevented by the presence of an electron in the
state $E_0$, so that, relative to the vacuum, the
transverse self-energy is


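$$\Delta E=-\frac{e^2}{2\pi^2}\sum_i\int d\omega\,d\mathbf{k}\,\delta(\omega^2-k^2)\left\{\frac{(0|\alpha_i\Lambda_f^{+}\alpha_i|0)}{E_f-E_0+\omega}-\frac{(0|\alpha_i\Lambda_f^{-}\alpha_i|0)}{E_f+E_0+\omega}\right\}. \tag{3}$$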


The treatment of the longitudinal self-energy is usually different, for the longitudinal
oscillators are first eliminated from the Hamiltonian, their effect being the term
$e^2/r_{00}$ where $r_{00}$ is the meaningless distance of the electron from itself. These
terms must be expressed as integrals over oscillators and combined with (3) before the change
suggested by (1) is to be performed. An additional point of confusion is that the
longitudinal elimination assumes the intermediate states to form a complete set as they do in
(2), but the situation in (3) is not so clear. Fortunately, all these points may be most
easily circumvented by simply not eliminating the longitudinal oscillators from the field
Hamiltonian at all. One need simply specify that the sum on $i$ in (3) now be interpreted to
mean the sum over each of three perpendicular space directions minus a term for the time
direction. We may write
$\sum_i\alpha_iA\alpha_i=\boldsymbol{\alpha}\cdot A\boldsymbol{\alpha}-A$, which is a
relativistic combination since $\alpha_4=1$. One does not need to be concerned about the gauge
condition in a problem in which all quanta are virtual, for the quanta are created by a charge
which is conserved. This solution automatically insures the gauge condition just as the
Liénard-Wiechert classical solution of the Maxwell equations will automatically satisfy the
gauge condition if the charge which produces the potential is conserved.

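With this convention for $\sum_i$, Eq. (3) represents the total self-energy. It is easily
calculated. The numerator of the first term may be written as $1/2E_f$ times
$\sum_i(0|\alpha_i(H_f+E_f)\alpha_i|0)$ where $H_f$ is
$\boldsymbol{\alpha}\cdot\mathbf{P}_f+\beta\mu$. Now since $\sum_i\alpha_i\alpha_i=+2$,
$\sum_i\alpha_i\beta\alpha_i=-4\beta$, and
$\sum_i\alpha_i\boldsymbol{\alpha}\alpha_i=-2\boldsymbol{\alpha}$, this becomes

$$-2\,(0|-E_f+2\beta\mu+\boldsymbol{\alpha}\cdot\mathbf{P}_f|0).$$

The diagonal elements of $\beta$ and $\boldsymbol{\alpha}$ for the state 0 are $\mu/E_0$ and
$\mathbf{P}_0/E_0$, respectively.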
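
The three matrix sums quoted above follow from the anticommutation rules of the Dirac matrices together with the stated convention that the sum on $i$ runs over the three space directions minus a term for the time direction. A minimal numerical check is sketched below; the standard representation of the matrices and the helper name `feynman_sum` are assumptions made for this illustration.

```python
import numpy as np

# Dirac matrices in the standard representation (assumed for this check).
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])
I4 = np.eye(4, dtype=complex)

def feynman_sum(A):
    """Sum over three space directions of alpha_k A alpha_k, minus the time term A."""
    return sum(a @ A @ a for a in alpha) - A

sum_unit = feynman_sum(I4)                     # expected +2
sum_beta = feynman_sum(beta)                   # expected -4 beta
sum_alpha = [feynman_sum(a) for a in alpha]    # expected -2 alpha_j for each j
print(np.allclose(sum_unit, 2 * I4), np.allclose(sum_beta, -4 * beta))
```

With only the two transverse polarizations the sums would depend on the direction of $\mathbf{k}$; it is the four-term convention that yields these direction-independent values.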

The change in energy $\Delta E$ can, since the momentum is given, be represented as a change
$\Delta\mu$ in rest mass of the electron. In virtue of the general relation $E^2=\mu^2+P^2$,
the relation between these quantities is $\mu\,\Delta\mu=E_0\,\Delta E_0$. Thus we find,
treating the sum of negative energies in a similar manner,

$$\mu\,\Delta\mu_0=\frac{e^2}{2\pi^2}\int d\omega\,d\mathbf{k}\,\delta(\omega^2-k^2)\left\{\frac{2\mu^2-E_0E_f+\mathbf{P}_0\cdot\mathbf{P}_f}{E_f(E_f-E_0+\omega)}+\frac{2\mu^2+E_0E_f+\mathbf{P}_0\cdot\mathbf{P}_f}{E_f(E_f+E_0+\omega)}\right\}. \tag{4}$$

The integral diverges logarithmically and $\Delta\mu_0$ defined here is meaningless. If the
$\delta(\omega^2-k^2)$ is replaced by $g(\omega^2-k^2)$ defined in (1), the result is finite
and invariant (i.e., does not depend on the momentum $\mathbf{P}_0$ of the electron).

How this comes about may be seen by calculating the integral in (4) for

$$g(\omega^2-k^2)=\delta(\omega^2-k^2)-\delta(\omega^2-k^2-\lambda^2)$$

and reserving an integration on $\lambda$ until later. The integral (4) will converge with
this $g(\omega^2-k^2)$, but it is convenient to divide it for purposes of calculation into the
difference of two diverging ones.

This is legitimate providing the divergent integrals are first both computed over the same
finite region of $\mathbf{k}$ space, the difference taken, and then the region allowed to pass
to infinity. Therefore, we shall define $\Delta\mu_0$ by (4), in which we choose the region
arbitrarily to be first over all (positive) $\omega$ and then over a sphere in $\mathbf{k}$
space of very large radius $K$. Likewise $\Delta\mu_\lambda$ is defined as expression (4) with
$\delta(\omega^2-k^2-\lambda^2)$ replacing $\delta(\omega^2-k^2)$, and the integration taken
over the same region.
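
The need to use the same finite region for both divergent pieces can be seen in a toy example. The sketch below is not the integral (4); it uses the much simpler one-dimensional integral $\int_0^K dk\,(k^2+m^2)^{-1/2}=\sinh^{-1}(K/m)$, chosen here only because it is also logarithmically divergent. The function names are hypothetical.

```python
import math

def log_divergent(K, m):
    """Closed form of the toy integral int_0^K dk / sqrt(k^2 + m^2)."""
    return math.asinh(K / m)

def regulated_difference(K, mu, lam):
    """Difference of two divergent pieces cut off at the SAME K."""
    return log_divergent(K, mu) - log_divergent(K, lam)

def mismatched_difference(K, mu, lam):
    """Same difference, but with the second piece cut off at 2K instead of K."""
    return log_divergent(K, mu) - log_divergent(2 * K, lam)

mu, lam = 1.0, 137.0
for K in (1e4, 1e6, 1e8):
    print(K, log_divergent(K, mu), regulated_difference(K, mu, lam))
same_region = regulated_difference(1e8, mu, lam)
different_region = mismatched_difference(1e8, mu, lam)
```

Each piece grows without bound as $K$ increases, while the difference over a common region settles at $\ln(\lambda/\mu)$. Cutting the two pieces off at different radii shifts the answer by a finite constant (here $\ln 2$), which is why the region must be fixed before the difference is taken.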
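The true self-mass is therefore

$$\Delta\mu=\int_0^\infty\left[\lim_{K\to\infty}(\Delta\mu_0-\Delta\mu_\lambda)\right]G(\lambda)\,d\lambda. \tag{5}$$

We may now calculate these integrals, starting with $\Delta\mu_\lambda$. Since
$\mathbf{P}_0\cdot\mathbf{P}_f=\mathbf{P}_0\cdot(\mathbf{P}_0-\mathbf{k})=E_0^2-\mu^2-\mathbf{P}_0\cdot\mathbf{k}$
and $E_f^2=E_0^2+k^2-2\mathbf{P}_0\cdot\mathbf{k}$, the $\mathbf{P}_0\cdot\mathbf{P}_f$ term in
the numerator of the first term may be eliminated, the numerator becoming

$$\tfrac12(E_0^2+E_f^2-k^2)-E_0E_f+\mu^2=\mu^2+\tfrac12(\omega^2-k^2)+\tfrac12(E_f-E_0-\omega)(E_f-E_0+\omega).$$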
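
The rearrangement of the numerator is purely algebraic and holds for any value of $\omega$, which makes it easy to spot-check numerically. The following sketch evaluates both forms of the numerator for random momenta; the function name `numerator_sides` and the random test values are illustrative assumptions.

```python
import numpy as np

def numerator_sides(P0, k, mu, omega):
    """Return both forms of the numerator of the first term in Eq. (4)."""
    Pf = P0 - k
    E0 = np.sqrt(mu**2 + P0 @ P0)
    Ef = np.sqrt(mu**2 + Pf @ Pf)
    original = 2 * mu**2 - E0 * Ef + P0 @ Pf
    rearranged = (mu**2 + 0.5 * (omega**2 - k @ k)
                  + 0.5 * (Ef - E0 - omega) * (Ef - E0 + omega))
    return original, rearranged

rng = np.random.default_rng(0)
checks = []
for _ in range(5):
    P0 = rng.normal(size=3)
    k = rng.normal(size=3)
    mu = rng.uniform(0.5, 2.0)
    omega = rng.uniform(0.1, 5.0)   # arbitrary: the identity does not depend on omega
    checks.append(numerator_sides(P0, k, mu, omega))
print(checks[0])
```

The value of $\omega$ drops out of the rearranged form, so the split into a piece proportional to $\mu^2+\tfrac12(\omega^2-k^2)$ and a piece that cancels the energy denominator is a matter of convenience only.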


Thus the first term in $\Delta\mu_\lambda$ becomes

$$\int\frac{\mu^2+\tfrac12(\omega^2-k^2)}{E_f(E_f-E_0+\omega)}\,\delta(\omega^2-k^2-\lambda^2)\,d\omega\,d\mathbf{k}+\tfrac12\int\delta(\omega^2-k^2-\lambda^2)\,d\omega\,d\mathbf{k}\,(E_f-E_0-\omega)/E_f. \tag{6}$$

Adding the corresponding second term which differs from the first only in the sign of $E_0$,
and performing the integral on $\omega$ (which requires simply division by $2\omega$), we find

$$(2\pi^2/e^2)\,\mu\,\Delta\mu_\lambda=(\mu^2+\tfrac12\lambda^2)\int\frac{1}{(E_f+\omega)^2-E_0^2}\,\frac{E_f+\omega}{E_f}\,\frac{d\mathbf{k}}{\omega}+\frac12\int\frac{d\mathbf{k}}{\omega}-\frac12\int\frac{d\mathbf{k}}{E_f},$$

where $\omega=(k^2+\lambda^2)^{1/2}$ and the integration is to be taken over a sphere of
radius $K$ in $\mathbf{k}$ space. The first and, obviously, the second integrals turn out to
be invariant; the third is not, but its contribution will cancel out on taking
$\Delta\mu_0-\Delta\mu_\lambda$ as it does not depend on $\lambda$.⁷ The result of the
integrations⁸ is, dropping terms of order $1/K$ and

⁷Pais has suggested that one subtract from $\Delta\mu_0$ the $-\Delta\mu_\lambda$ that one
gets not from electrodynamics but from the scalar f field (for which $\beta\cdots\beta$
replaces $\sum_i\alpha_i\cdots\alpha_i$). Proceeding in this way the integrals
$\int d\mathbf{k}/E_f$ do not appear with the same coefficient. Therefore, although this
procedure leads to a finite rest mass it is not invariant in the sense here, that the limits
of $\mathbf{k}$ space integration can be taken to be independent of the momentum of the
electron. A. Pais, Kon. Ned. Akad. v. Wet. Verh. D1, 19, 1 (1947).

⁸The first integral may be performed in the following manner: First integrate over the
directions in $\mathbf{k}$ space at constant magnitude $k$. Only $E_f$ depends on the
direction of $\mathbf{k}$ and one may therefore replace the solid angle integral by one on
$E_f$. The limits of $E_f$ are $E_+=(\mu^2+(P_0+k)^2)^{1/2}$ and
$E_-=(\mu^2+(P_0-k)^2)^{1/2}$ but both terms may be considered together as one if the integral
on $k$ be extended from $-K$ to $K$ instead of 0 to $K$. To integrate this on $k$, substitute
the variable $x=E_f+\omega-E_0$ and (the algebra is long) integrate by parts to reduce it to
elementary integrals.
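smaller:

$$(\pi/e^2)\,\mu\,\Delta\mu_\lambda=(\mu^2+\tfrac12\lambda^2)\left(N_\lambda+\mu^2X_\lambda(\mu,\mu)\right)+\tfrac12\left[K^2-\lambda^2\left(\ln(2K/\lambda)-\tfrac12\right)\right]-\tfrac12\left[K^2-\tfrac13P_0^2-\mu^2\left(\ln(2K/\mu)-\tfrac12\right)\right],$$

where

$$N_\lambda=N_0-\left[\lambda^2/(\lambda^2-\mu^2)\right]\ln(\lambda/\mu), \tag{7a}$$

with $N_0=\ln(2K/\mu)-\tfrac12$, and the quantity $X_\lambda(\mu,\mu)$ is finite as
$K\to\infty$. It is given by setting $\mu_0=\mu$ in the complicated expression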


Quo X(t, oo) = ((A?— w? — 0”)? — 47440”)! 
NPE pw? — Mo? + ((A? = w? — Wo”)? — 40”)! 
n 
2ru 


2d? U0" 
+ (1+ mt 
IN 


u 
) In--tys?. (7b) 
—p? d 
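Thus $X_0(\mu,\mu)=1/2\mu^2$ and for $\lambda$ large compared to $\mu$,
$X_\lambda(\mu,\mu)=1/4\lambda^2$. Hence

$$(\pi/e^2)\,\mu\,(\Delta\mu_0-\Delta\mu_\lambda)=\tfrac12\mu^2+\tfrac32\,\frac{\lambda^2\mu^2}{\lambda^2-\mu^2}\,\ln(\lambda/\mu)-(\mu^2+\tfrac12\lambda^2)\,\mu^2X_\lambda(\mu,\mu), \tag{8}$$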


which is independent of $K$ (in the limit $K\to\infty$). If the important values of $\lambda$
are much greater than $\mu$, we find approximately (to terms of order $(\mu/\lambda)^2$)

$$\Delta\mu=\mu\,(e^2/\pi)\left[\tfrac32\ln(\lambda_0/\mu)+\tfrac38\right], \tag{9}$$

where

$$\ln\lambda_0=\int_0^\infty\ln\lambda\,G(\lambda)\,d\lambda.$$

Judging from the classical case we would have expected to take $\lambda_0$ of order $137\mu$,
for then all mass would be electromagnetic. But $\Delta\mu$ here is too small for this to
represent a real possibility. The experimental electron mass $m$ is of course
$\mu+\Delta\mu$.

The value of $\lambda_0$ would have to be of phenomenal size ($\sim e^{137}$) before
$\Delta\mu$ can represent a sizeable fraction of the experimental mass. However, to go to the
limit of the conventional electrodynamics, $\lambda_0$ should be taken as infinite. Then the
self-energy diverges logarithmically in the manner found by Weisskopf.⁹
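
The orders of magnitude in this discussion can be made concrete. The sketch below takes Eq. (9) in the form $\Delta\mu/\mu=(e^2/\pi)\left[\tfrac32\ln(\lambda_0/\mu)+\tfrac38\right]$ (the fractions are as read from the damaged print, so they are an assumption of this example) and uses $e^2=1/137.036$ in units with $\hbar=c=1$, a numerical value not given in the text. The function names are hypothetical.

```python
import math

E2 = 1 / 137.036   # fine-structure constant e^2 (hbar = c = 1), assumed value

def self_mass_fraction(cutoff_over_mu):
    """Delta mu / mu from Eq. (9) for a given ratio lambda_0 / mu."""
    return (E2 / math.pi) * (1.5 * math.log(cutoff_over_mu) + 0.375)

def log_cutoff_for_fraction(fraction):
    """ln(lambda_0 / mu) needed for Delta mu / mu to reach the given fraction."""
    return (fraction * math.pi / E2 - 0.375) / 1.5

fraction_at_137 = self_mass_fraction(137.0)
log_cutoff_half = log_cutoff_for_fraction(0.5)
log_cutoff_full = log_cutoff_for_fraction(1.0)
print(fraction_at_137, log_cutoff_half, log_cutoff_full)
```

With $\lambda_0=137\mu$ the electromagnetic mass is only about two percent of $\mu$, while making $\Delta\mu$ comparable to $\mu$ requires $\ln(\lambda_0/\mu)$ in the hundreds, in line with the remark that $\lambda_0$ would have to be of phenomenal size.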


⁹V. Weisskopf, Phys. Rev. 56, 72 (1939). 
The emission and subsequent absorption of a quantum acts similarly to the effect of a change
in mass not only on the diagonal matrix element which we have just calculated, but on
non-diagonal elements as well. Consider that the state appearing on the left of all the
matrices in (3) were arbitrary, say $x$. Then the numerator of the first term can be
expressed, as we have seen, by
$(-1/E_f)(x|-E_f+2\beta\mu+\boldsymbol{\alpha}\cdot\mathbf{P}_f|0)$. The second term can be
expressed similarly. The two terms can be combined so that the whole expression in brackets
in (3) can be written

$$-2\,\frac{(x|-E_fE_0+(E_f+\omega)(2\beta\mu+\boldsymbol{\alpha}\cdot\mathbf{P}_0-\boldsymbol{\alpha}\cdot\mathbf{k})|0)}{E_f\left((E_f+\omega)^2-E_0^2\right)}. \tag{10}$$

This expression may be multiplied by $\delta(\omega^2-k^2-\lambda^2)$ and integrated with
respect to $\omega$ and over a sphere of radius $K$ in $\mathbf{k}$ space. We make use of the
following integrals which can be directly verified:


f 1 E;yta 
(B;+0)*?— Ey? Ey 


5(w? —k? —d*)dudk/a 
= Ny+ 40? X x(n, bo), 
1 
f eh (Sead 
(E;+w)?— Eo? 
= FNy+4(u?+ wo?— A?) Xa(u, wo, 
f k E;+ 
(Ey+w)*? — Eo? E; 
=2P ol a+ Nat (M+ mo? — w7)XX(u, bo) J. 


The integrals have been calculated under the assumption that $E_0^2=\mu_0^2+P_0^2$. In our
application we should take $\mu_0=\mu$. The quantities $N_\lambda$ and
$X_\lambda(\mu,\mu_0)$ are given by (7a), (7b). (The extra parameter $\mu_0$ is helpful in
obtaining other integrals, useful in the radiationless scattering problem, by
differentiations with respect to the various parameters under the integral sign.)

The result of integrating (10) with the density
$\delta(\omega^2-k^2)-\delta(\omega^2-k^2-\lambda^2)$ is therefore


(e?/m) (x| —Eo(3(No— Ny) +3 — (u?— 3A*) Xd) 
+ (2Bu+e@-Po)(No—Ny+3— 1? X)) 
— $a-Po(No— Na—d?X)) |0). 


(11) 


® 
5(w?— k? —)*)dwdk/a 
Now the energy of state 0 is $E_0$ so that
$\boldsymbol{\alpha}\cdot\mathbf{P}_0=H_0-\beta\mu$ is equivalent to $E_0-\beta\mu$, since it
operates on state 0 (no implication about state $x$ is involved). Making this replacement,
all the terms in $E_0$ are seen to cancel and the result is simply

$$(x|\beta|0)\cdot(\Delta\mu_0-\Delta\mu_\lambda), \tag{12}$$

where $\Delta\mu_0-\Delta\mu_\lambda$ is given in (8). On integrating over
$G(\lambda)\,d\lambda$ then we find $(x|\beta\,\Delta\mu|0)$. But this is just the
perturbation element which would result from a change of mass by $\Delta\mu$ in the Dirac
equation.

We may use this result to show that the level shift for an electron in a bound state given in
the present theory will be essentially that given by Weisskopf and Bethe according to their
so-called wave-packet method. The change in energy of our electron in a bound state may be
calculated in a straightforward manner according to the present formulation. One would simply
start with Eq. (2) but with the wave functions and energies for states 0 and $f$ being
appropriate for the potential by which the electron is bound. Then one would integrate over
$g(\omega^2-k^2)$ rather than $\delta(\omega^2-k^2)$ and obtain a definite finite result. The
result would show a fairly large change in $E_0$ depending logarithmically on $\lambda$.

A good part of this change could be accounted for as simply due to the change in $E_0$ that
would occur if the mass of the electron were altered from $\mu$ to $m=\mu+\Delta\mu$. We can
define the true term shift, then, as the complete change in $E_0$, less
$\Delta\mu\,(\partial E_0/\partial\mu)$, the change due to using $\mu$ instead of $m$ in
computing the energy with radiation absent. But $\partial E_0/\partial\mu$ is by perturbation
theory the expected value $(\psi_0^*|\beta|\psi_0)$ of $\beta$ for the state $\psi_0$ in
question. From the result (12), however, this is also equivalent to computing the self-energy
of a wave packet $\psi_0$, assuming the electron as free. But Bethe¹ and Weisskopf² compute
their term shift by just this prescription: the total effect less the self-energy of the free
packet. The only difference here is that we would compute the term shift integral on
$g(\omega^2-k^2)$ rather than $\delta(\omega^2-k^2)$. But since the integral converges either
way, the difference between the two results is very small, being of order of
$(\mu^2/\lambda_0^2)$ times smaller than the result.
RADIATIONLESS SCATTERING 


We can study the radiationless scattering problem in a similar manner. This problem is the
correction to the scattering by a first-order potential due to the possibility of emission
and absorption of a virtual quantum. For example, this emission and absorption can occur at
any time previous to the scattering. (It would, in this case, be nearly equivalent to a
change in mass in the wave function of the electron arriving at the scatterer.) There will be
a large change in cross section, which would be expected as the result of a change in mass of
the electron plus a smaller change caused essentially by emissions previous to and
absorptions subsequent to the scattering. As in the case of the self-energy in a field and,
in fact, in all such problems, we will really be interested in those effects of radiation
over and above that resulting from the change in mass. It is, therefore, simpler to compute
the difference between the desired quantity calculated with no radiation and electrons of
mass $m$, and the same quantity computed with the possibility of a virtual quantum emission
and absorption with an electron of mass $\mu$. This difference, which we shall call the
radiative correction, can be looked upon as the result of perturbation due to the addition to
the Hamiltonian of both the radiative interaction terms and a term $-\Delta\mu$. The latter
term can, as we have shown, be represented by the integral over oscillators of
aAyta; asAys-ay 


-2/( es ) (13) 
+ \Ey;-Eota Ey+Eoto 


when acting on a free electron state of positive energy $E_0$ and momentum $\mathbf{P}_0$.
When acting on a state of negative energy $-E_0$, the term can be shown in a similar manner
to be the expression (13) with the sign of $E_0$ changed in the denominator.

Terms like these are just the ones that Schwinger⁴ thought should be omitted from the
Hamiltonian if one wishes to get meaningful results, so that the present model agrees with
Schwinger’s prescription.

When this process is applied to the scattering problem to obtain the radiative correction to
the matrix elements, we are left with several
residual terms. First, the emissions and absorptions previous to scattering are not exactly
equivalent to a change in mass. If the emission occurs too close (in time) to the scattering,
the absorption must occur in a restricted time, rather than at leisure as for a free electron
forming $\beta\,\Delta\mu$. The correction to the matrix element (in the theory of holes) for
this is proportional to


(2 | VA itasAstas| 1) 
(Ey+o—E,)* 


HE 
t 


(2 VA, taAs-ai| 1) 


(14) 
(E;+o+E£;)? 


9 2 
a 


We assume the potential $V$ (vector or scalar) depending on position like
$e^{i\mathbf{q}\cdot\mathbf{R}}$ and time like $e^{-iQt}$ induces transitions from a state 1
of momentum $\mathbf{P}_1$, energy $E_1$, to the state 2 of momentum
$\mathbf{P}_2=\mathbf{P}_1+\mathbf{q}$, energy $E_2=E_1+Q=(\mu^2+P_2^2)^{1/2}$. The operator
$V$ is just 1 for scalar potential, $\alpha_x$ for vector potential in $x$ direction, etc.
The term (14) represents only that contribution due to a quantum of momentum $\mathbf{k}$,
frequency $\omega$. We expect later to integrate over $\omega$ and $\mathbf{k}$, times
$g(\omega^2-k^2)$. We put $\mathbf{P}_f=\mathbf{P}_1-\mathbf{k}$,
$E_f=(\mu^2+P_f^2)^{1/2}$. This term can also be regarded as due to the second-order
normalization correction in the ordinary perturbation theory on the incoming wave function.
There is a corresponding correction for the final wave function resulting from virtual quanta
emitted and absorbed after the scattering
($\mathbf{P}_g=\mathbf{P}_2-\mathbf{k}$, $E_g=(\mu^2+P_g^2)^{1/2}$):


: (2|aAgtasAgt V1) 


"2 (E,+w—E)? 


(2| aiAg-aiAzt V|1) 
(E,twtE:)? — 


bole 


2X (15) 


All the effects of $\beta\,\Delta\mu$ are now included. The
remaining terms are those for which the potential
scattering occurs between the emission and absorption.
They may be worked out as by
Dancoff¹⁰ (except that we include the longitudinal


¹⁰S. M. Dancoff, Phys. Rev. 55, 959 (1939). 
waves by summing $i$ from 1 to 4). They are 


(2|aiAgt VAstax:| 1) 
ae 
i (E,+a—E,)(E,+o— E2) 
(2| a:Ag~ VAs~ a: | 1) 
—_— SO) 
i (Es+o+£,)(E,+0+ Ep) 
and 


(2 | a;Agt VAs-a;| 1) 
7 (E,-+w—Es)(E;to+E)) 


2w 
x{14+——__] 
BjAE,—Ej45; 
(2 | a;Ag— VAsta:| 1) 
i (E;to— E))(E,+o+E:) 


aE te (17) 
E;+E,+ £E,.—-E, 


Although each separate term diverges, the sum of (14), (15), (16), (17) will lead to an
integral convergent for large $k$ even if integrated in the conventional manner on
$\delta(\omega^2-k^2)$. This is the result of Lewis. Integration on $g(\omega^2-k^2)$ will
make each term converge for large $k$, but will then only make correction to the sum of order
$(\mu/\lambda)^2$ smaller. These we shall neglect.

The integrals do, however, diverge logarithmically at the lower limit of small momentum
transfer. This infra-red catastrophe has been completely cleared up by Bloch and
Nordsieck.¹¹ They show that for very long wave-length quanta the amplitude for emission and
reabsorption of more than one quantum is not negligible. Inclusion of these higher order
terms, which is necessary only in the non-relativistic region, solves the problem. To keep
the results given here in a simple form, we can imagine the integrals to be performed down to
some minimum momentum $k_{\min}$, small compared to $\mu$. What is effectively the same thing
but which is easier (because relativistic invariance is maintained) for practical purposes,
is to imagine that the quanta have a very small rest mass $\lambda_{\min}$. Thus we integrate
the density

$$\delta(\omega^2-k^2-\lambda_{\min}^2)\,d\omega\,d\mathbf{k}$$

and assume $\lambda_{\min}\ll\mu$. The two methods are equivalent if one replaces
$\ln\lambda_{\min}$ by $\ln(2k_{\min})-1$.

¹¹F. Bloch and A. Nordsieck, Phys. Rev. 52, 54 (1937).

The integrals may be expanded in powers of $\mathbf{q}$ and $Q$, say up to the second.¹² The
constant term vanishes on integration. The integrals appearing may all be expressed in terms
of various parametric derivatives of the integrals already given in (11). The result may be
expressed in terms of a general potential in a very simple way. A term linear in
$\mathbf{q}$, such as proportional to $q_x$, say, is equivalent to taking the matrix element
$(2|q_x\exp(i\mathbf{q}\cdot\mathbf{R})|1)$ directly between the two states 2, 1. But this is
also equivalent to the matrix element of
$-i(\partial/\partial x)\exp(i\mathbf{q}\cdot\mathbf{R})$. Thus if the potential varied in any
other manner in space, one sees by superposition that the matrix element is the same as that
of $-i\,\partial V/\partial x$. Thus the terms up to second order can be represented by matrix
elements of first and second space and time derivatives of the potential. That is, the
radiative correction to the scattering in any potential is equivalent to the first order in
$e^2$ and in the potential, to the scattering produced by a perturbation $\Delta H$ to the
Dirac Hamiltonian. The perturbation up to terms of first and second derivatives of the vector
potential $\mathbf{A}$ and the scalar potential $\varphi$ is calculated in this manner
to be

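$$\Delta H = \frac{e^2}{2\pi\hbar c}\left[\frac{\hbar e}{2\mu c}\bigl(\beta(\boldsymbol{\sigma}\cdot\mathbf{B}) - i\beta\,\boldsymbol{\alpha}\cdot\mathbf{E}\bigr) + \frac{2\hbar^2 e}{3\mu^2c^2}\bigl(\Box^2\varphi - \boldsymbol{\alpha}\cdot\Box^2\mathbf{A}\bigr)\left(\ln\frac{\mu}{\lambda_{\min}} - \frac{3}{8}\right)\right]. \tag{18}$$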

The first term, where $\mathbf{B}=\nabla\times\mathbf{A}$ and
$\mathbf{E}=-\nabla\varphi-(1/c)\,\partial\mathbf{A}/\partial t$, has the same effect
as an alteration in the electron magnetic moment by a
fraction $e^2/2\pi\hbar c$. This effect was first discovered by
Schwinger.¹


LINE SHIFT 


The perturbation to $H$ given here is useful not
only for scattering problems but also for the
line-shift problem. The actual motion of an
electron in a binding potential can be visualized
as simply a continued sequence of scatterings in
this potential. For each scattering we can calculate
the effect of virtual quanta in the way
outlined above. However, it is possible, if the
potential is strong, that two scatterings occur
between the emission and reabsorption of the
quantum, in which case the above formula for $\Delta H$
is incorrect. In hydrogen the potential over most
of the atom is sufficiently weak that this does not
occur with effective probability. The very long
wave-length quanta do have a tendency to exist
in the virtual state for long periods, but they
have been eliminated by the cut-off $\lambda_{\min}$ at low
frequencies.

12 The integrals have also been worked out, by other
methods, for arbitrarily large $q$ and $Q$. These will appear in
a future publication.

13 W. Pauli, Handbuch der Physik (1933), Vol. 24/1, p.

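In hydrogen, then, the line shift due to quanta
above minimum wave number $k_{\min}$ is the expected
value, for the state in question, of

$$\Delta H = \frac{e^2}{2\pi\hbar c}\left[\frac{\hbar e}{2\mu c}\,i\beta\,\boldsymbol{\alpha}\cdot\nabla\varphi + \frac{2\hbar^2 e}{3\mu^2c^2}\,(\nabla^2\varphi)\left(\ln\frac{\mu c}{2\hbar k_{\min}} + \frac{5}{8}\right)\right], \tag{19}$$

where $\varphi=e/r$, $r$ being the distance to the proton,
and we have used the relation

$$\ln\lambda_{\min} = \ln(2k_{\min}) - 1.$$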


The first term insures that the fine structure
separation correction will be that expected from
the change in the electron’s magnetic moment.
The second may be combined with Bethe’s non-relativistic
calculation for quanta below $k_{\min}$.¹⁴


APPLICATION TO OTHER PROCESSES 


The important problem of verifying that the 
self-energy will not diverge in higher-order ap- 
proximations has not been carried to completion. 
It appears unlikely that trouble will arise here. 
If that is true the model probably gives sensible 
answers to all problems of quantum electro- 
dynamics other than those involving Uehling 
polarization effects, discussed below. It has been 
found to give finite self-mass if we have, instead 
of a vector field, a scalar field or a pseudoscalar 
field, coupled to the electron in the simplest way 
possible without gradient operators. If the field
quanta have mass $M$, $g(\omega^2-k^2)$ is replaced by
$g(\omega^2-k^2-M^2)$, and the values of $\lambda$ of importance
are chosen to be large compared to $M$.

14 Using Eq. (18), Professor Bethe finds 1050 megacycles
for the separation between $2^2S_{1/2}$ and $2^2P_{1/2}$ in hydrogen.
(Solvay Report.)

The results for electrodynamics, then, after
mass renormalization, depend only slightly on
the form of $G(\lambda)$ and the size of $\lambda_0$. Since $\lambda_0$ may
be taken to be extremely large without spoiling
the smallness of $\Delta\mu$, there would appear to be
good reason to drop the dependence on $\lambda$
altogether. Thus the $G(\lambda)$ appears only as a
complicated scaffold which is removed after the
calculation is done.

On the other hand, electrodynamics probably
does break down somewhere and it is interesting
to keep the terms in $\lambda$ for various phenomena to
see if one might be selected which is particularly
sensitive to $\lambda$. This phenomenon would then be a
promising one to study experimentally. The
Møller interaction between two electrons is
modified by the present theory. There is, of
course, the radiative correction, but in addition
to that there is simply a change due to the change
in the density function for the quanta which can
be exchanged. The Møller interaction ordinarily
is proportional to $1/q^2$, where $q$ is the magnitude
of the momentum transferred from one electron
to the other in the center of gravity system. The
modification is only that this factor is changed to
$\int_0^\infty\bigl(1/q^2-1/(q^2+\lambda^2)\bigr)G(\lambda)\,d\lambda$. This represents a
decrease in cross section for hard collisions. If
$\lambda$ is of order $137\mu c^2$, we would need electrons in
the center of gravity system of roughly 30 Mev
to find a strong effect. This corresponds, however,
to bombardment of stationary electrons by electrons
of 3½ Bev.
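
The kinematics quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the electron rest energy is taken as 0.511 Mev, the weight $G(\lambda)$ is assumed to be concentrated at a single value $\lambda_0 = 137\mu c^2$, and the momentum transfer is evaluated for 90° scattering in the center of gravity system. None of these choices is specified in the text.

```python
import math

M_E = 0.511  # electron rest energy mu*c^2 in MeV (assumed value)

def lab_energy(e_cm):
    """Total lab energy (MeV) of an electron striking a stationary electron,
    given each electron's total energy e_cm (MeV) in the center of gravity
    system: E_lab = 2 E_cm^2 / (mu c^2) - mu c^2."""
    return 2.0 * e_cm**2 / M_E - M_E

def moller_factor_ratio(q, lam0):
    """Ratio of the modified factor 1/q^2 - 1/(q^2 + lam0^2) to the ordinary
    1/q^2, ASSUMING G(lambda) is concentrated at lambda = lam0."""
    return lam0**2 / (q**2 + lam0**2)

e_cm = 30.0                              # MeV, from the text
lam0 = 137.0 * M_E                       # MeV, "of order 137 mu c^2"
p_cm = math.sqrt(e_cm**2 - M_E**2)       # momentum of each electron
q_90 = math.sqrt(2.0) * p_cm             # momentum transfer at 90 degrees (assumption)

e_lab = lab_energy(e_cm)
ratio = moller_factor_ratio(q_90, lam0)
print(f"lab energy ~ {e_lab / 1000:.2f} Bev, factor ratio ~ {ratio:.2f}")
```

Under these assumptions the lab energy comes out near 3.5 Bev, and the exchange factor is reduced by roughly a quarter at 90°.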

It is interesting to note that the Møller interaction
can be viewed as simply a correction to
self-energy due to the exclusion principle. The
self-energy of two electrons, 1 and 2, is not the
sum of the self-energy of each, for one of the
virtual states that 2 could ordinarily enter by
emission of a quantum is now occupied by 1. The
difference between the self-energy of two electrons
and the sum of the self-energy of each
separately comes out to be just their interaction
energy.

15 A more promising way to obtain processes with high
momentum transfer would be wide-angle scattering of
electrons from nuclei. But here deviations from expectations
might be associated with uncertainties in the nuclear
charge distributions rather than electrodynamics. Very
wide angle pair production is a phenomenon which does
occur for high energy incident γ-rays with large momentum
transfer in a region not too close to the nucleus.


VACUUM POLARIZATION, ALTERNATIVE 
CUT-OFF PROCEDURES 

In the above calculation, terms of the type
discussed by Uehling¹⁶ have been omitted. These
terms represent processes involving a pair production
followed by annihilation of the same
pair. For example, a pair produced by the potential
may annihilate again emitting a quantum.
This quantum is then absorbed by the electron
in state 1 transferring it to state 2. These terms
are infinite and are not made convergent by the
present scheme. There is some point, nevertheless,
to solving problems at first without taking
them into account. This is because their net
effect is only to alter the effective potential in
which the electron finds itself, for it may be
scattered either directly or by the quantum
produced by the Uehling terms. That is, if this
problem of polarization of the vacuum is solved
it will mean, if there is any effect, simply that
the potential $\mathbf{A}$, $\varphi$ appearing in the Dirac equation
and (to high order) in such terms as (18)
should be replaced by new “polarized” potentials
$\mathbf{A}'$, $\varphi'$.

These polarization terms can be characterized 
in a relativistically invariant manner. All the 
terms which have been calculated above contain 
matrix elements of operators between states in a 
sequence such as 1 to f, f to g, g to 2. The omitted
polarization terms contain transitions like f to g, 
g to f, 1 to 2. For higher order processes the 
polarization terms are those which do not contain 
a continued sequence of transitions from the 
initial to the final state. 

The polarization terms are not affected in any
helpful way by the changes in the density of
quanta. It is likely that this problem will have
its answer in a changed physical viewpoint.
However, there is a simple alternative procedure
to produce finite self-energies which also makes
convergent the integrals appearing in Serber’s¹⁷
treatment of the polarization problem. (Since,
however, this treatment of Serber already presupposes
a partial subtraction procedure of
Heisenberg and Dirac, the situation is not so
clear here as in the self-energy problem.)

16 E. A. Uehling, Phys. Rev. 48, 55 (1935).
17 R. Serber, Phys. Rev. 48, 49 (1935).
From the point of view of coordinate space,
the reason that the electronic self-energy diverges
appears to be this. A virtual light quantum
emitted at one point spreads out as $\delta(t^2-r^2)$ from
the origin. The wave packet of the electron
spreading out after the emission of the quantum
has, as a consequence of Dirac’s equation, a
similar discontinuous value along the light cone.
It is the continued coincidence of these singularities
which makes the matrix element for the
subsequent absorption of the quantum infinite.
The method outlined above of changing $\delta(\omega^2-k^2)$
to $g(\omega^2-k^2)$ has the effect of changing $\delta(t^2-r^2)$
to $f(t^2-r^2)$ where $f(s^2)$ is everywhere finite and
goes to zero rapidly for $|s^2|\gg 1/\lambda_0^2$. The quanta
have been moved away from the electrons so that
overlap on the light cone is reduced.

An obvious alternative procedure is to move
the electron wave function away from the quanta.
This is easily done in a very similar manner. We
assume the density of electron states of energy $E$,
momentum $\mathbf{P}$ to be $g(E^2-P^2-\mu^2)$ rather than
$\delta(E^2-P^2-\mu^2)$.¹⁸ The quanta are conventional,
$\omega=k$, density $d\mathbf{k}/k$. The self-energy integrals (2)
can, of course, be expressed as an integral over
the intermediate state momentum $\mathbf{P}_f$, rather than
$\mathbf{k}$. Replacing $d\mathbf{P}_f/E_f$ by $g(E_f^2-P_f^2-\mu^2)\,dE_f\,d\mathbf{P}_f$,
we find


e 
aE = —— f o(Ep— Pj —p)dEaP, 
wv 


Ey | (OfasAyta;|0) (OjasAy~a,| 0) 
Tl Eytk—-Ey  Eyt+h+Es 


k 
18 This is seen to be essentially the method proposed
by Wataghin. G. Wataghin, Zeits. f. Physik 88, 92 (1934).
where $k=|\mathbf{P}_f-\mathbf{P}_0|$, $E_0=(\mu^2+P_0^2)^{1/2}$. The projection
operators are unchanged since it is only
the density of states which we wish to alter. They
are still $\Lambda_f^{\pm}=(E_f\pm\boldsymbol{\alpha}\cdot\mathbf{P}_f\pm\beta\mu)/2E_f$. The result
of this calculation is to verify that $\Delta E'$ is finite
(depending logarithmically on $\lambda_0$). The other
problems can be analyzed in the same way.

In the problem of polarization of the vacuum,
the wave functions of both electron and positron
ordinarily spread with a singularity on the light
cone. The matrix element for their subsequent
annihilation is therefore infinite. With the modification
here described these wave functions are
made less singular and their overlap integral is
finite. The polarization integrals in Serber’s
article¹⁷ may now be integrated to yield finite
results.

Other than terms which might be removed by
a small renormalization of charge (depending
logarithmically on $\lambda_0$), the net effect in (17)
would be to change the $-(3/8)$ in the last term of
(17) to $-(3/8)-(1/5)$. However, the real existence
of such polarization corrections is, in the author’s
view, uncertain. These matters will be discussed
in much more detail in future publications. Also
reserved for future publication is a more complete
physical theory from which the results
reported here may be directly deduced. It yields
much more powerful techniques for setting up
problems and performing the required integrations.

The author would like to express his gratitude 
to Mr. P. V. C. Hough for assistance in the 
calculations and to Professor H. A. Bethe and 
Dr. F. Dyson and many others for useful dis- 
cussions. 
The problem of the behavior of positrons and electrons in given
external potentials, neglecting their mutual interaction, is analyzed
by replacing the theory of holes by a reinterpretation of the solutions
of the Dirac equation. It is possible to write down a complete
solution of the problem in terms of boundary conditions on the
wave function, and this solution contains automatically all the
possibilities of virtual (and real) pair formation and annihilation
together with the ordinary scattering processes, including the
correct relative signs of the various terms.

In this solution, the “negative energy states” appear in a form
which may be pictured (as by Stückelberg) in space-time as waves
traveling away from the external potential backwards in time.
Experimentally, such a wave corresponds to a positron approaching
the potential and annihilating the electron. A particle moving
forward in time (electron) in a potential may be scattered forward
in time (ordinary scattering) or backward (pair annihilation).
When moving backward (positron) it may be scattered backward
in time (positron scattering) or forward (pair production). For
such a particle the amplitude for transition from an initial to a
final state is analyzed to any order in the potential by considering
it to undergo a sequence of such scatterings.

The amplitude for a process involving many such particles is
the product of the transition amplitudes for each particle. The
exclusion principle requires that antisymmetric combinations of
amplitudes be chosen for those complete processes which differ
only by exchange of particles. It seems that a consistent interpretation
is only possible if the exclusion principle is adopted. The
exclusion principle need not be taken into account in intermediate
states. Vacuum problems do not arise for charges which do not
interact with one another, but these are analyzed nevertheless in
anticipation of application to quantum electrodynamics.

The results are also expressed in momentum-energy variables.
Equivalence to the second quantization theory of holes is proved
in an appendix.


1. INTRODUCTION


THIS is the first of a set of papers dealing with the
solution of problems in quantum electrodynamics.
The main principle is to deal directly with the solutions
to the Hamiltonian differential equations rather than
with these equations themselves. Here we treat simply
the motion of electrons and positrons in given external
potentials. In a second paper we consider the interactions
of these particles, that is, quantum electrodynamics.

The problem of charges in a fixed potential is usually
treated by the method of second quantization of the
electron field, using the ideas of the theory of holes.
Instead we show that by a suitable choice and interpretation
of the solutions of Dirac’s equation the problem
may be equally well treated in a manner which is
fundamentally no more complicated than Schrödinger’s
method of dealing with one or more particles. The various
creation and annihilation operators in the conventional
electron field view are required because the
number of particles is not conserved, i.e., pairs may be
created or destroyed. On the other hand charge is
conserved which suggests that if we follow the charge,
not the particle, the results can be simplified.

In the approximation of classical relativistic theory
the creation of an electron pair (electron A, positron B)
might be represented by the start of two world lines
from the point of creation, 1. The world lines of the
positron will then continue until it annihilates another
electron, C, at a world point 2. Between the times $t_1$
and $t_2$ there are then three world lines, before and after
only one. However, the world lines of C, B, and A
together form one continuous line albeit the “positron
part” B of this continuous line is directed backwards
in time. Following the charge rather than the particles
corresponds to considering this continuous world line
as a whole rather than breaking it up into its pieces.
It is as though a bombardier flying low over a road 
suddenly sees three roads and it is only when two of 
them come together and disappear again that he realizes 
that he has simply passed over a long switchback in a 
single road. 

This over-all space-time point of view leads to con- 
siderable simplification in many problems. One can take 
into account at the same time processes which ordi- 
narily would have to be considered separately. For 
example, when considering the scattering of an electron 
by a potential one automatically takes into account the 
effects of virtual pair productions. The same equation, 
Dirac’s, which describes the deflection of the world line 
of an electron in a field, can also describe the deflection 
(and in just as simple a manner) when it is large enough 
to reverse the time-sense of the world line, and thereby 
correspond to pair annihilation. Quantum mechanically 
the direction of the world lines is replaced by the 
direction of propagation of waves. 

This view is quite different from that of the Hamil- 
tonian method which considers the future as developing 
continuously from out of the past. Here we imagine the 
entire space-time history laid out, and that we just 
become aware of increasing portions of it successively. 
In a scattering problem this over-all view of the com- 
plete scattering process is similar to the S-matrix view- 
point of Heisenberg. The temporal order of events dur- 
ing the scattering, which is analyzed in such detail by 
the Hamiltonian differential equation, is irrelevant. The 
relation of these viewpoints will be discussed much more 
fully in the introduction to the second paper, in which 
the more complicated interactions are analyzed. 

The development stemmed from the idea that in non-relativistic
quantum mechanics the amplitude for a
given process can be considered as the sum of an amplitude
for each space-time path available.¹ In view of the
fact that in classical physics positrons could be viewed
as electrons proceeding along world lines toward the
past (reference 7) the attempt was made to remove, in
the relativistic case, the restriction that the paths must
proceed always in one direction in time. It was discovered
that the results could be even more easily
understood from a more familiar physical viewpoint,
that of scattered waves. This viewpoint is the one used
in this paper. After the equations were worked out
physically the proof of the equivalence to the second
quantization theory was found.²

First we discuss the relation of the Hamiltonian
differential equation to its solution, using for an example
the Schrödinger equation. Next we deal in an analogous
way with the Dirac equation and show how the solutions
may be interpreted to apply to positrons. The
interpretation seems not to be consistent unless the
electrons obey the exclusion principle. (Charges obeying
the Klein-Gordon equations can be described in an
analogous manner, but here consistency apparently
requires Bose statistics.)³ A representation in momentum
and energy variables which is useful for the calculation
of matrix elements is described. A proof of the
equivalence of the method to the theory of holes in
second quantization is given in the Appendix.

2. GREEN’S FUNCTION TREATMENT OF
SCHRÖDINGER’S EQUATION

We begin by a brief discussion of the relation of the
non-relativistic wave equation to its solution. The ideas
will then be extended to relativistic particles, satisfying
Dirac’s equation, and finally in the succeeding paper to
interacting relativistic particles, that is, quantum
electrodynamics.

The Schrödinger equation

$$i\,\partial\psi/\partial t = H\psi, \tag{1}$$

describes the change in the wave function $\psi$ in an
infinitesimal time $\Delta t$ as due to the operation of an
operator $\exp(-iH\Delta t)$. One can ask also, if $\psi(\mathbf{x}_1,t_1)$ is
the wave function at $\mathbf{x}_1$ at time $t_1$, what is the wave
function at time $t_2>t_1$? It can always be written as

$$\psi(\mathbf{x}_2,t_2)=\int K(\mathbf{x}_2,t_2;\mathbf{x}_1,t_1)\,\psi(\mathbf{x}_1,t_1)\,d^3\mathbf{x}_1, \tag{2}$$

where $K$ is a Green’s function for the linear Eq. (1).
(We have limited ourselves to a single particle of coordinate
$\mathbf{x}$, but the equations are obviously of greater
generality.) If $H$ is a constant operator having eigenvalues
$E_n$, eigenfunctions $\phi_n$ so that $\psi(\mathbf{x},t_1)$ can be expanded
as $\sum_n C_n\phi_n(\mathbf{x})$, then
$\psi(\mathbf{x},t_2)=\sum_n\exp(-iE_n(t_2-t_1))\,C_n\phi_n(\mathbf{x})$.
Since $C_n=\int\phi_n^*(\mathbf{x}_1)\psi(\mathbf{x}_1,t_1)\,d^3\mathbf{x}_1$, one finds
(where we write 1 for $\mathbf{x}_1,t_1$ and 2 for $\mathbf{x}_2,t_2$) in this case

$$K(2,1)=\sum_n\phi_n(\mathbf{x}_2)\phi_n^*(\mathbf{x}_1)\exp(-iE_n(t_2-t_1)), \tag{3}$$

for $t_2>t_1$. We shall find it convenient for $t_2<t_1$ to define
$K(2,1)=0$ (Eq. (2) is then not valid for $t_2<t_1$). It is
then readily shown that in general $K$ can be defined by
that solution of

$$(i\,\partial/\partial t_2-H_2)K(2,1)=i\delta(2,1), \tag{4}$$

which is zero for $t_2<t_1$, where
$\delta(2,1)=\delta(t_2-t_1)\delta(x_2-x_1)\delta(y_2-y_1)\delta(z_2-z_1)$
and the subscript 2 on $H_2$ means
that the operator acts on the variables of 2 of $K(2,1)$.
When $H$ is not constant, (2) and (4) are valid but $K$ is
less easy to evaluate than (3).⁴

1 R. P. Feynman, Rev. Mod. Phys. 20, 367 (1948).

2 The equivalence of the entire procedure (including photon
interactions) with the work of Schwinger and Tomonaga has been
demonstrated by F. J. Dyson, Phys. Rev. 75, 486 (1949).

3 These are special examples of the general relation of spin and
statistics deduced by W. Pauli, Phys. Rev. 58, 716 (1940).
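
As an illustration of Eqs. (2) and (3), the sketch below builds the kernel from the eigenfunctions of a small matrix Hamiltonian. The lattice Hamiltonian (nearest-neighbor hopping plus a weak linear potential), the grid size, and the function name `kernel` are all assumptions made for the example and do not come from the text. Positions become matrix indices, so the integral over $\mathbf{x}_1$ in (2) becomes a matrix-vector product.

```python
import numpy as np

n = 30
# Hypothetical constant Hamiltonian on a lattice (an assumption for illustration).
H = -(np.eye(n, k=1) + np.eye(n, k=-1)) + np.diag(0.05 * np.arange(n))
E, phi = np.linalg.eigh(H)          # phi[:, k] is the eigenfunction with energy E[k]

def kernel(t2, t1):
    """K(2,1) of Eq. (3) as a matrix in (x2, x1); defined to be zero for t2 < t1."""
    if t2 < t1:
        return np.zeros((n, n), dtype=complex)
    # sum_n phi_n(x2) phi_n*(x1) exp(-i E_n (t2 - t1))
    return (phi * np.exp(-1j * E * (t2 - t1))) @ phi.conj().T

# Wave function localized on one site at t1 = 1.
psi1 = np.zeros(n, dtype=complex)
psi1[n // 2] = 1.0

# Eq. (2): propagate directly from t = 1 to t = 3, or through an intermediate time.
psi3_direct = kernel(3.0, 1.0) @ psi1
psi3_two_step = kernel(3.0, 2.0) @ (kernel(2.0, 1.0) @ psi1)
print(np.max(np.abs(psi3_direct - psi3_two_step)))
```

The two results should agree to rounding error, and the kernel at equal times reduces to the unit matrix, the lattice analogue of the delta function in (4).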

We can call $K(2,1)$ the total amplitude for arrival
at $\mathbf{x}_2,t_2$ starting from $\mathbf{x}_1,t_1$. (It results from adding an
amplitude, $\exp iS$, for each space time path between these
points, where $S$ is the action along the path.¹) The
transition amplitude for finding a particle in state
$\chi(\mathbf{x}_2,t_2)$ at time $t_2$, if at $t_1$ it was in $\psi(\mathbf{x}_1,t_1)$, is

$$\int\chi^*(2)K(2,1)\psi(1)\,d^3\mathbf{x}_1\,d^3\mathbf{x}_2. \tag{5}$$

A quantum mechanical system is described equally well
by specifying the function $K$, or by specifying the
Hamiltonian $H$ from which it results. For some purposes
the specification in terms of $K$ is easier to use and
visualize. We desire eventually to discuss quantum
electrodynamics from this point of view.

To gain a greater familiarity with the $K$ function and
the point of view it suggests, we consider a simple
perturbation problem. Imagine we have a particle in
a weak potential $U(\mathbf{x},t)$, a function of position and
time. We wish to calculate $K(2,1)$ if $U$ differs from
zero only for $t$ between $t_1$ and $t_2$. We shall expand $K$ in
increasing powers of $U$:

$$K(2,1)=K_0(2,1)+K^{(1)}(2,1)+K^{(2)}(2,1)+\cdots. \tag{6}$$

To zero order in $U$, $K$ is that for a free particle, $K_0(2,1)$.⁴
To study the first order correction $K^{(1)}(2,1)$, first consider
the case that $U$ differs from zero only for the
infinitesimal time interval $\Delta t_3$ between some time $t_3$
and $t_3+\Delta t_3$ $(t_1<t_3<t_2)$. Then if $\psi(1)$ is the wave function
at $\mathbf{x}_1,t_1$, the wave function at $\mathbf{x}_3,t_3$ is

$$\psi(3)=\int K_0(3,1)\psi(1)\,d^3\mathbf{x}_1, \tag{7}$$


since from $t_1$ to $t_3$ the particle is free. For the short
interval $\Delta t_3$ we solve (1) as

$$\psi(\mathbf{x},t_3+\Delta t_3)=\exp(-iH\Delta t_3)\psi(\mathbf{x},t_3)=(1-iH_0\Delta t_3-iU\Delta t_3)\psi(\mathbf{x},t_3),$$

where we put $H=H_0+U$, $H_0$ being the Hamiltonian
of a free particle. Thus $\psi(\mathbf{x},t_3+\Delta t_3)$ differs from
what it would be if the potential were zero (namely
$(1-iH_0\Delta t_3)\psi(\mathbf{x},t_3)$) by the extra piece

$$\Delta\psi=-iU(\mathbf{x}_3,t_3)\cdot\psi(\mathbf{x}_3,t_3)\Delta t_3, \tag{8}$$

which we shall call the amplitude scattered by the
potential. The wave function at 2 is given by

$$\psi(\mathbf{x}_2,t_2)=\int K_0(\mathbf{x}_2,t_2;\mathbf{x}_3,t_3+\Delta t_3)\,\psi(\mathbf{x}_3,t_3+\Delta t_3)\,d^3\mathbf{x}_3,$$

since after $t_3+\Delta t_3$ the particle is again free. Therefore
the change in the wave function at 2 brought about by
the potential is (substitute (7) into (8) and (8) into
the equation for $\psi(\mathbf{x}_2,t_2)$):

$$\Delta\psi(2)=-i\int K_0(2,3)U(3)K_0(3,1)\psi(1)\,d^3\mathbf{x}_1\,d^3\mathbf{x}_3\,\Delta t_3.$$

4 For a non-relativistic free particle, where $\phi_n=\exp(i\mathbf{p}\cdot\mathbf{x})$,
$E_n=p^2/2m$, (3) gives, as is well known

$$K_0(2,1)=\int\exp\bigl[-(i\mathbf{p}\cdot\mathbf{x}_1-i\mathbf{p}\cdot\mathbf{x}_2)-ip^2(t_2-t_1)/2m\bigr]\,d^3\mathbf{p}\,(2\pi)^{-3}
=(2\pi i m^{-1}(t_2-t_1))^{-3/2}\exp\bigl(\tfrac{1}{2}im(\mathbf{x}_2-\mathbf{x}_1)^2(t_2-t_1)^{-1}\bigr)$$

for $t_2>t_1$, and $K_0=0$ for $t_2<t_1$.


In the case that the potential exists for an extended
time, it may be looked upon as a sum of effects from
each interval $\Delta t_3$ so that the total effect is obtained by
integrating over $t_3$ as well as $\mathbf{x}_3$. From the definition (2)
of $K$ then, we find

$$K^{(1)}(2,1)=-i\int K_0(2,3)U(3)K_0(3,1)\,d\tau_3, \tag{9}$$

where the integral can now be extended over all space
and time, $d\tau_3=d^3\mathbf{x}_3\,dt_3$. Automatically there will be no
contribution if $t_3$ is outside the range $t_1$ to $t_2$ because of
our definition, $K_0(2,1)=0$ for $t_2<t_1$.

We can understand the result (6), (9) this way. We
can imagine that a particle travels as a free particle
from point to point, but is scattered by the potential $U$.
Thus the total amplitude for arrival at 2 from 1 can
be considered as the sum of the amplitudes for various
alternative routes. It may go directly from 1 to 2
(amplitude $K_0(2,1)$, giving the zero order term in (6)).
Or (see Fig. 1(a)) it may go from 1 to 3 (amplitude
$K_0(3,1)$), get scattered there by the potential (scattering
amplitude $-iU(3)$ per unit volume and time) and
then go from 3 to 2 (amplitude $K_0(2,3)$). This may
occur for any point 3 so that summing over these
alternatives gives (9).

Again, it may be scattered twice by the potential
(Fig. 1(b)). It goes from 1 to 3 ($K_0(3,1)$), gets scattered
there ($-iU(3)$) then proceeds to some other point, 4,
in space time (amplitude $K_0(4,3)$) is scattered again
($-iU(4)$) and then proceeds to 2 ($K_0(2,4)$). Summing
over all possible places and times for 3, 4 find that the
second order contribution to the total amplitude
$K^{(2)}(2,1)$ is

$$(-i)^2\int\!\!\int K_0(2,4)U(4)K_0(4,3)U(3)K_0(3,1)\,d\tau_3\,d\tau_4. \tag{10}$$

This can be readily verified directly from (1) just as (9)
was. One can in this way obviously write down any of
the terms of the expansion (6).⁵

[Figure 1: space-time diagrams of incident waves scattered by the potential, with an axis labeled SPACE; panels (a) first order and (b) second order.]

FIG. 1. The Schrödinger (and Dirac) equation can be visualized
as describing the fact that plane waves are scattered successively
by a potential. Figure 1(a) illustrates the situation in first order.
$K_0(2,3)$ is the amplitude for a free particle starting at point 3
to arrive at 2. The shaded region indicates the presence of the
potential $A$ which scatters at 3 with amplitude $-iA(3)$ per
cm³ sec (Eq. (9)). In (b) is illustrated the second order process
(Eq. (10)), the waves scattered at 3 are scattered again at 4. However,
in Dirac one-electron theory $K_0(4,3)$ would represent electrons
both of positive and of negative energies proceeding from
3 to 4. This is remedied by choosing a different scattering kernel
$K_+(4,3)$, Fig. 2.
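
The expansion (6) with the terms (9) and (10) can be tried numerically. The following is a minimal sketch under stated assumptions, not the procedure of the paper: space is replaced by a six-site lattice, $H_0$ is a hypothetical hopping matrix, $U$ is a weak time-independent diagonal potential acting between $t_1=0$ and $t_2=T$, and the time integrals are approximated by a midpoint rule. The function names are invented for the example.

```python
import numpy as np

def free_kernel(H0, t):
    """K0 for elapsed time t, i.e. exp(-i H0 t), from the eigenvectors of H0."""
    E, V = np.linalg.eigh(H0)
    return (V * np.exp(-1j * E * t)) @ V.conj().T

def born_terms(H0, U, T, steps=60):
    """Midpoint-rule estimates of K0, K^(1) (Eq. 9) and K^(2) (Eq. 10)."""
    h = T / steps
    t = (np.arange(steps) + 0.5) * h
    into = [free_kernel(H0, tk) for tk in t]                  # K0(3,1)
    out = [free_kernel(H0, T - tk) for tk in t]               # K0(2,3) or K0(2,4)
    between = [free_kernel(H0, k * h) for k in range(steps)]  # K0(4,3)
    K1 = np.zeros_like(into[0])
    K2 = np.zeros_like(into[0])
    for j in range(steps):
        # first order: free to 3, scatter (-iU), free to 2
        K1 += -1j * h * out[j] @ U @ into[j]
        # second order, t4 and t3 in the same cell: half the cell has t4 > t3
        K2 += -0.5 * h * h * out[j] @ U @ U @ into[j]
        # second order, t3 in an earlier cell than t4; (-i)^2 = -1
        for i in range(j):
            K2 += -h * h * out[j] @ U @ between[j - i] @ U @ into[i]
    return free_kernel(H0, T), K1, K2

n = 6
H0 = -(np.eye(n, k=1) + np.eye(n, k=-1))        # hypothetical free Hamiltonian
U = 0.2 * np.diag(np.linspace(0.0, 1.0, n))     # hypothetical weak potential
T = 1.0

K_exact = free_kernel(H0 + U, T)                # full kernel for H = H0 + U
K0, K1, K2 = born_terms(H0, U, T)
err0 = np.linalg.norm(K_exact - K0, 2)
err1 = np.linalg.norm(K_exact - K0 - K1, 2)
err2 = np.linalg.norm(K_exact - K0 - K1 - K2, 2)
print(err0, err1, err2)
```

Each added term should reduce the distance to the exact kernel, with the residual after the second-order term of third order in the strength of $U$.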


3. TREATMENT OF THE DIRAC EQUATION 


We shall now extend the method of the last section
to apply to the Dirac equation. All that would seem
to be necessary in the previous equations is to consider
$H$ as the Dirac Hamiltonian, $\psi$ as a symbol with four
indices (for each particle). Then $K_0$ can still be defined
by (3) or (4) and is now a 4×4 matrix which operating
on the initial wave function, gives the final wave function.
In (10), $U(3)$ can be generalized to $A_4(3)-\boldsymbol{\alpha}\cdot\mathbf{A}(3)$
where $A_4$, $\mathbf{A}$ are the scalar and vector potential (times $e$,
the electron charge) and $\boldsymbol{\alpha}$ are Dirac matrices.

To discuss this we shall define a convenient relativistic
notation. We represent four-vectors like $\mathbf{x},t$ by
a symbol $x_\mu$, where $\mu=1,2,3,4$ and $x_4=t$ is real. Thus
the vector and scalar potential (times $e$) $\mathbf{A},A_4$ is $A_\mu$.
The four matrices $\beta\boldsymbol{\alpha},\beta$ can be considered as transforming
as a four vector $\gamma_\mu$ (our $\gamma_\mu$ differs from Pauli’s by a
factor $i$ for $\mu=1,2,3$). We use the summation convention
$a_\mu b_\mu=a_4b_4-a_1b_1-a_2b_2-a_3b_3=a\cdot b$. In particular if
$a_\mu$ is any four vector (but not a matrix) we write
$\boldsymbol{a}=a_\mu\gamma_\mu$ so that $\boldsymbol{a}$ is a matrix associated with a vector
($a$ will often be used in place of $a_\mu$ as a symbol for the
vector). The $\gamma_\mu$ satisfy $\gamma_\mu\gamma_\nu+\gamma_\nu\gamma_\mu=2\delta_{\mu\nu}$ where $\delta_{44}=+1$,
$\delta_{11}=\delta_{22}=\delta_{33}=-1$, and the other $\delta_{\mu\nu}$ are zero. As a
consequence of our summation convention $\delta_{\mu\nu}a_\nu=a_\mu$
and $\delta_{\mu\mu}=4$. Note that $\boldsymbol{a}\boldsymbol{b}+\boldsymbol{b}\boldsymbol{a}=2a\cdot b$ and that
$\boldsymbol{a}^2=a_\mu a_\mu=a\cdot a$ is a pure number. The symbol $\partial/\partial x_\mu$ will mean
$\partial/\partial t$ for $\mu=4$, and $-\partial/\partial x$, $-\partial/\partial y$, $-\partial/\partial z$ for $\mu=1$,
2, 3. Call $\boldsymbol{\nabla}=\gamma_\mu\,\partial/\partial x_\mu=\beta\,\partial/\partial t+\beta\boldsymbol{\alpha}\cdot\nabla$. We shall imagine
hereafter, purely for relativistic convenience, that $\phi_n^*$
in (3) is replaced by its adjoint $\bar\phi_n=\phi_n^*\beta$.

5 We are simply solving by successive approximations an integral
equation (deducible directly from (1) with $H=H_0+U$ and (4)
with $H=H_0$),

$$\psi(2)=-i\int K_0(2,3)U(3)\psi(3)\,d\tau_3+\int K_0(2,1)\psi(1)\,d^3\mathbf{x}_1,$$

where the first integral extends over all space and all times
greater than the $t_1$ appearing in the second term, and $t_2>t_1$.

FIG. 2. The Dirac equation permits another solution $K_+(2,1)$
if one considers that waves scattered by the potential can proceed
backwards in time as in Fig. 2(a). This is interpreted in the second
order processes (b), (c), by noting that there is now the possibility
(c) of virtual pair production at 4, the positron going to 3
to be annihilated. This can be pictured as similar to ordinary
scattering (b) except that the electron is scattered backwards in
time from 3 to 4. The waves scattered from 3 to 2′ in (a) represent
the possibility of a positron arriving at 3 from 2′ and annihilating
the electron from 1. This view is proved equivalent to hole theory:
electrons traveling backwards in time are recognized as positrons.
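
The algebra of this notation can be verified with explicit matrices. The sketch below assumes the standard Dirac representation of $\beta$ and $\boldsymbol{\alpha}$ (the text does not fix a representation), stores four-vectors as arrays ordered $(a_1,a_2,a_3,a_4)$, and uses the hypothetical helper names `dot` and `slash`.

```python
import numpy as np

# Standard Dirac representation (an assumption; any representation would do).
I2, Z2 = np.eye(2), np.zeros((2, 2))
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
beta = np.block([[I2, Z2], [Z2, -I2]]).astype(complex)
alpha = [np.block([[Z2, s], [s, Z2]]) for s in pauli]

# gamma_1..gamma_3 = beta * alpha, gamma_4 = beta; stored as gamma[0..3].
gamma = [beta @ a for a in alpha] + [beta]
delta = np.diag([-1.0, -1.0, -1.0, 1.0])   # delta_11 = delta_22 = delta_33 = -1, delta_44 = +1

def dot(a, b):
    """a.b = a4 b4 - a1 b1 - a2 b2 - a3 b3 (the signed summation convention)."""
    return a[3] * b[3] - a[0] * b[0] - a[1] * b[1] - a[2] * b[2]

def slash(a):
    """The matrix a_mu gamma_mu associated with the four-vector a."""
    return a[3] * gamma[3] - a[0] * gamma[0] - a[1] * gamma[1] - a[2] * gamma[2]

a = np.array([0.3, -1.2, 0.5, 2.0])
b = np.array([1.1, 0.4, -0.7, 0.9])
anticommutator = slash(a) @ slash(b) + slash(b) @ slash(a)
print(np.allclose(anticommutator, 2 * dot(a, b) * np.eye(4)))
```

The same helpers confirm that the square of the matrix associated with $a$ is the pure number $a\cdot a$ times the unit matrix.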

Thus the Dirac equation for a particle, mass $m$, in an
external field $\boldsymbol{A}=A_\mu\gamma_\mu$ is

$$(i\boldsymbol{\nabla}-m)\psi=\boldsymbol{A}\psi, \tag{11}$$

and Eq. (4) determining the propagation of a free
particle becomes

$$(i\boldsymbol{\nabla}_2-m)K_+(2,1)=i\delta(2,1), \tag{12}$$

the index 2 on $\boldsymbol{\nabla}_2$ indicating differentiation with respect
to the coordinates $x_{2\mu}$ which are represented as 2 in
$K_+(2,1)$ and $\delta(2,1)$.

The function $K_+(2,1)$ is defined in the absence of a
field. If a potential $\boldsymbol{A}$ is acting a similar function, say
$K_+^{(A)}(2,1)$ can be defined. It differs from $K_+(2,1)$ by a
first order correction given by the analogue of (9)
namely

$$K_+^{(1)}(2,1)=-i\int K_+(2,3)\boldsymbol{A}(3)K_+(3,1)\,d\tau_3, \tag{13}$$

representing the amplitude to go from 1 to 3 as a free
particle, get scattered there by the potential (now the
matrix $\boldsymbol{A}(3)$ instead of $U(3)$) and continue to 2 as free.
The second order correction, analogous to (10) is

$$K_+^{(2)}(2,1)=-\int\!\!\int K_+(2,4)\boldsymbol{A}(4)K_+(4,3)\boldsymbol{A}(3)K_+(3,1)\,d\tau_4\,d\tau_3, \tag{14}$$

and so on. In general $K_+^{(A)}$ satisfies

$$(i\boldsymbol{\nabla}_2-\boldsymbol{A}(2)-m)K_+^{(A)}(2,1)=i\delta(2,1), \tag{15}$$

and the successive terms (13), (14) are the power series
expansion of the integral equation

$$K_+^{(A)}(2,1)=K_+(2,1)-i\int K_+(2,3)\boldsymbol{A}(3)K_+^{(A)}(3,1)\,d\tau_3, \tag{16}$$

which it also satisfies.

We would now expect to choose, for the special solution
of (12), $K_+=K_0$ where $K_0(2,1)$ vanishes for $t_2<t_1$
and for $t_2>t_1$ is given by (3) where $\phi_n$ and $E_n$ are the
eigenfunctions and energy values of a particle satisfying
Dirac’s equation, and $\phi_n^*$ is replaced by $\bar\phi_n$.

The formulas arising from this choice, however, suffer
from the drawback that they apply to the one electron
theory of Dirac rather than to the hole theory of the
positron. For example, consider as in Fig. 1(a) an
electron after being scattered by a potential in a small
region 3 of space time. The one electron theory says
(as does (3) with $K_+=K_0$) that the scattered amplitude
at another point 2 will proceed toward positive times
with both positive and negative energies, that is with
both positive and negative rates of change of phase. No
wave is scattered to times previous to the time of
scattering. These are just the properties of $K_0(2,3)$.

On the other hand, according to the positron theory
negative energy states are not available to the electron
after the scattering. Therefore the choice $K_+ = K_0$ is
unsatisfactory. But there are other solutions of (12).
We shall choose the solution defining $K_+(2, 1)$ so that
$K_+(2, 1)$ for $t_2 > t_1$ is the sum of (3) over positive energy
states only. Now this new solution must satisfy (12) for
all times in order that the representation be complete.
It must therefore differ from the old solution $K_0$ by a
solution of the homogeneous Dirac equation. It is clear
from the definition that the difference $K_0 - K_+$ is the
sum of (3) over all negative energy states, as long as
$t_2 > t_1$. But this difference must be a solution of the
homogeneous Dirac equation for all times and must
therefore be represented by the same sum over negative
energy states also for $t_2 < t_1$. Since $K_0 = 0$ in this case,
it follows that our new kernel, $K_+(2, 1)$, for $t_2 < t_1$ is the
negative of the sum (3) over negative energy states. That is,


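$$\begin{aligned}
K_+(2, 1) &= \sum_{\mathrm{POS}\ E_n} \phi_n(2)\,\bar\phi_n(1)\exp(-iE_n(t_2 - t_1)) && \text{for } t_2 > t_1 \\
&= -\sum_{\mathrm{NEG}\ E_n} \phi_n(2)\,\bar\phi_n(1)\exp(-iE_n(t_2 - t_1)) && \text{for } t_2 < t_1.
\end{aligned} \tag{17}$$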


With this choice of $K_+$ our equations such as (13) and
(14) will now give results equivalent to those of the 
positron hole theory. 
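
The argument that $K_0 - K_+$ must be a homogeneous solution for all times can be illustrated with a toy model. The sketch below assumes a single mode with just two energy levels, $+E$ and $-E$, and drops the spinor factors $\phi_n(2)\bar\phi_n(1)$, so each kernel is reduced to a pair of time-dependent coefficients (positive-energy part, negative-energy part). The value of `E` and all function names are illustrative assumptions.

```python
import cmath

E = 1.3  # magnitude of the energy of the toy mode (arbitrary illustrative value)

def k0(t):
    """One-electron kernel: both energies, forward in time only."""
    if t > 0:
        return (cmath.exp(-1j * E * t), cmath.exp(+1j * E * t))
    return (0j, 0j)

def k_plus(t):
    """Kernel patterned on Eq. (17): positive energy for t > 0,
    minus the negative-energy term for t < 0."""
    if t > 0:
        return (cmath.exp(-1j * E * t), 0j)
    # For the negative level E_n = -E, so exp(-i E_n t) = exp(+i E t).
    return (0j, -cmath.exp(+1j * E * t))

def difference(t):
    a, b = k0(t), k_plus(t)
    return (a[0] - b[0], a[1] - b[1])

for t in (-2.0, -0.5, 0.7, 3.0):
    print(t, difference(t))
```

For every sampled time, before or after zero, the difference is the same free oscillation exp(+iEt) in the negative-energy component, which is what a homogeneous solution looks like in this toy model. For t < 0 the negative-energy part of `k_plus` is -exp(-iE|t|), so its phase advances with |t| as if the energy were +E.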

That (14), for example, is the correct second order 
expression for finding at 2 an electron originally at 1 
according to the positron theory may be seen as follows 
(Fig. 2). Assume as a special example that $t_2 > t_1$ and
that the potential vanishes except in interval $t_2 - t_1$ so
that $t_4$ and $t_3$ both lie between $t_1$ and $t_2$.

First suppose $t_4 > t_3$ (Fig. 2(b)). Then (since $t_3 > t_1$)
the electron assumed originally in a positive energy
state propagates in that state (by $K_+(3, 1)$) to position
3 where it gets scattered (A(3)). It then proceeds to 4,
which it must do as a positive energy electron. This is
correctly described by (14) for $K_+(4, 3)$ contains only
positive energy components in its expansion, as $t_4 > t_3$.
After being scattered at 4 it then proceeds on to 2,
again necessarily in a positive energy state, as $t_2 > t_4$.

In positron theory there is an additional contribution 
due to the possibility of virtual pair production (Fig. 
2(c)). A pair could be created by the potential A(4) 
at 4, the electron of which is that found later at 2. The 
positron (or rather, the hole) proceeds to 3 where it 
annihilates the electron which has arrived there from 1. 

This alternative is already included in (14) as
contributions for which $t_4 < t_3$, and its study will lead us to
an interpretation of $K_+(4, 3)$ for $t_4 < t_3$. The factor
$K_+(2, 4)$ describes the electron (after the pair production
at 4) proceeding from 4 to 2. Likewise $K_+(3, 1)$
represents the electron proceeding from 1 to 3. $K_+(4, 3)$
must therefore represent the propagation of the positron
or hole from 4 to 3. That it does so is clear. The fact
that in hole theory the hole proceeds in the manner of
an electron of negative energy is reflected in the fact
that $K_+(4, 3)$ for $t_4 < t_3$ is (minus) the sum of only
negative energy components. In hole theory the real
energy of these intermediate states is, of course,
positive. This is true here too, since in the phases
$\exp(-iE_n(t_4 - t_3))$ defining $K_+(4, 3)$ in (17), $E_n$ is
negative but so is $t_4 - t_3$. That is, the contributions vary with
$t_3$ as $\exp(-i|E_n|(t_3 - t_4))$ as they would if the energy
of the intermediate state were $|E_n|$. The fact that the
entire sum is taken as negative in computing $K_+(4, 3)$
is reflected in the fact that in hole theory the amplitude
has its sign reversed in accordance with the Pauli
principle and the fact that the electron arriving at 2
has been exchanged with one in the sea.⁶ To this, and
to higher orders, all processes involving virtual pairs
are correctly described in this way.

The expressions such as (14) can still be described as
a passage of the electron from 1 to 3 ($K_+(3, 1)$), scattering
at 3 by A(3), proceeding to 4 ($K_+(4, 3)$), scattering
again, A(4), arriving finally at 2. The scatterings may,
however, be toward both future and past times, an
electron propagating backwards in time being recognized
as a positron.

This therefore suggests that negative energy com- 
ponents created by scattering in a potential be con- 
sidered as waves propagating from the scattering point 
toward the past, and that such waves represent the 
propagation of a positron annihilating the electron in 
the potential.⁷


⁶ It has often been noted that the one-electron theory apparently
gives the same matrix elements for this process as does hole theory.
The problem is one of interpretation, especially in a way that will
also give correct results for other processes, e.g., self-energy.

⁷ The idea that positrons can be represented as electrons with
proper time reversed relative to true time has been discussed by
the author and others, particularly by Stückelberg. E. C. C.
With this interpretation real pair production is also
described correctly (see Fig. 3). For example in (13) if
$t_1 < t_3 < t_2$ the equation gives the amplitude that if at
time $t_1$ one electron is present at 1, then at time $t_2$ just
one electron will be present (having been scattered at 3)
and it will be at 2. On the other hand if $t_2$ is less than $t_3$,
for example, if $t_2 = t_1 < t_3$, the same expression gives the
amplitude that a pair, electron at 1, positron at 2 will
annihilate at 3, and subsequently no particles will be
present. Likewise if $t_2$ and $t_1$ exceed $t_3$ we have (minus)
the amplitude for finding a single pair, electron at 2,
positron at 1 created by A(3) from a vacuum. If
$t_1 > t_3 > t_2$, (13) describes the scattering of a positron.
All these amplitudes are relative to the amplitude that
a vacuum will remain a vacuum, which is taken as
unity. (This will be discussed more fully later.)

The analogue of (2) can be easily worked out.⁸ It is,

$$\psi(2) = \int K_+(2, 1)\,N(1)\,\psi(1)\,d^3V_1, \tag{18}$$

where $d^3V_1$ is the volume element of the closed 3-dimensional
surface of a region of space time containing

FIG. 3. Several different processes can be described by the same
formula depending on the time relations of the variables $t_2$, $t_1$.
Thus $P_v|K_+^{(A)}(2, 1)|^2$ is the probability that: (a) An electron at
1 will be scattered at 2 (and no other pairs form in vacuum).
(b) Electron at 1 and positron at 2 annihilate leaving nothing.
(c) A single pair at 1 and 2 is created from vacuum. (d) A positron
at 2 is scattered to 1. ($K_+^{(A)}(2, 1)$ is the sum of the effects of
scattering in the potential to all orders. $P_v$ is a normalizing
constant.)


Stückelberg, Helv. Phys. Acta 15, 23 (1942); R. P. Feynman,
Phys. Rev. 74, 939 (1948). The fact that classically the action
(proper time) increases continuously as one follows a trajectory
is reflected in quantum mechanics in the fact that the phase, which
is $|E_n||t_2 - t_1|$, always increases as the particle proceeds from one
scattering point to the next.

⁸ By multiplying (12) on the right by $(-i\boldsymbol{\nabla}_1 - m)$ and noting
that $\boldsymbol{\nabla}_1\delta(2, 1) = -\boldsymbol{\nabla}_2\delta(2, 1)$ show that $K_+(2, 1)$ also satisfies
$K_+(2, 1)(-i\boldsymbol{\nabla}_1 - m) = i\delta(2, 1)$, where the $\boldsymbol{\nabla}_1$ operates on variable
1 in $K_+(2, 1)$ but is written after that function to keep the correct
order of the $\gamma$ matrices. Multiply this equation by $\psi(1)$ and Eq.
(11) (with $\boldsymbol{A} = 0$, calling the variables 1) by $K_+(2, 1)$, subtract
and integrate over a region of space-time. The integral on the left-hand
side can be transformed to an integral over the surface of
the region. The right-hand side is $\psi(2)$ if the point 2 lies within
the region, and is zero otherwise. (What happens when the 3-surface
contains a light line and hence has no unique normal need
not concern us as these points can be made to occur so far away
from 2 that their contribution vanishes.)
point 2, and $N(1)$ is $N_\mu(1)\gamma_\mu$ where $N_\mu(1)$ is the inward
drawn unit normal to the surface at the point 1. That
is, the wave function $\psi(2)$ (in this case for a free particle)
is determined at any point inside a four-dimensional
region if its values on the surface of that region
are specified.

To interpret this, consider the case that the 3-surface
consists essentially of all space at some time say $t = 0$
previous to $t_2$, and of all space at the time $T > t_2$. The
cylinder connecting these to complete the closure of the
surface may be very distant from $\mathbf{x}_2$ so that it gives no
appreciable contribution (as $K_+(2, 1)$ decreases exponentially
in space-like directions). Hence, if $\gamma_4 = \beta$, since
the inward drawn normals $N$ will be $\beta$ and $-\beta$,


$$\psi(2) = \int K_+(2, 1)\,\beta\,\psi(1)\,d^3\mathbf{x}_1 - \int K_+(2, 1')\,\beta\,\psi(1')\,d^3\mathbf{x}_1', \tag{19}$$


where $t_1 = 0$, $t_{1'} = T$. Only positive energy (electron)
components in $\psi(1)$ contribute to the first integral and
only negative energy (positron) components of $\psi(1')$ to
the second. That is, the amplitude for finding a charge
at 2 is determined both by the amplitude for finding
an electron previous to the measurement and by the
amplitude for finding a positron after the measurement.
This might be interpreted as meaning that even in a
problem involving but one charge the amplitude for
finding the charge at 2 is not determined when the only
thing known is the amplitude for finding an electron
(or a positron) at an earlier time. There may have been
no electron present initially but a pair was created in
the measurement (or also by other external fields). The
amplitude for this contingency is specified by the
amplitude for finding a positron in the future.

We can also obtain expressions for transition amplitudes,
like (5). For example if at $t = 0$ we have an electron
present in a state with (positive energy) wave
function $f(\mathbf{x})$, what is the amplitude for finding it at
$t = T$ with the (positive energy) wave function $g(\mathbf{x})$?
The amplitude for finding the electron anywhere after
$t = 0$ is given by (19) with $\psi(1)$ replaced by $f(\mathbf{x})$, the
second integral vanishing. Hence, the transition element
to find it in state $g(\mathbf{x})$ is, in analogy to (5), just
($t_2 = T$, $t_1 = 0$)


$$\int \bar g(\mathbf{x}_2)\,\beta\,K_+(2, 1)\,\beta\,f(\mathbf{x}_1)\,d^3\mathbf{x}_1\,d^3\mathbf{x}_2, \tag{20}$$


since $g^* = \bar g\beta$.

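If a potential acts somewhere in the interval between
0 and $T$, $K_+$ is replaced by $K_+^{(A)}$. Thus the first order
effect on the transition amplitude is, from (13),


$$-i\int \bar g(\mathbf{x}_2)\,\beta\,K_+(2, 3)\,\boldsymbol{A}(3)\,K_+(3, 1)\,\beta\,f(\mathbf{x}_1)\,d^3\mathbf{x}_1\,d^3\mathbf{x}_2\,d\tau_3. \tag{21}$$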


Expressions such as this can be simplified and the
3-surface integrals, which are inconvenient for relativistic
calculations, can be removed as follows. Instead
of defining a state by the wave function $f(\mathbf{x})$, which it
has at a given time $t_1 = 0$, we define the state by the
function $f(1)$ of four variables $\mathbf{x}_1$, $t_1$ which is a solution
of the free particle equation for all $t_1$ and is $f(\mathbf{x}_1)$ for
$t_1 = 0$. The final state is likewise defined by a function
$g(2)$ over all space-time. Then our surface integrals can
be performed since $\int K_+(3, 1)\,\beta f(\mathbf{x}_1)\,d^3\mathbf{x}_1 = f(3)$ and
$\int \bar g(\mathbf{x}_2)\,\beta\,d^3\mathbf{x}_2\,K_+(2, 3) = \bar g(3)$. There results

$$-i\int \bar g(3)\,\boldsymbol{A}(3)\,f(3)\,d\tau_3, \tag{22}$$

the integral now being over all space-time. The transition
amplitude to second order (from (14)) is

$$-\int\!\!\int \bar g(2)\,\boldsymbol{A}(2)\,K_+(2, 1)\,\boldsymbol{A}(1)\,f(1)\,d\tau_1\,d\tau_2, \tag{23}$$

for the particle arriving at 1 with amplitude $f(1)$ is
scattered (A(1)), progresses to 2, ($K_+(2, 1)$), and is
scattered again (A(2)), and we then ask for the amplitude
that it is in state $g(2)$. If $g(2)$ is a negative energy
state we are solving a problem of annihilation of electron
in $f(1)$, positron in $g(2)$, etc.

We have been emphasizing scattering problems, but
obviously the motion in a fixed potential $V$, say in a
hydrogen atom, can also be dealt with. If it is first
viewed as a scattering problem we can ask for the
amplitude, $\phi_k(1)$, that an electron with original free
wave function was scattered $k$ times in the potential $V$
either forward or backward in time to arrive at 1. Then
the amplitude after one more scattering is

$$\phi_{k+1}(2) = -i\int K_+(2, 1)\,\boldsymbol{V}(1)\,\phi_k(1)\,d\tau_1. \tag{24}$$

An equation for the total amplitude

$$\psi(1) = \sum_{k=0}^{\infty}\phi_k(1)$$

for arriving at 1 either directly or after any number of
scatterings is obtained by summing (24) over all $k$ from
0 to $\infty$;

$$\psi(2) = \phi_0(2) - i\int K_+(2, 1)\,\boldsymbol{V}(1)\,\psi(1)\,d\tau_1. \tag{25}$$


Viewed as a steady state problem we may wish, for
example, to find that initial condition $\phi_0$ (or better just
the $\psi$) which leads to a periodic motion of $\psi$. This is
most practically done, of course, by solving the Dirac
equation,

$$(i\boldsymbol{\nabla} - m)\,\psi(1) = \boldsymbol{V}(1)\,\psi(1), \tag{26}$$

deduced from (25) by operating on both sides by $i\boldsymbol{\nabla}_2 - m$,
thereby eliminating the $\phi_0$, and using (12). This illustrates
the relation between the points of view.

For many problems the total potential $A + V$ may be
split conveniently into a fixed one, $V$, and another, $A$,
considered as a perturbation. If $K_+^{(V)}$ is defined as in
(16) with $V$ for $A$, expressions such as (23) are valid
and useful with $K_+$ replaced by $K_+^{(V)}$ and the functions
$f(1)$, $g(2)$ replaced by solutions for all space and time
of the Dirac Eq. (26) in the potential $V$ (rather than
free particle wave functions).


4. PROBLEMS INVOLVING SEVERAL CHARGES


We wish next to consider the case that there are two
(or more) distinct charges (in addition to pairs they may
produce in virtual states). In a succeeding paper we
discuss the interaction between such charges. Here we
assume that they do not interact. In this case each
particle behaves independently of the other. We can
expect that if we have two particles $a$ and $b$, the amplitude
that particle $a$ goes from $\mathbf{x}_1$ at $t_1$, to $\mathbf{x}_3$ at $t_3$ while
$b$ goes from $\mathbf{x}_2$ at $t_2$ to $\mathbf{x}_4$ at $t_4$ is the product

$$K(3, 4; 1, 2) = K_{+a}(3, 1)\,K_{+b}(4, 2).$$

The symbols $a$, $b$ simply indicate that the matrices
appearing in the $K_+$ apply to the Dirac four component
spinors corresponding to particle $a$ or $b$ respectively (the
wave function now having 16 indices). In a potential
$K_{+a}$ and $K_{+b}$ become $K_{+a}^{(A)}$ and $K_{+b}^{(A)}$ where $K_+^{(A)}$
is defined and calculated as for a single particle. They
commute. Hereafter the $a$, $b$ can be omitted; the space
time variables appearing in the kernels suffice to define
on what they operate.

The particles are identical however and satisfy the
exclusion principle. The principle requires only that one
calculate $K(3, 4; 1, 2) - K(4, 3; 1, 2)$ to get the net
amplitude for arrival of charges at 3, 4. (It is normalized
assuming that when an integral is performed over points
3 and 4, for example, since the electrons represented are
identical, one divides by 2.) This expression is correct
for positrons also (Fig. 4). For example the amplitude
that an electron and a positron found initially at $\mathbf{x}_1$ and
$\mathbf{x}_4$ (say $t_1 = t_4$) are later found at $\mathbf{x}_3$ and $\mathbf{x}_2$ (with
$t_2 = t_3 > t_1$) is given by the same expression

$$K_+^{(A)}(3, 1)\,K_+^{(A)}(4, 2) - K_+^{(A)}(4, 1)\,K_+^{(A)}(3, 2). \tag{27}$$

The first term represents the amplitude that the electron
proceeds from 1 to 3 and the positron from 4 to 2 (Fig.
4(c)), while the second term represents the interfering
amplitude that the pair at 1, 4 annihilate and what is
found at 3, 2 is a pair newly created in the potential.
The generalization to several particles is clear. There is
an additional factor $K_+^{(A)}$ for each particle, and antisymmetric
combinations are always taken.
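
The antisymmetric combination for several particles can be written as a signed sum over permutations of the final points, which is a determinant. The sketch below is a hypothetical illustration: `k[f][i]` stands for a single-particle amplitude from initial point `i` to final point `f`, and the numbers used are arbitrary, not computed from any kernel.

```python
from itertools import permutations

def parity(perm):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def net_amplitude(k):
    """Antisymmetrized amplitude built from single-particle amplitudes k[f][i]."""
    n = len(k)
    total = 0j
    for perm in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= k[perm[i]][i]   # particle starting at i ends at perm[i]
        total += parity(perm) * term
    return total

# Two particles: rows are final points (3, 4), columns are initial points (1, 2).
k = [[0.2 + 0.1j, 0.5],
     [0.3j,       0.4]]
print(net_amplitude(k))   # K(3,1)K(4,2) - K(4,1)K(3,2)
```

Exchanging the two final points flips the sign of the result, and the amplitude vanishes if the two rows are identical, which is the exclusion principle in this toy setting.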

No account need be taken of the exclusion principle
in intermediate states. As an example consider again
expression (14) for $t_2 > t_1$ and suppose $t_4 < t_3$ so that the
situation represented (Fig. 2(c)) is that a pair is made
at 4 with the electron proceeding to 2, and the positron
to 3 where it annihilates the electron arriving from 1.
It may be objected that if it happens that the electron
created at 4 is in the same state as the one coming from
1, then the process cannot occur because of the exclusion
principle and we should not have included it in our
term (14). We shall see, however, that considering the
exclusion principle also requires another change which
reinstates the quantity.

FIG. 4. Some problems involving two distinct charges (in addition
to virtual pairs they may produce):
$P_v|K_+^{(A)}(3, 1)K_+^{(A)}(4, 2) - K_+^{(A)}(4, 1)K_+^{(A)}(3, 2)|^2$
is the probability that: (a) Electrons
at 1 and 2 are scattered to 3, 4 (and no pairs are formed). (b)
Starting with an electron at 1 a single pair is formed, positron at 2,
electrons at 3, 4. (c) A pair at 1, 4 is found at 3, 2, etc. The exclusion
principle requires that the amplitudes for processes involving
exchange of two electrons be subtracted.

For we are computing amplitudes relative to the
amplitude that a vacuum at $t_1$ will still be a vacuum at
$t_2$. We are interested in the alteration in this amplitude
due to the presence of an electron at 1. Now one process
that can be visualized as occurring in the vacuum is the
creation of a pair at 4 followed by a re-annihilation of
the same pair at 3 (a process which we shall call a closed
loop path). But if a real electron is present in a certain
state 1, those pairs for which the electron was created
in state 1 in the vacuum must now be excluded. We
must therefore subtract from our relative amplitude the
term corresponding to this process. But this just reinstates
the quantity which it was argued should not
have been included in (14), the necessary minus sign
coming automatically from the definition of $K_+$. It is
obviously simpler to disregard the exclusion principle
completely in the intermediate states.

All the amplitudes are relative and their squares give
the relative probabilities of the various phenomena.
Absolute probabilities result if one multiplies each of
the probabilities by $P_v$, the true probability that if one
has no particles present initially there will be none
finally. This quantity $P_v$ can be calculated by normalizing
the relative probabilities such that the sum of the
probabilities of all mutually exclusive alternatives is
unity. (For example if one starts with a vacuum one can
calculate the relative probability that there remains a
vacuum (unity), or one pair is created, or two pairs, etc.
The sum is $P_v^{-1}$.) Put in this form the theory is complete
and there are no divergence problems. Real processes
are completely independent of what goes on in
the vacuum.

When we come, in the succeeding paper, to deal with
interactions between charges, however, the situation is
not so simple. There is the possibility that virtual electrons
in the vacuum may interact electromagnetically
with the real electrons. For that reason processes occurring
in the vacuum are analyzed in the next section, in
which an independent method of obtaining $P_v$ is
discussed.


5. VACUUM PROBLEMS 


An alternative way of obtaining absolute amplitudes
is to multiply all amplitudes by $C_v$, the vacuum to
vacuum amplitude, that is, the absolute amplitude that
there be no particles both initially and finally. We can
assume $C_v = 1$ if no potential is present during the
interval, and otherwise we compute it as follows. It
differs from unity because, for example, a pair could be
created which eventually annihilates itself again. Such
a path would appear as a closed loop on a space-time
diagram. The sum of the amplitudes resulting from all
such single closed loops we call $L$. To a first approximation
$L$ is

$$L^{(1)} = -\frac{1}{2}\int\!\!\int \mathrm{Sp}[K_+(2, 1)\,\boldsymbol{A}(1)\,K_+(1, 2)\,\boldsymbol{A}(2)]\,d\tau_1\,d\tau_2. \tag{28}$$


For a pair could be created say at 1, the electron and
positron could both go on to 2 and there annihilate.
The spur, Sp, is taken since one has to sum over all
possible spins for the pair. The factor 1/2 arises from the
fact that the same loop could be considered as starting
at either potential, and the minus sign results since the
interactors are each $-i\boldsymbol{A}$. The next order term would be⁹

$$L^{(2)} = +(i/3)\int\!\!\int\!\!\int \mathrm{Sp}[K_+(2, 1)\,\boldsymbol{A}(1)\,K_+(1, 3)\,\boldsymbol{A}(3)\,K_+(3, 2)\,\boldsymbol{A}(2)]\,d\tau_1\,d\tau_2\,d\tau_3,$$

etc. The sum of all such terms gives $L$.¹⁰


⁹ This term actually vanishes as can be seen as follows. In any
spur the sign of all $\gamma$ matrices may be reversed. Reversing the
sign of $\gamma$ in $K_+(2, 1)$ changes it to the transpose of $K_+(1, 2)$ so
that the order of all factors and variables is reversed. Since the
integral is taken over all $\tau_1$, $\tau_2$, and $\tau_3$ this has no effect and we are
left with $(-1)^3$ from changing the sign of $\boldsymbol{A}$. Thus the spur equals
its negative. Loops with an odd number of potential interactors
give zero. Physically this is because for each loop the electron can
go around one way or in the opposite direction and we must add
these amplitudes. But reversing the motion of an electron makes
it behave like a positive charge thus changing the sign of each
potential interaction, so that the sum is zero if the number of
interactions is odd. This theorem is due to W. H. Furry, Phys.
Rev. 51, 125 (1937).

¹⁰ A closed expression for $L$ in terms of $K_+^{(A)}$ is hard to obtain
because of the factor $(1/n)$ in the $n$th term. However, the perturbation
in $L$, $\Delta L$ due to a small change in potential $\Delta A$, is easy
to express. The $(1/n)$ is canceled by the fact that $\Delta A$ can appear
In addition to these single loops we have the possibility
that two independent pairs may be created and
each pair may annihilate itself again. That is, there may
be formed in the vacuum two closed loops, and the
contribution in amplitude from this alternative is just
the product of the contribution from each of the loops
considered singly. The total contribution from all such
pairs of loops (it is still consistent to disregard the
exclusion principle for these virtual states) is $L^2/2$ for
in $L^2$ we count every pair of loops twice. The total
vacuum-vacuum amplitude is then

$$C_v = 1 - L + L^2/2 - L^3/6 + \cdots = \exp(-L), \tag{30}$$


the successive terms representing the amplitude from
zero, one, two, etc., loops. The fact that the contribution
to $C_v$ of single loops is $-L$ is a consequence of the
Pauli principle. For example, consider a situation in
which two pairs of particles are created. Then these
pairs later destroy themselves so that we have two
loops. The electrons could, at a given time, be interchanged
forming a kind of figure eight which is a single
loop. The fact that the interchange must change the
sign of the contribution requires that the terms in $C_v$
appear with alternate signs. (The exclusion principle is
also responsible in a similar way for the fact that the
amplitude for a pair creation is $-K_+$ rather than $+K_+$.)
Symmetrical statistics would lead to

$$C_v = 1 + L + L^2/2 + \cdots = \exp(+L).$$


The quantity $L$ has an infinite imaginary part (from
$L^{(1)}$, higher orders are finite). We will discuss this in
connection with vacuum polarization in the succeeding
paper. This has no effect on the normalization constant
for the probability that a vacuum remain vacuum is
given by

$$P_v = |C_v|^2 = \exp(-2\cdot\text{real part of } L),$$

from (30). This value agrees with the one calculated
directly by renormalizing probabilities. The real part
of $L$ appears to be positive as a consequence of the Dirac
equation and properties of $K_+$ so that $P_v$ is less than
one. Bose statistics gives $C_v = \exp(+L)$ and consequently
a value of $P_v$ greater than unity which appears
meaningless if the quantities are interpreted as we have
done here. Our choice of $K_+$ apparently requires the
exclusion principle.
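
The loop counting behind (30) and the statement that the imaginary part of $L$ drops out of $P_v$ can be checked numerically. The sketch below assumes a finite complex number standing in for $L$ (the text notes the true imaginary part is infinite, so the value of `L` here is purely illustrative) and sums the contributions of zero, one, two, ... loops.

```python
import cmath
import math

def vacuum_amplitude(L, max_loops=60):
    """Sum the n-loop contributions (-L)^n / n! for n = 0 .. max_loops - 1."""
    total, term = 0j, 1 + 0j
    for n in range(max_loops):
        total += term
        term *= -L / (n + 1)
    return total

L = 0.3 + 2.0j                      # arbitrary illustrative value
C_v = vacuum_amplitude(L)
P_v = abs(C_v) ** 2

print(abs(C_v - cmath.exp(-L)))            # series agrees with exp(-L)
print(P_v, math.exp(-2 * L.real))          # P_v depends only on the real part of L
```

With a positive real part the result is a probability below one; replacing `-L` by `+L` in the series, as symmetrical statistics would require, gives a value above one for the same `L`.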

Charges obeying the Klein-Gordon equation can be 
equally well treated by the methods which are dis- 
cussed here for the Dirac electrons. How this is done is 
discussed in more detail in the succeeding paper. The 
real part of L comes out negative for this equation so 
that in this case Bose statistics appear to be required 
for consistency.® 


in any of the $n$ potentials. The result after summing over $n$ by
(13), (14) and using (16) is

$$\Delta L = -i\int \mathrm{Sp}[(K_+^{(A)}(1, 1) - K_+(1, 1))\,\Delta\boldsymbol{A}(1)]\,d\tau_1. \tag{29}$$

The term $K_+(1, 1)$ actually integrates to zero.


6. ENERGY-MOMENTUM REPRESENTATION


The practical evaluation of the matrix elements in
some problems is often simplified by working with
momentum and energy variables rather than space and
time. This is because the function $K_+(2, 1)$ is fairly
complicated but we shall find that its Fourier transform
is very simple, namely $(i/4\pi^2)(\boldsymbol{p} - m)^{-1}$, that is

$$K_+(2, 1) = (i/4\pi^2)\int (\boldsymbol{p} - m)^{-1}\exp(-ip\cdot x_{21})\,d^4p, \tag{31}$$

where $p\cdot x_{21} = p\cdot x_2 - p\cdot x_1 = p_\mu x_{2\mu} - p_\mu x_{1\mu}$, $\boldsymbol{p} = p_\mu\gamma_\mu$, and
$d^4p$ means $(2\pi)^{-2}dp_1dp_2dp_3dp_4$, the integral over all $p$.
That this is true can be seen immediately from (12),
for the representation of the operator $i\boldsymbol{\nabla} - m$ in energy
($p_4$) and momentum ($p_{1,2,3}$) space is $\boldsymbol{p} - m$ and the transform
of $\delta(2, 1)$ is a constant. The reciprocal matrix
$(\boldsymbol{p} - m)^{-1}$ can be interpreted as $(\boldsymbol{p} + m)(p^2 - m^2)^{-1}$ for
$p^2 - m^2 = (\boldsymbol{p} - m)(\boldsymbol{p} + m)$ is a pure number not involving
$\gamma$ matrices. Hence if one wishes one can write

$$K_+(2, 1) = i(i\boldsymbol{\nabla}_2 + m)\,I_+(2, 1),$$

where

$$I_+(2, 1) = (2\pi)^{-2}\int (p^2 - m^2)^{-1}\exp(-ip\cdot x_{21})\,d^4p, \tag{32}$$

is not a matrix operator but a function satisfying

$$\Box_2^2 I_+(2, 1) - m^2 I_+(2, 1) = \delta(2, 1), \tag{33}$$

where $-\Box_2^2 = (\boldsymbol{\nabla}_2)^2 = (\partial/\partial x_{2\mu})(\partial/\partial x_{2\mu})$.
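
The statement that the reciprocal of $\boldsymbol{p} - m$ is $(\boldsymbol{p} + m)(p^2 - m^2)^{-1}$ can be verified with explicit matrices. The sketch below assumes the standard Dirac representation of the $\gamma$ matrices and the summation convention $a_\mu b_\mu = a_4b_4 - a_1b_1 - a_2b_2 - a_3b_3$; the numerical momentum and mass are arbitrary test values, and the helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scaled(m, c):
    return [[c * x for x in row] for row in m]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

GAMMA4 = block(I2, Z2, Z2, scaled(I2, -1))                 # gamma_4 = beta
GAMMA = [block(Z2, s, scaled(s, -1), Z2) for s in SIGMA]   # gamma_1, gamma_2, gamma_3

def slash(p4, p):
    """p-slash = p4*gamma_4 - p1*gamma_1 - p2*gamma_2 - p3*gamma_3."""
    out = scaled(GAMMA4, p4)
    for pk, g in zip(p, GAMMA):
        out = [[out[i][j] - pk * g[i][j] for j in range(4)] for i in range(4)]
    return out

def plus_m(mat, m):
    """Return mat + m * identity."""
    return [[mat[i][j] + (m if i == j else 0) for j in range(4)] for i in range(4)]

p4, p, m = 2.0, (0.3, -0.4, 0.5), 1.0          # arbitrary off-shell test values
p_sq = p4 ** 2 - sum(x * x for x in p)         # p^2 = p4^2 - |p|^2
ps = slash(p4, p)
inverse_candidate = scaled(plus_m(ps, m), 1.0 / (p_sq - m * m))
product = matmul(plus_m(ps, -m), inverse_candidate)   # should be the 4x4 identity
```

The check works for any momentum with $p^2 \neq m^2$; on the mass shell the inverse does not exist, which is where the pole prescription comes in.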

The integrals (31) and (32) are not yet completely
defined for there are poles in the integrand when
$p^2 - m^2 = 0$. We can define how these poles are to be
evaluated by the rule that $m$ is considered to have an
infinitesimal negative imaginary part. That is, $m$ is replaced
by $m - i\delta$ and the limit taken as $\delta \to 0$ from above.
This can be seen by imagining that we calculate $K_+$ by
integrating on $p_4$ first. If we call $E = +(m^2 + p_1^2 + p_2^2 + p_3^2)^{1/2}$
then the integrals involve $p_4$ essentially as
$\int \exp(-ip_4(t_2 - t_1))\,dp_4\,(p_4^2 - E^2)^{-1}$ which has poles at
$p_4 = +E$ and $p_4 = -E$. The replacement of $m$ by $m - i\delta$
means that $E$ has a small negative imaginary part; the
first pole is below, the second above the real axis. Now
if $t_2 - t_1 > 0$ the contour can be completed around the
semicircle below the real axis thus giving a residue from
the $p_4 = +E$ pole, or $-(2E)^{-1}\exp(-iE(t_2 - t_1))$. If
$t_2 - t_1 < 0$ the upper semicircle must be used, and
$p_4 = -E$ at the pole, so that the function varies in each
case as required by the other definition (17).
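
The pole prescription can be tested numerically by keeping the imaginary part of $E$ small but finite. The sketch below is an illustration under stated assumptions: $E$ is replaced by $E - i\delta$ with the arbitrary values `E = 1.0`, `delta = 0.2`, the $p_4$ integral is approximated by a plain Riemann sum over a finite range, and the result is compared with $2\pi i$ times the residue expression quoted above (the text omits the factor $2\pi i$).

```python
import cmath

def p4_integral(t, E=1.0, delta=0.2, p_max=200.0, h=0.01):
    """Riemann sum of exp(-i p t) / (p^2 - (E - i delta)^2) over real p in [-p_max, p_max]."""
    Ec = complex(E, -delta)          # E with a small negative imaginary part
    n = int(round(2 * p_max / h))
    total = 0j
    for k in range(n + 1):
        p = -p_max + k * h
        total += cmath.exp(-1j * p * t) / (p * p - Ec * Ec)
    return total * h

def residue_result(t, E=1.0, delta=0.2):
    """2*pi*i times -(2E)^(-1) exp(-i E |t|), with E -> E - i delta."""
    Ec = complex(E, -delta)
    return -2j * cmath.pi * cmath.exp(-1j * Ec * abs(t)) / (2 * Ec)

for t in (1.5, -1.5):
    print(t, p4_integral(t), residue_result(t))
```

For both signs of the time difference the sum reproduces a phase exp(-iE|t|), so the same integrand gives positive-energy propagation forward in time and the negative-energy behavior of (17) backward in time.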

Other solutions of (12) result from other prescriptions.
For example if $p_4$ in the factor $(p^2 - m^2)^{-1}$ is considered
to have a positive imaginary part $K_+$ becomes
replaced by $K_0$, the Dirac one-electron kernel, zero for
$t_2 < t_1$. Explicitly the function is¹¹ ($\mathbf{x}, t = x_{21\mu}$)

$$I_+(\mathbf{x}, t) = -(4\pi)^{-1}\delta(s^2) + (m/8\pi s)\,H_1^{(2)}(ms), \tag{34}$$

where $s = +(t^2 - \mathbf{x}^2)^{1/2}$ for $t^2 > \mathbf{x}^2$ and $s = -i(\mathbf{x}^2 - t^2)^{1/2}$ for
$t^2 < \mathbf{x}^2$, $H_1^{(2)}$ is the Hankel function and $\delta(s^2)$ is the
Dirac delta function of $s^2$. It behaves asymptotically
as $\exp(-ims)$, decaying exponentially in space-like
directions.¹²

¹¹ $I_+(\mathbf{x}, t)$ is $(2i)^{-1}(D_1(\mathbf{x}, t) - iD(\mathbf{x}, t))$ where $D_1$ and $D$ are the
functions defined by W. Pauli, Rev. Mod. Phys. 13, 203 (1941).

By means of such transforms the matrix elements
like (22), (23) are easily worked out. A free particle
wave function for an electron of momentum $p_1$ is
$u_1\exp(-ip_1\cdot x)$ where $u_1$ is a constant spinor satisfying
the Dirac equation $\boldsymbol{p}_1u_1 = mu_1$ so that $p_1^2 = m^2$. The
matrix element (22) for going from a state $p_1$, $u_1$ to a
state of momentum $p_2$, spinor $u_2$, is $-4\pi^2i(\bar u_2\boldsymbol{a}(q)u_1)$
where we have imagined $\boldsymbol{A}$ expanded in a Fourier
integral

$$\boldsymbol{A}(1) = \int \boldsymbol{a}(q)\exp(-iq\cdot x_1)\,d^4q,$$

and we select the component of momentum $q = p_2 - p_1$.
The second order term (23) is the matrix element
between $u_1$ and $u_2$ of

$$-4\pi^2i\int (\boldsymbol{a}(p_2 - p_1 - q)(\boldsymbol{p}_1 + \boldsymbol{q} - m)^{-1}\boldsymbol{a}(q))\,d^4q, \tag{35}$$

since the electron of momentum $p_1$ may pick up $q$ from
the potential $\boldsymbol{a}(q)$, propagate with momentum $p_1 + q$
(factor $(\boldsymbol{p}_1 + \boldsymbol{q} - m)^{-1}$) until it is scattered again by the
potential, $\boldsymbol{a}(p_2 - p_1 - q)$, picking up the remaining momentum,
$p_2 - p_1 - q$, to bring the total to $p_2$. Since all
values of $q$ are possible, one integrates over $q$.

These same matrices apply directly to positron problems,
for if the time component of, say, $p_1$ is negative
the state represents a positron of four-momentum $-p_1$,
and we are describing pair production if $p_2$ is an electron,
i.e., has positive time component, etc.

The probability of an event whose matrix element is
$(\bar u_2Mu_1)$ is proportional to the absolute square. This
may also be written $(\bar u_1\bar Mu_2)(\bar u_2Mu_1)$, where $\bar M$ is $M$
with the operators written in opposite order and explicit
appearance of $i$ changed to $-i$ ($\bar M$ is $\beta$ times the complex
conjugate transpose of $\beta M$). For many problems we are
not concerned about the spin of the final state. Then we
can sum the probability over the two $u_2$ corresponding
to the two spin directions. This is not a complete set because
$\boldsymbol{p}_2$ has another eigenvalue, $-m$. To permit summing
over all states we can insert the projection operator
$(2m)^{-1}(\boldsymbol{p}_2 + m)$ and so obtain $(2m)^{-1}(\bar u_1\bar M(\boldsymbol{p}_2 + m)Mu_1)$
for the probability of transition from $p_1$, $u_1$, to $p_2$ with
arbitrary spin. If the incident state is unpolarized we
can sum on its spins too, and obtain

$$(2m)^{-2}\,\mathrm{Sp}[(\boldsymbol{p}_1 + m)\,\bar M\,(\boldsymbol{p}_2 + m)\,M] \tag{36}$$

for (twice) the probability that an electron of arbitrary
spin with momentum $p_1$ will make transition to $p_2$. The
expressions are all valid for positrons when $p$'s with
negative energies are inserted, and the situation interpreted
in accordance with the timing relations discussed
above. (We have used functions normalized to $(\bar uu) = 1$
instead of the conventional $(\bar u\beta u) = (u^*u) = 1$. On our
scale $(\bar u\beta u) = \text{energy}/m$ so the probabilities must be
corrected by the appropriate factors.)

¹² If the $-i\delta$ is kept with $m$ here too the function $I_+$ approaches
zero for infinite positive and negative times. This may be useful
in general analyses in avoiding complications from infinitely
remote surfaces.
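
That $(2m)^{-1}(\boldsymbol{p} + m)$ acts as a projection operator onto the two spin states with eigenvalue $+m$ can be checked directly. The sketch below assumes the standard Dirac representation of the $\gamma$ matrices and an arbitrary on-shell test momentum; it verifies that the operator squares to itself and has spur 2. The helper names are illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scaled(m, c):
    return [[c * x for x in row] for row in m]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA4 = block(I2, Z2, Z2, scaled(I2, -1))
GAMMA = [block(Z2, s, scaled(s, -1), Z2) for s in SIGMA]

def slash(p4, p):
    """p-slash = p4*gamma_4 - p1*gamma_1 - p2*gamma_2 - p3*gamma_3."""
    out = scaled(GAMMA4, p4)
    for pk, g in zip(p, GAMMA):
        out = [[out[i][j] - pk * g[i][j] for j in range(4)] for i in range(4)]
    return out

m = 1.0
p = (0.3, -0.4, 0.5)                            # arbitrary test momentum
p4 = math.sqrt(m * m + sum(x * x for x in p))   # positive-energy mass shell, p^2 = m^2

ps = slash(p4, p)
proj = scaled([[ps[i][j] + (m if i == j else 0) for j in range(4)]
               for i in range(4)], 1.0 / (2 * m))
proj_sq = matmul(proj, proj)
spur = sum(proj[i][i] for i in range(4))
print(spur)   # two spin states survive the projection
```

Because the operator is idempotent with spur 2, inserting it lets a sum over the two physical spin states be extended to a sum over all four components, which is what turns the spin sums into the spur in (36).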

The author has many people to thank for fruitful 
conversations about this subject, particularly H. A. 
Bethe and F. J. Dyson. 


APPENDIX 
a. Deduction from Second Quantization 


In this section we shall show the equivalence of this theory with
the hole theory of the positron.² According to the theory of second
quantization of the electron field in a given potential,¹³ the state
of this field at any time is represented by a wave function $\chi$
satisfying

$$i\,\partial\chi/\partial t = H\chi,$$

where $H = \int\Psi^*(\mathbf{x})(\boldsymbol{\alpha}\cdot(-i\nabla - \mathbf{A}) + A_4 + m\beta)\Psi(\mathbf{x})\,d^3\mathbf{x}$ and $\Psi(\mathbf{x})$ is
an operator annihilating an electron at position $\mathbf{x}$, while $\Psi^*(\mathbf{x})$ is
the corresponding creation operator. We contemplate a situation
in which at $t = 0$ we have present some electrons in states represented
by ordinary spinor functions $f_1(\mathbf{x})$, $f_2(\mathbf{x})$, $\cdots$ assumed
orthogonal, and some positrons. These are described as holes in
the negative energy sea, the electrons which would normally fill the
holes having wave functions $p_1(\mathbf{x})$, $p_2(\mathbf{x})$, $\cdots$. We ask, at time $T$
what is the amplitude that we find electrons in states $g_1(\mathbf{x})$,
$g_2(\mathbf{x})$, $\cdots$ and holes at $q_1(\mathbf{x})$, $q_2(\mathbf{x})$, $\cdots$. If the initial and final state
vectors representing this situation are $\chi_i$ and $\chi_f$ respectively, we
wish to calculate the matrix element

$$R = \left(\chi_f^*\exp\left(-i\int_0^T H\,dt\right)\chi_i\right) = (\chi_f^*S\chi_i). \tag{37}$$


We assume that the potential $A$ differs from zero only for times
between 0 and $T$ so that a vacuum can be defined at these times.
If $\chi_0$ represents the vacuum state (that is, all negative energy
states filled, all positive energies empty), the amplitude for having
a vacuum at time $T$, if we had one at $t = 0$, is

$$C_v = (\chi_0^*S\chi_0), \tag{38}$$


writing $S$ for $\exp(-i\int_0^T H\,dt)$. Our problem is to evaluate $R$ and
show that it is a simple factor times $C_v$, and that the factor involves
the $K_+^{(A)}$ functions in the way discussed in the previous sections.
To do this we first express $\chi_i$ in terms of $\chi_0$. The operator

$$\Phi^* = \int\Psi^*(\mathbf{x})\,\phi(\mathbf{x})\,d^3\mathbf{x}, \tag{39}$$

creates an electron with wave function $\phi(\mathbf{x})$. Likewise $\Phi = \int\phi^*(\mathbf{x})\Psi(\mathbf{x})\,d^3\mathbf{x}$
annihilates one with wave function $\phi(\mathbf{x})$. Hence state
$\chi_i$ is $\chi_i = F_1^*F_2^*\cdots P_1P_2\cdots\chi_0$ while the final state is
$G_1^*G_2^*\cdots Q_1Q_2\cdots\chi_0$ where $F_i$, $G_i$, $P_i$, $Q_i$ are operators defined like $\Phi$, in
(39), but with $f_i$, $g_i$, $p_i$, $q_i$ replacing $\phi$; for the initial state would
result from the vacuum if we created the electrons in $f_1$, $f_2$, $\cdots$
and annihilated those in $p_1$, $p_2$, $\cdots$. Hence we must find

$$R = (\chi_0^*\cdots Q_2^*Q_1^*\cdots G_2G_1\,S\,F_1^*F_2^*\cdots P_1P_2\cdots\chi_0). \tag{40}$$

To simplify this we shall have to use commutation relations between
a $\Phi^*$ operator and $S$. To this end consider $\exp(-i\int_0^t H\,dt')\,\Phi^*\exp(+i\int_0^t H\,dt')$
and expand this quantity in terms of $\Psi^*(\mathbf{x})$,
giving $\int\Psi^*(\mathbf{x})\phi(\mathbf{x}, t)\,d^3\mathbf{x}$, (which defines $\phi(\mathbf{x}, t)$). Now multiply
this equation by $\exp(+i\int_0^t H\,dt')\cdots\exp(-i\int_0^t H\,dt')$ and find

$$\int\Psi^*(\mathbf{x})\,\phi(\mathbf{x})\,d^3\mathbf{x} = \int\Psi^*(\mathbf{x}, t)\,\phi(\mathbf{x}, t)\,d^3\mathbf{x}, \tag{41}$$

where we have defined $\Psi(\mathbf{x}, t)$ by $\Psi(\mathbf{x}, t) = \exp(+i\int_0^t H\,dt')\,\Psi(\mathbf{x})\exp(-i\int_0^t H\,dt')$.
As is well known $\Psi(\mathbf{x}, t)$ satisfies the Dirac
equation, (differentiate $\Psi(\mathbf{x}, t)$ with respect to $t$ and use commutation
relations of $H$ and $\Psi$)

$$i\,\partial\Psi(\mathbf{x}, t)/\partial t = (\boldsymbol{\alpha}\cdot(-i\nabla - \mathbf{A}) + A_4 + m\beta)\,\Psi(\mathbf{x}, t). \tag{42}$$

¹³ See, for example, G. Wentzel, Einführung in die Quantentheorie
der Wellenfelder (Franz Deuticke, Leipzig, 1943), Chapter V.

Consequently $\phi(\mathbf{x}, t)$ must also satisfy the Dirac equation (differentiate
(41) with respect to $t$, use (42) and integrate by parts).
That is, if $\phi(\mathbf{x}, T)$ is that solution of the Dirac equation at time
$T$ which is $\phi(\mathbf{x})$ at $t = 0$, and if we define $\Phi^* = \int\Psi^*(\mathbf{x})\phi(\mathbf{x})\,d^3\mathbf{x}$ and
$\Phi'^* = \int\Psi^*(\mathbf{x})\phi(\mathbf{x}, T)\,d^3\mathbf{x}$ then $\Phi'^* = S\Phi^*S^{-1}$, or

$$S\Phi^* = \Phi'^*S. \tag{43}$$


The principle on which the proof will be based can now be 
illustrated by a simple example. Suppose we have just one electron 
initially and finally and ask for 


$$r = (\chi_0^{*}\, G S F^{*} \chi_0). \tag{44}$$

We might try putting $F^{*}$ through the operator S using (43),
$SF^{*} = F'^{*}S$, where $f'$ in $F'^{*} = \int \Psi^{*}(\mathbf{x})\,f'(\mathbf{x})\,d^3x$
is the wave function at T arising from $f(\mathbf{x})$ at 0. Then

$$r = (\chi_0^{*}\, G F'^{*} S \chi_0) = \int g^{*}(\mathbf{x})\,f'(\mathbf{x})\,d^3x \cdot C_v - (\chi_0^{*}\, F'^{*} G S \chi_0), \tag{45}$$

where the second expression has been obtained by use of the
definition (38) of $C_v$ and the general commutation relation

$$GF^{*} + F^{*}G = \int g^{*}(\mathbf{x})\,f(\mathbf{x})\,d^3x,$$

which is a consequence of the properties of $\Psi(\mathbf{x})$ (the others are
$FG = -GF$ and $F^{*}G^{*} = -G^{*}F^{*}$). Now $\chi_0^{*}F'^{*}$ in the last term in
(45) is the complex conjugate of $F'\chi_0$. Thus if $f'$ contained only
positive energy components, $F'\chi_0$ would vanish and we would have
reduced r to a factor times $C_v$. But $F'$, as worked out here, does
contain negative energy components created in the potential A
and the method must be slightly modified.
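
The step from (44) to (45) uses nothing beyond the anticommutation relation just quoted, applied with $f'$ in place of $f$. As a short worked sketch of that step (no new assumptions, only the relations already stated in the text):

```latex
\begin{aligned}
G F'^{*} &= \int g^{*}(\mathbf{x})\, f'(\mathbf{x})\, d^{3}x \;-\; F'^{*} G, \\
(\chi_0^{*}\, G F'^{*} S\, \chi_0)
  &= \int g^{*}(\mathbf{x})\, f'(\mathbf{x})\, d^{3}x \,(\chi_0^{*} S \chi_0)
     \;-\; (\chi_0^{*}\, F'^{*} G S\, \chi_0) \\
  &= \int g^{*}(\mathbf{x})\, f'(\mathbf{x})\, d^{3}x \cdot C_v
     \;-\; (\chi_0^{*}\, F'^{*} G S\, \chi_0).
\end{aligned}
```

The integral is an ordinary number, so it can be taken outside the matrix element, leaving the vacuum amplitude $C_v$ of (38) as its factor.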

Before putting $F^{*}$ through the operator we shall add to it
another operator $F''^{*}$ arising from a function $f''(\mathbf{x})$ containing only
negative energy components and so chosen that the resulting $f'$
has only positive ones. That is we want

$$S(F_{\mathrm{pos}}^{*} + F_{\mathrm{neg}}''^{*}) = F_{\mathrm{pos}}'^{*}\, S, \tag{46}$$

where the “pos” and “neg” serve as reminders of the sign of the
energy components contained in the operators. This we can now
use in the form

$$S F_{\mathrm{pos}}^{*} = F_{\mathrm{pos}}'^{*}\, S - S F_{\mathrm{neg}}''^{*}. \tag{47}$$

In our one electron problem this substitution replaces r by two
terms

$$r = (\chi_0^{*}\, G F_{\mathrm{pos}}'^{*} S \chi_0) - (\chi_0^{*}\, G S F_{\mathrm{neg}}''^{*} \chi_0).$$

The first of these reduces to

$$r = \int g^{*}(\mathbf{x})\, f_{\mathrm{pos}}'(\mathbf{x})\, d^3x \cdot C_v,$$

as above, for $F_{\mathrm{pos}}'\chi_0$ is now zero, while the second is zero since the
creation operator $F_{\mathrm{neg}}''^{*}$ gives zero when acting on the vacuum
state as all negative energies are full. This is the central idea of
the demonstration.

The problem presented by (46) is this: Given a function $f_{\mathrm{pos}}(\mathbf{x})$
at time 0, to find the amount, $f_{\mathrm{neg}}''$, of negative energy component
which must be added in order that the solution of Dirac's equation
at time T will have only positive energy components, $f_{\mathrm{pos}}'$.
This is a boundary value problem for which the kernel $K_+^{(A)}$ is
designed. We know the positive energy components initially, $f_{\mathrm{pos}}$,
and the negative ones finally (zero). The positive ones finally are
therefore (using (19))

$$f_{\mathrm{pos}}'(\mathbf{x}_2) = \int K_+^{(A)}(2,1)\,\beta f_{\mathrm{pos}}(\mathbf{x}_1)\,d^3x_1, \tag{48}$$

where $t_2 = T$, $t_1 = 0$. Similarly, the negative ones initially are

$$f_{\mathrm{neg}}''(\mathbf{x}_2) = \int K_+^{(A)}(2,1)\,\beta f_{\mathrm{pos}}(\mathbf{x}_1)\,d^3x_1 - f_{\mathrm{pos}}(\mathbf{x}_2), \tag{49}$$

where $t_2$ approaches zero from above, and $t_1 = 0$. The $f_{\mathrm{pos}}(\mathbf{x}_2)$ is
subtracted to keep in $f_{\mathrm{neg}}''(\mathbf{x}_2)$ only those waves which return
from the potential and not those arriving directly at $t_2$ from the
$K_+(2,1)$ part of $K_+^{(A)}(2,1)$, as $t_2 \to 0$. We could also have written

$$f_{\mathrm{neg}}''(\mathbf{x}_2) = \int [K_+^{(A)}(2,1) - K_+(2,1)]\,\beta f_{\mathrm{pos}}(\mathbf{x}_1)\,d^3x_1. \tag{50}$$

Therefore the one-electron problem,
$r = \int g^{*}(\mathbf{x})\,f_{\mathrm{pos}}'(\mathbf{x})\,d^3x \cdot C_v$, gives by (48)

$$r = C_v \int\!\!\int g^{*}(\mathbf{x}_2)\,K_+^{(A)}(2,1)\,\beta f(\mathbf{x}_1)\,d^3x_1\,d^3x_2,$$

as expected in accordance with the reasoning of the previous
sections (i.e., (20) with $K_+^{(A)}$ replacing $K_+$).

The proof is readily extended to the more general expression R,
(40), which can be analyzed by induction. First one replaces $F_1^{*}$
by a relation such as (47) obtaining two terms

$$R = (\chi_0^{*} \cdots Q_2^{*}Q_1^{*} \cdots G_2G_1\, F_{1\mathrm{pos}}'^{*}\, S F_2^{*} \cdots P_1P_2 \cdots \chi_0)
 - (\chi_0^{*} \cdots Q_2^{*}Q_1^{*} \cdots G_2G_1\, S F_{1\mathrm{neg}}''^{*} F_2^{*} \cdots P_1P_2 \cdots \chi_0).$$

In the first term the order of $F_{1\mathrm{pos}}'^{*}$ and $G_1$ is then interchanged,
producing an additional term $\int g_1^{*}(\mathbf{x})\,f_{1\mathrm{pos}}'(\mathbf{x})\,d^3x$ times an
expression with one less electron in initial and final state. Next it is
exchanged with $G_2$ producing an addition $-\int g_2^{*}(\mathbf{x})\,f_{1\mathrm{pos}}'(\mathbf{x})\,d^3x$
times a similar term, etc. Finally on reaching the $Q_1^{*}$ with which
it anticommutes it can be simply moved over to juxtaposition
with $\chi_0^{*}$ where it gives zero. The second term is similarly handled
by moving $F_{1\mathrm{neg}}''^{*}$ through anticommuting $F_2^{*}$, etc., until it
reaches $P_1$. Then it is exchanged with $P_1$ to produce an additional
simpler term with a factor $\mp\int p_1^{*}(\mathbf{x})\,f_{1\mathrm{neg}}''(\mathbf{x})\,d^3x$ or
$\mp\int\!\!\int p_1^{*}(\mathbf{x}_2)\,K_+^{(A)}(2,1)\,\beta f_1(\mathbf{x}_1)\,d^3x_1\,d^3x_2$ from (49), with
$t_2 = t_1 = 0$ (the extra $f_1(\mathbf{x}_2)$ in (49) gives zero as it is orthogonal
to $p_1(\mathbf{x}_2)$). This describes in the expected manner the annihilation
of the pair, electron $f_1$, positron $p_1$. The $F_{\mathrm{neg}}''^{*}$ is moved in this
way successively through the P's until it gives zero when acting on
$\chi_0$. Thus R is reduced, with the expected factors (and with
alternating signs as required by the exclusion principle), to simpler
terms containing two less operators which may in turn be further
reduced by using $F_2^{*}$ in a similar manner, etc. After all the $F^{*}$ are
used the $Q^{*}$'s can be reduced in a similar manner. They are moved
through the S in the opposite direction in such a manner as to
produce a purely negative energy operator at time 0, using relations
analogous to (46) to (49). After all this is done we are left simply
with the expected factor times $C_v$ (assuming the net charge is the
same in initial and final state).

In this way we have written the solution to the general problem
of the motion of electrons in given potentials. The factor $C_v$ is
obtained by normalization. However for photon fields it is desirable
to have an explicit form for $C_v$ in terms of the potentials.
This is given by (30) and (29) and it is readily demonstrated that
this also is correct according to second quantization.


b. Analysis of the Vacuum Problem 


We shall calculate $C_v$ from second quantization by induction
considering a series of problems each containing a potential
distribution more nearly like the one we wish. Suppose we know $C_v$
for a problem like the one we want and having the same potentials
for time t between some $t_0$ and T, but having potential zero for
times from 0 to $t_0$. Call this $C_v(t_0)$, the corresponding Hamiltonian
$H_{t_0}$ and the sum of contributions for all single loops, $L(t_0)$. Then
for $t_0 = T$ we have zero potential at all times, no pairs can be
produced, $L(T) = 0$ and $C_v(T) = 1$. For $t_0 = 0$ we have the complete
problem, so that $C_v(0)$ is what is defined as $C_v$ in (38).
Generally we have,

$$C_v(t_0) = \Bigl(\chi_0^{*} \exp\Bigl(-i\int_0^T H_{t_0}\,dt\Bigr) \chi_0\Bigr)
 = \Bigl(\chi_0^{*} \exp\Bigl(-i\int_{t_0}^T H_{t_0}\,dt\Bigr) \chi_0\Bigr),$$

since $H_{t_0}$ is identical to the constant vacuum Hamiltonian $H_T$ for
$t < t_0$ and $\chi_0$ is an eigenfunction of $H_T$ with an eigenvalue (energy
of vacuum) which we can take as zero.

The value of $C_v(t_0 - \Delta t_0)$ arises from the Hamiltonian $H_{t_0 - \Delta t_0}$
which differs from $H_{t_0}$ just by having an extra potential during
the short interval $\Delta t_0$. Hence, to first order in $\Delta t_0$, we have

$$C_v(t_0 - \Delta t_0) = \Bigl(\chi_0^{*} \exp\Bigl(-i\int_{t_0 - \Delta t_0}^T H_{t_0 - \Delta t_0}\,dt\Bigr) \chi_0\Bigr)$$
$$= \Bigl(\chi_0^{*} \exp\Bigl(-i\int_{t_0}^T H_{t_0}\,dt\Bigr) \Bigl[1 - i\Delta t_0 \int \Psi^{*}(\mathbf{x})\,(-\boldsymbol{\alpha}\cdot\mathbf{A}(\mathbf{x},t_0) + A_4(\mathbf{x},t_0))\,\Psi(\mathbf{x})\,d^3x\Bigr] \chi_0\Bigr);$$

we therefore obtain for the derivative of $C_v$ the expression

$$-dC_v(t_0)/dt_0 = -i\Bigl(\chi_0^{*} \exp\Bigl(-i\int_{t_0}^T H_{t_0}\,dt\Bigr) \int \Psi^{*}(\mathbf{x})\,\beta A(\mathbf{x},t_0)\,\Psi(\mathbf{x})\,d^3x\; \chi_0\Bigr), \tag{51}$$

which will be reduced to a simple factor times $C_v(t_0)$ by methods
analogous to those used in reducing R. The operator $\Psi$ can be
imagined to be split into two pieces $\Psi_{\mathrm{pos}}$ and $\Psi_{\mathrm{neg}}$ operating on
positive and negative energy states respectively. The $\Psi_{\mathrm{pos}}$ on $\chi_0$
gives zero so we are left with two terms in the current density,
$\Psi_{\mathrm{pos}}^{*}\beta A\Psi_{\mathrm{neg}}$ and $\Psi_{\mathrm{neg}}^{*}\beta A\Psi_{\mathrm{neg}}$. The latter $\Psi_{\mathrm{neg}}^{*}\beta A\Psi_{\mathrm{neg}}$ is just
the expectation value of $\beta A$ taken over all negative energy states
(minus $\Psi_{\mathrm{neg}}\beta A\Psi_{\mathrm{neg}}^{*}$ which gives zero acting on $\chi_0$). This is the
effect of the vacuum expectation current of the electrons in the
sea which we should have subtracted from our original Hamiltonian
in the customary way.

The remaining term $\Psi_{\mathrm{pos}}^{*}\beta A\Psi_{\mathrm{neg}}$, or its equivalent $\Psi_{\mathrm{pos}}^{*}\beta A\Psi$,
can be considered as $\Psi^{*}(\mathbf{x})\,f_{\mathrm{pos}}(\mathbf{x})$ where $f_{\mathrm{pos}}(\mathbf{x})$ is written for the
positive energy component of the operator $\beta A\Psi(\mathbf{x})$. Now this
operator, $\Psi^{*}(\mathbf{x})\,f_{\mathrm{pos}}(\mathbf{x})$, or more precisely just the $\Psi^{*}(\mathbf{x})$ part of it,
can be pushed through the $\exp(-i\int_{t_0}^T H\,dt)$ in a manner exactly
analogous to (47) when f is a function. (An alternative derivation
results from the consideration that the operator $\Psi(\mathbf{x},t)$ which
satisfies the Dirac equation also satisfies the linear integral
equations which are equivalent to it.) That is, (51) can be written
by (48), (50),

$$-dC_v(t_0)/dt_0 = -i\Bigl(\chi_0^{*} \int\!\!\int \Psi^{*}(\mathbf{x}_2)\,K_+^{(A)}(2,1)\, \exp\Bigl(-i\int_{t_0}^T H\,dt\Bigr) A(1)\,\Psi(\mathbf{x}_1)\,d^3x_1\,d^3x_2\; \chi_0\Bigr)$$
$$\qquad + i\Bigl(\chi_0^{*} \exp\Bigl(-i\int_{t_0}^T H\,dt\Bigr) \int\!\!\int \Psi^{*}(\mathbf{x}_2)\,[K_+^{(A)}(2,1) - K_+(2,1)]\,A(1)\,\Psi(\mathbf{x}_1)\,d^3x_1\,d^3x_2\; \chi_0\Bigr),$$

where in the first term $t_2 = T$, and in the second $t_2 \to t_0 = t_1$. The
(A) in $K_+^{(A)}$ refers to that part of the potential A after $t_0$. The
first term vanishes for it involves (from the $K_+^{(A)}(2,1)$) only
positive energy components of $\Psi^{*}$, which give zero operating into
$\chi_0^{*}$. In the second term only negative components of $\Psi^{*}(\mathbf{x}_2)$
appear. If, then, $\Psi^{*}(\mathbf{x}_2)$ is interchanged in order with $\Psi(\mathbf{x}_1)$ it will
give zero operating on $\chi_0$, and only the term,

$$-dC_v(t_0)/dt_0 = +i\int \mathrm{Sp}[(K_+^{(A)}(1,1) - K_+(1,1))\,A(1)]\,d^3x_1 \cdot C_v(t_0), \tag{52}$$

will remain, from the usual commutation relation of $\Psi^{*}$ and $\Psi$.

The factor of $C_v(t_0)$ in (52) times $-\Delta t_0$ is, according to (29)
(reference 10), just $L(t_0 - \Delta t_0) - L(t_0)$ since this difference arises
from the extra potential $\Delta A = A$ during the short time interval
$\Delta t_0$. Hence $-dC_v(t_0)/dt_0 = +(dL(t_0)/dt_0)\,C_v(t_0)$ so that integration
from $t_0 = T$ to $t_0 = 0$ establishes (30).
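
The integration in this last step can be written out explicitly. The following is a minimal sketch, assuming only the boundary values $C_v(T) = 1$ and $L(T) = 0$ stated above; equation (30) itself is not reproduced in this appendix, so the final line should be read as the form the text says (30) takes, not as an independent derivation of it.

```latex
\begin{aligned}
-\frac{dC_v(t_0)}{dt_0} &= \frac{dL(t_0)}{dt_0}\, C_v(t_0)
\;\Longrightarrow\;
\frac{d}{dt_0}\ln C_v(t_0) = -\frac{dL(t_0)}{dt_0}, \\
\ln C_v(t_0) - \ln C_v(T) &= -\bigl[L(t_0) - L(T)\bigr]
\;\Longrightarrow\;
C_v(t_0) = \exp\bigl(-L(t_0)\bigr), \\
C_v &= C_v(0) = \exp\bigl(-L(0)\bigr).
\end{aligned}
```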

Starting from the theory of the electromagnetic field in second 
quantization, a deduction of the equations for quantum electro- 
dynamics which appear in the succeeding paper may be worked 
out using very similar principles. The Pauli-Weisskopf theory of 
the Klein-Gordon equation can apparently be analyzed in essentially
the same way as that used here for Dirac electrons.

In this paper two things are done. (1) It is shown that a
considerable simplification can be attained in writing down matrix
elements for complex processes in electrodynamics. Further, a
physical point of view is available which permits them to be
written down directly for any specific problem. Being simply a
restatement of conventional electrodynamics, however, the matrix
elements diverge for complex processes. (2) Electrodynamics is
modified by altering the interaction of electrons at short distances.
All matrix elements are now finite, with the exception of those
relating to problems of vacuum polarization. The latter are
evaluated in a manner suggested by Pauli and Bethe, which gives
finite results for these matrices also. The only effects sensitive to
the modification are changes in mass and charge of the electrons.
Such changes could not be directly observed. Phenomena directly
observable are insensitive to the details of the modification used
(except at extreme energies). For such phenomena, a limit can
be taken as the range of the modification goes to zero. The results
then agree with those of Schwinger. A complete, unambiguous,
and presumably consistent, method is therefore available for the
calculation of all processes involving electrons and photons.

The simplification in writing the expressions results from an
emphasis on the over-all space-time view resulting from a study
of the solution of the equations of electrodynamics. The relation
of this to the more conventional Hamiltonian point of view is
discussed. It would be very difficult to make the modification
which is proposed if one insisted on having the equations in
Hamiltonian form.

The methods apply as well to charges obeying the Klein-Gordon
equation, and to the various meson theories of nuclear forces.
Illustrative examples are given. Although a modification like that
used in electrodynamics can make all matrices finite for all of the
meson theories, for some of the theories it is no longer true that
all directly observable phenomena are insensitive to the details of
the modification used.

The actual evaluation of integrals appearing in the matrix 
elements may be facilitated, in the simpler cases, by methods 
described in the appendix. 


THIS paper should be considered as a direct continuation
of a preceding one¹ (I) in which the motion of
electrons, neglecting interaction, was analyzed, by
dealing directly with the solution of the Hamiltonian
differential equations. Here the same technique is
applied to include interactions and in that way
to express in simple terms the solution of problems in
quantum electrodynamics.

For most practical calculations in quantum electrodynamics
the solution is ordinarily expressed in terms
of a matrix element. The matrix is worked out as an
expansion in powers of $e^2/\hbar c$, the successive terms
corresponding to the inclusion of an increasing number of
virtual quanta. It appears that a considerable simplification
can be achieved in writing down these matrix
elements for complex processes. Furthermore, each term
in the expansion can be written down and understood
directly from a physical point of view, similar to the
space-time view in I. It is the purpose of this paper to
describe how this may be done. We shall also discuss
methods of handling the divergent integrals which
appear in these matrix elements.

The simplification in the formulae results mainly from
the fact that previous methods unnecessarily separated
into individual terms processes that were closely related
physically. For example, in the exchange of a quantum
between two electrons there were two terms depending
on which electron emitted and which absorbed the
quantum. Yet, in the virtual states considered, timing
relations are not significant. Only the order of operators
in the matrix must be maintained. We have seen (I),
that in addition, processes in which virtual pairs are
produced can be combined with others in which only
positive energy electrons are involved. Further, the
effects of longitudinal and transverse waves can be
combined together. The separations previously made
were on an unrelativistic basis (reflected in the
circumstance that apparently momentum but not energy is
conserved in intermediate states). When the terms are
combined and simplified, the relativistic invariance of
the result is self-evident.

¹ R. P. Feynman, Phys. Rev. 76, 749 (1949), hereafter called I.

We begin by discussing the solution in space and time
of the Schrödinger equation for particles interacting
instantaneously. The results are immediately generalizable
to delayed interactions of relativistic electrons
and we represent in that way the laws of quantum
electrodynamics. We can then see how the matrix element
for any process can be written down directly. In
particular, the self-energy expression is written down.

So far, nothing has been done other than a restatement
of conventional electrodynamics in other terms.
Therefore, the self-energy diverges. A modification² in
interaction between charges is next made, and it is
shown that the self-energy is made convergent and
corresponds to a correction to the electron mass. After
the mass correction is made, other real processes are
finite and insensitive to the “width” of the cut-off in
the interaction.³

Unfortunately, the modification proposed is not completely
satisfactory theoretically (it leads to some difficulties
of conservation of energy). It does, however,
seem consistent and satisfactory to define the matrix
element for all real processes as the limit of that computed
here as the cut-off width goes to zero. A similar
technique suggested by Pauli and by Bethe can be
applied to problems of vacuum polarization (resulting
in a renormalization of charge) but again a strict
physical basis for the rules of convergence is not known.

² For a discussion of this modification in classical physics see
R. P. Feynman, Phys. Rev. 74, 939 (1948), hereafter referred
to as A.

³ A brief summary of the methods and results will be found in
R. P. Feynman, Phys. Rev. 74, 1430 (1948), hereafter referred
to as B.

After mass and charge renormalization, the limit of
zero cut-off width can be taken for all real processes.
The results are then equivalent to those of Schwinger⁴
who does not make explicit use of the convergence factors.
The method of Schwinger is to identify the terms
corresponding to corrections in mass and charge and,
previous to their evaluation, to remove them from the
expressions for real processes. This has the advantage
of showing that the results can be strictly independent
of particular cut-off methods. On the other hand, many
of the properties of the integrals are analyzed using
formal properties of invariant propagation functions.
But one of the properties is that the integrals are infinite
and it is not clear to what extent this invalidates the
demonstrations. A practical advantage of the present
method is that ambiguities can be more easily resolved;
simply by direct calculation of the otherwise divergent
integrals. Nevertheless, it is not at all clear that the
convergence factors do not upset the physical consistency
of the theory. Although in the limit the two
methods agree, neither method appears to be thoroughly
satisfactory theoretically. Nevertheless, it does appear
that we now have available a complete and definite
method for the calculation of physical processes to any
order in quantum electrodynamics.

Since we can write down the solution to any physical
problem, we have a complete theory which could stand
by itself. It will be theoretically incomplete, however,
in two respects. First, although each term of increasing
order in $e^2/\hbar c$ can be written down it would be desirable
to see some way of expressing things in finite form to
all orders in $e^2/\hbar c$ at once. Second, although it will be
physically evident that the results obtained are equivalent
to those obtained by conventional electrodynamics
the mathematical proof of this is not included. Both of
these limitations will be removed in a subsequent paper
(see also Dyson⁴).

Briefly the genesis of this theory was this. The conventional
electrodynamics was expressed in the Lagrangian
form of quantum mechanics described in the
Reviews of Modern Physics.⁵ The motion of the field
oscillators could be integrated out (as described in Section
13 of that paper), the result being an expression of
the delayed interaction of the particles. Next the modification
of the delta-function interaction could be made
directly from the analogy to the classical case.² This
was still not complete because the Lagrangian method
had been worked out in detail only for particles obeying
the non-relativistic Schrödinger equation. It was then
modified in accordance with the requirements of the
Dirac equation and the phenomenon of pair creation.
This was made easier by the reinterpretation of the
theory of holes (I). Finally for practical calculations the
expressions were developed in a power series in $e^2/\hbar c$. It
was apparent that each term in the series had a simple
physical interpretation. Since the result was easier to
understand than the derivation, it was thought best to
publish the results first in this paper. Considerable time
has been spent to make these first two papers as complete
and as physically plausible as possible without
relying on the Lagrangian method, because it is not
generally familiar. It is realized that such a description
cannot carry the conviction of truth which would accompany
the derivation. On the other hand, in the
interest of keeping simple things simple the derivation
will appear in a separate paper.

⁴ J. Schwinger, Phys. Rev. 74, 1439 (1948), Phys. Rev. 75, 651
(1949). A proof of this equivalence is given by F. J. Dyson, Phys.
Rev. 75, 486 (1949).

⁵ R. P. Feynman, Rev. Mod. Phys. 20, 367 (1948). The application
to electrodynamics is described in detail by H. J. Groenewold,
Koninklijke Nederlandsche Akademie van Wetenschappen,
Proceedings Vol. LII, 3 (226) 1949.

The possible application of these methods to the
various meson theories is discussed briefly. The formulas
corresponding to a charged particle of zero spin
moving in accordance with the Klein-Gordon equation
are also given. In an Appendix a method is given for
calculating the integrals appearing in the matrix elements
for the simpler processes.

The point of view which is taken here of the interaction
of charges differs from the more usual point of
view of field theory. Furthermore, the familiar Hamiltonian
form of quantum mechanics must be compared
to the over-all space-time view used here. The first
section is, therefore, devoted to a discussion of the
relations of these viewpoints.


1. COMPARISON WITH THE HAMILTONIAN 
METHOD 


Electrodynamics can be looked upon in two equivalent
and complementary ways. One is as the description
of the behavior of a field (Maxwell’s equations). The
other is as a description of a direct interaction at a
distance (albeit delayed in time) between charges (the
solutions of Lienard and Wiechert). From the latter
point of view light is considered as an interaction of the
charges in the source with those in the absorber. This is
an impractical point of view because many kinds of
sources produce the same kind of effects. The field point
of view separates these aspects into two simpler problems,
production of light, and absorption of light. On
the other hand, the field point of view is less practical
when dealing with close collisions of particles (or their
action on themselves). For here the source and absorber
are not readily distinguishable, there is an intimate
exchange of quanta. The fields are so closely determined
by the motions of the particles that it is just as well not
to separate the question into two problems but to consider
the process as a direct interaction. Roughly, the
field point of view is most practical for problems involving
real quanta, while the interaction view is best for
the discussion of the virtual quanta involved. We shall
emphasize the interaction viewpoint in this paper, first
because it is less familiar and therefore requires more
discussion, and second because the important aspect in
the problems with which we shall deal is the effect of
virtual quanta.

The Hamiltonian method is not well adapted to 
represent the direct action at a distance between charges 
because that action is delayed. The Hamiltonian method 
represents the future as developing out of the present. 
If the values of a complete set of quantities are known 
now, their values can be computed at the next instant 
in time. If particles interact through a delayed inter- 
action, however, one cannot predict the future by 
simply knowing the present motion of the particles. 
One would also have to know what the motions of the 
particles were in the past in view of the interaction this 
may have on the future motions. This is done in the 
Hamiltonian electrodynamics, of course, by requiring 
that one specify besides the present motion of the 
particles, the values of a host of new variables (the 
coordinates of the field oscillators) to keep track of that 
aspect of the past motions of the particles which de- 
termines their future behavior. The use of the Hamil- 
tonian forces one to choose the field viewpoint rather 
than the interaction viewpoint. 

In many problems, for example, the close collisions
of particles, we are not interested in the precise temporal
sequence of events. It is not of interest to be able
to say how the situation would look at each instant of
time during a collision and how it progresses from instant
to instant. Such ideas are only useful for events
taking a long time and for which we can readily obtain
information during the intervening period. For collisions
it is much easier to treat the process as a whole.⁶ The
Møller interaction matrix for the collision of two electrons
is not essentially more complicated than the
non-relativistic Rutherford formula, yet the mathematical
machinery used to obtain the former from quantum
electrodynamics is vastly more complicated than
Schrödinger’s equation with the $e^2/r_{12}$ interaction
needed to obtain the latter. The difference is only that
in the latter the action is instantaneous so that the
Hamiltonian method requires no extra variables, while
in the former relativistic case it is delayed and the
Hamiltonian method is very cumbersome.

We shall be discussing the solutions of equations
rather than the time differential equations from which
they come. We shall discover that the solutions, because
of the over-all space-time view that they permit, are as
easy to understand when interactions are delayed as
when they are instantaneous.

As a further point, relativistic invariance will be
self-evident. The Hamiltonian form of the equations
develops the future from the instantaneous present. But
for different observers in relative motion the instantaneous
present is different, and corresponds to a
different 3-dimensional cut of space-time. Thus the
temporal analyses of different observers are different and
their Hamiltonian equations are developing the process
in different ways. These differences are irrelevant, however,
for the solution is the same in any space-time
frame. By forsaking the Hamiltonian method, the
wedding of relativity and quantum mechanics can be
accomplished most naturally.

⁶ This is the viewpoint of the theory of the S matrix of Heisenberg.

We illustrate these points in the next section by
studying the solution of Schrödinger’s equation for
non-relativistic particles interacting by an instantaneous
Coulomb potential (Eq. 2). When the solution is modified
to include the effects of delay in the interaction
and the relativistic properties of the electrons we obtain
an expression of the laws of quantum electrodynamics
(Eq. 4).


2. THE INTERACTION BETWEEN CHARGES


We study by the same methods as in I, the interaction
of two particles using the same notation as I. We start
by considering the non-relativistic case described by the
Schrödinger equation (I, Eq. 1). The wave function at
a given time is a function $\psi(\mathbf{x}_a, \mathbf{x}_b, t)$ of the coordinates
$\mathbf{x}_a$ and $\mathbf{x}_b$ of each particle. Thus call
$K(\mathbf{x}_a, \mathbf{x}_b, t;\ \mathbf{x}_a', \mathbf{x}_b', t')$
the amplitude that particle a at $\mathbf{x}_a'$ at time $t'$ will get
to $\mathbf{x}_a$ at t while particle b at $\mathbf{x}_b'$ at $t'$ gets to $\mathbf{x}_b$ at t. If the
particles are free and do not interact this is

$$K(\mathbf{x}_a, \mathbf{x}_b, t;\ \mathbf{x}_a', \mathbf{x}_b', t') = K_{0a}(\mathbf{x}_a, t;\ \mathbf{x}_a', t')\,K_{0b}(\mathbf{x}_b, t;\ \mathbf{x}_b', t')$$

where $K_{0a}$ is the $K_0$ function for particle a considered
as free. In this case we can obviously define a quantity
like K, but for which the time t need not be the same
for particles a and b (likewise for $t'$); e.g.,

$$K_0(3, 4;\ 1, 2) = K_{0a}(3, 1)\,K_{0b}(4, 2) \tag{1}$$

can be thought of as the amplitude that particle a goes
from $\mathbf{x}_1$ at $t_1$ to $\mathbf{x}_3$ at $t_3$ and that particle b goes from $\mathbf{x}_2$
at $t_2$ to $\mathbf{x}_4$ at $t_4$.
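
The factorization in Eq. (1) can be tried out numerically. The sketch below is only illustrative: it assumes one space dimension, units with ħ = 1, and the standard closed form of the free non-relativistic kernel; the function names `k0`, `k0_pair` and `schrodinger_residual` are hypothetical and do not come from the paper. It builds the two-particle free amplitude as a product of one-particle kernels with independent times, and checks by finite differences that each factor satisfies the free Schrödinger equation.

```python
import cmath


def k0(x2, t2, x1, t1, m=1.0):
    """Free non-relativistic kernel in one space dimension (hbar = 1).

    Amplitude to go from (x1, t1) to (x2, t2); taken as zero for t2 <= t1.
    The closed form used here is an assumption of this sketch.
    """
    dt = t2 - t1
    if dt <= 0:
        return 0j
    norm = cmath.sqrt(m / (2j * cmath.pi * dt))
    return norm * cmath.exp(1j * m * (x2 - x1) ** 2 / (2 * dt))


def k0_pair(p3, p4, p1, p2, m_a=1.0, m_b=1.0):
    """Eq. (1): K0(3,4;1,2) = K0a(3,1) * K0b(4,2); each point is (x, t)."""
    return k0(*p3, *p1, m=m_a) * k0(*p4, *p2, m=m_b)


def schrodinger_residual(x, t, m=1.0, h=1e-3):
    """|i dK/dt + (1/2m) d2K/dx2| for K0(x, t; 0, 0), by central differences."""
    dk_dt = (k0(x, t + h, 0.0, 0.0, m) - k0(x, t - h, 0.0, 0.0, m)) / (2 * h)
    d2k_dx2 = (k0(x + h, t, 0.0, 0.0, m) - 2 * k0(x, t, 0.0, 0.0, m)
               + k0(x - h, t, 0.0, 0.0, m)) / h ** 2
    return abs(1j * dk_dt + d2k_dx2 / (2 * m))
```

For instance, `k0_pair((0.5, 2.0), (-0.3, 1.5), (0.0, 0.0), (0.1, 0.2))` evaluates the amplitude with different initial and final times for the two particles, which is the freedom Eq. (1) is meant to express.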

When the particles do interact, one can only define
the quantity K(3, 4; 1, 2) precisely if the interaction
vanishes between $t_1$ and $t_2$ and also between $t_3$ and $t_4$.
In a real physical system such is not the case. There is
such an enormous advantage, however, to the concept
that we shall continue to use it, imagining that we can
neglect the effect of interactions between $t_1$ and $t_2$ and
between $t_3$ and $t_4$. For practical problems this means
choosing such long time intervals $t_3 - t_1$ and $t_4 - t_2$ that
the extra interactions near the end points have small
relative effects. As an example, in a scattering problem
it may well be that the particles are so well separated
initially and finally that the interaction at these times
is negligible. Again energy values can be defined by the
average rate of change of phase over such long time
intervals that errors initially and finally can be neglected.
Inasmuch as any physical problem can be defined
in terms of scattering processes we do not lose much in
a general theoretical sense by this approximation. If it
is not made it is not easy to study interacting particles
relativistically, for there is nothing significant in choosing
$t_1 = t_2$ if $\mathbf{x}_1 \neq \mathbf{x}_2$, as absolute simultaneity of events
at a distance cannot be defined invariantly. It is essentially
to avoid this approximation that the complicated
structure of the older quantum electrodynamics has
been built up. We wish to describe electrodynamics as
a delayed interaction between particles. If we can make
the approximation of assuming a meaning to K(3, 4; 1, 2)
the results of this interaction can be expressed very
simply.

Fig. 1. The fundamental interaction Eq. (4). Exchange of one
quantum between two electrons.

To see how this may be done, imagine first that the
interaction is simply that given by a Coulomb potential
$e^2/r$ where r is the distance between the particles. If this
be turned on only for a very short time $\Delta t_0$ at time $t_0$,
the first order correction to K(3, 4; 1, 2) can be worked
out exactly as was Eq. (9) of I by an obvious generalization
to two particles:

$$K^{(1)}(3, 4;\ 1, 2) = -ie^2 \int\!\!\int K_{0a}(3, 5)\,K_{0b}(4, 6)\,r_{56}^{-1}\,K_{0a}(5, 1)\,K_{0b}(6, 2)\,d^3x_5\,d^3x_6\,\Delta t_0,$$

where $t_5 = t_6 = t_0$. If now the potential were on at all
times (so that strictly K is not defined unless $t_4 = t_3$ and
$t_1 = t_2$), the first-order effect is obtained by integrating
on $t_0$, which we can write as an integral over both $t_5$
and $t_6$ if we include a delta-function $\delta(t_5 - t_6)$ to insure
contribution only when $t_5 = t_6$. Hence, the first-order
effect of interaction is (calling $t_5 - t_6 = t_{56}$):

$$K^{(1)}(3, 4;\ 1, 2) = -ie^2 \int\!\!\int K_{0a}(3, 5)\,K_{0b}(4, 6)\,r_{56}^{-1}\,\delta(t_{56})\,K_{0a}(5, 1)\,K_{0b}(6, 2)\,d\tau_5\,d\tau_6, \tag{2}$$

where $d\tau = d^3x\,dt$.
We know, however, in classical electrodynamics, that
the Coulomb potential does not act instantaneously,
but is delayed by a time $r_{56}$, taking the speed of light
as unity. This suggests simply replacing $r_{56}^{-1}\delta(t_{56})$ in
(2) by something like $r_{56}^{-1}\delta(t_{56} - r_{56})$ to represent the
delay in the effect of b on a.⁷




This turns out to be not quite right, for when this
interaction is represented by photons they must be of
only positive energy, while the Fourier transform of
$\delta(t_{56} - r_{56})$ contains frequencies of both signs. It should
instead be replaced by $\delta_+(t_{56} - r_{56})$ where

$$\delta_+(x) = \int_0^\infty e^{-i\omega x}\,d\omega/\pi = \lim_{\epsilon \to 0} \frac{(\pi i)^{-1}}{x - i\epsilon} = \delta(x) + (\pi i x)^{-1}. \tag{3}$$
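
The two forms of $\delta_+$ in (3) can be connected by a short calculation. The sketch below assumes the usual convergence factor $e^{-\epsilon\omega}$ (with $\epsilon > 0$) to give the integral a meaning, and uses the standard decomposition of $1/(x - i\epsilon)$ into a principal value and a delta function; neither step is spelled out in the text, so both are assumptions of this illustration.

```latex
\begin{aligned}
\delta_+(x)
  &= \lim_{\epsilon\to 0}\frac{1}{\pi}\int_0^\infty e^{-i\omega x-\epsilon\omega}\,d\omega
   = \lim_{\epsilon\to 0}\frac{1}{\pi}\,\frac{1}{\epsilon+ix}
   = \lim_{\epsilon\to 0}\frac{(\pi i)^{-1}}{x-i\epsilon}, \\
\frac{1}{x-i\epsilon}
  &\;\longrightarrow\; \mathrm{P}\,\frac{1}{x} + i\pi\,\delta(x)
  \quad\Longrightarrow\quad
  \delta_+(x) = \delta(x) + (\pi i x)^{-1}.
\end{aligned}
```

Only positive frequencies $\omega$ enter the integral, which is the property the text requires of the photon exchange.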


This is to be averaged with $r_{56}^{-1}\delta_+(-t_{56} - r_{56})$ which
arises when $t_5 < t_6$ and corresponds to a emitting the
quantum which b receives. Since

$$(2r)^{-1}(\delta_+(t - r) + \delta_+(-t - r)) = \delta_+(t^2 - r^2),$$

this means $r_{56}^{-1}\delta(t_{56})$ is replaced by $\delta_+(s_{56}^2)$ where
$s_{56}^2 = t_{56}^2 - r_{56}^2$ is the square of the relativistically
invariant interval between points 5 and 6. Since in
classical electrodynamics there is also an interaction
through the vector potential, the complete interaction
(see A, Eq. (1)) should be $(1 - (\mathbf{v}_5 \cdot \mathbf{v}_6))\,\delta_+(s_{56}^2)$, or in the
relativistic case,

$$(1 - \boldsymbol{\alpha}_a \cdot \boldsymbol{\alpha}_b)\,\delta_+(s_{56}^2) = \beta_a\beta_b\,\gamma_{a\mu}\gamma_{b\mu}\,\delta_+(s_{56}^2).$$

Hence we have for electrons obeying the Dirac equation,

$$K^{(1)}(3, 4;\ 1, 2) = -ie^2 \int\!\!\int K_{+a}(3, 5)\,K_{+b}(4, 6)\,\gamma_{a\mu}\gamma_{b\mu}\,\delta_+(s_{56}^2)\,K_{+a}(5, 1)\,K_{+b}(6, 2)\,d\tau_5\,d\tau_6, \tag{4}$$

where $\gamma_{a\mu}$ and $\gamma_{b\mu}$ are the Dirac matrices applying to
the spinor corresponding to particles a and b, respectively
(the factor $\beta_a\beta_b$ being absorbed in the definition,
I Eq. (17), of $K_+$).
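
The identity that turns the average of the two delayed terms into a function of the invariant interval can be checked pointwise with a regularized $\delta_+$. This is a minimal sketch, assuming the regularized form $(\pi i)^{-1}/(x - i\epsilon)$ from Eq. (3) with a small finite $\epsilon$; the helper names are hypothetical, and the regulator on the right-hand side is rescaled to $2r\epsilon$, which is an assumption made here so that the two sides agree up to terms of order $\epsilon$ (in the limit any positive infinitesimal gives the same distribution).

```python
import math


def delta_plus(x, eps):
    """Regularized delta_+(x) = (pi i)^{-1} / (x - i eps), as in Eq. (3)."""
    return 1.0 / (math.pi * 1j * (x - 1j * eps))


def averaged_retarded(t, r, eps):
    """(2r)^{-1} [delta_+(t - r) + delta_+(-t - r)], the averaged delayed form."""
    return (delta_plus(t - r, eps) + delta_plus(-t - r, eps)) / (2.0 * r)


def invariant_form(t, r, eps):
    """delta_+(t^2 - r^2), with the regulator rescaled to 2 r eps (assumption)."""
    return delta_plus(t * t - r * r, 2.0 * r * eps)


def mismatch(t, r, eps=1e-9):
    """Absolute difference between the two sides at a point off the light cone."""
    return abs(averaged_retarded(t, r, eps) - invariant_form(t, r, eps))
```

Evaluating `mismatch` at a few points with $t^2 \neq r^2$ and decreasing `eps` shows the difference shrinking with `eps`, for both the real (delta-like) and imaginary (principal-value) parts at once.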

This is our fundamental equation for electrodynamics.
It describes the effect of exchange of one quantum
(therefore first order in $e^2$) between two electrons. It
will serve as a prototype enabling us to write down the
corresponding quantities involving the exchange of two
or more quanta between two electrons or the interaction
of an electron with itself. It is a consequence of conventional
electrodynamics. Relativistic invariance is
clear. Since one sums over $\mu$ it contains the effects of
both longitudinal and transverse waves in a relativistically
symmetrical way.

We shall now interpret Eq. (4) in a manner which 
will permit us to write down the higher order terms. It 
can be understood (see Fig. 1) as saying that the amplitude
for “a” to go from 1 to 3 and “b” to go from 2 to 4
is altered to first order because they can exchange a
quantum. Thus, “a” can go to 5 (amplitude $K_{+a}(5,1)$),
emit a quantum (longitudinal, transverse, or scalar
$\gamma_{a\mu}$) and then proceed to 3 ($K_{+a}(3,5)$). Meantime “b”
goes to 6 ($K_{+b}(6,2)$), absorbs the quantum ($\gamma_{b\mu}$) and
proceeds to 4 ($K_{+b}(4,6)$). The quantum meanwhile proceeds
from 5 to 6, which it does with amplitude $\delta_+(s_{56}^2)$.
We must sum over all the possible quantum polarizations
$\mu$ and positions and times of emission 5, and of
absorption 6. Actually if $t_5>t_6$ it would be better to
say that “a” absorbs and “b” emits but no attention
need be paid to these matters, as all such alternatives
are automatically contained in (4).

⁷ It, and a like term for the effect of a on b, leads to a theory
which, in the classical limit, exhibits interaction through half-
advanced and half-retarded potentials. Classically, this is equivalent
to purely retarded effects within a closed box from which
no light escapes (e.g., see A, or J. A. Wheeler and R. P. Feynman,
Rev. Mod. Phys. 17, 157 (1945)). Analogous theorems exist in
quantum mechanics but it would lead us too far astray to discuss
them now.

The correct terms of higher order in $e^2$ or involving
larger numbers of electrons (interacting with themselves
or in pairs) can be written down by the same kind of
reasoning. They will be illustrated by examples as we
proceed. In a succeeding paper they will all be deduced
from conventional quantum electrodynamics.

Calculation, from (4), of the transition element between
positive energy free electron states gives the
Møller scattering of two electrons, when account is
taken of the Pauli principle.

The exclusion principle for interacting charges is 
handled in exactly the same way as for non-interacting 
charges (I). For example, for two charges it requires 
only that one calculate $K(3,4;1,2)-K(4,3;1,2)$ to
get the net amplitude for arrival of charges at 3 and 4.
It is disregarded in intermediate states. The interference
effects for scattering of electrons by positrons
discussed by Bhabha will be seen to result directly in
this formulation. The formulas are interpreted to apply
to positrons in the manner discussed in I.

As our primary concern will be for processes in which 
the quanta are virtual we shall not include here the 
detailed analysis of processes involving real quanta in 
initial or final state, and shall content ourselves by only 
stating the rules applying to them.⁸ The result of the
analysis is, as expected, that they can be included by
the same line of reasoning as is used in discussing the
virtual processes, provided the quantities are normalized
in the usual manner to represent single quanta. For
example, the amplitude that an electron in going from 1
to 2 absorbs a quantum whose vector potential, suitably
normalized, is $c_\mu\exp(-ik\cdot x)=C_\mu(x)$ is just the expression
(I, Eq. (13)) for scattering in a potential with
$\not{A}(3)$ replaced by $\not{C}(3)$. Each quantum interacts only
once (either in emission or in absorption), terms like
(I, Eq. (14)) occur only when there is more than one
quantum involved. The Bose statistics of the quanta
can, in all cases, be disregarded in intermediate states.
The only effect of the statistics is to change the weight
of initial or final states. If there are among quanta, in
the initial state, some $n$ which are identical then the
weight of the state is $(1/n!)$ of what it would be if these
quanta were considered as different (similarly for the
final state).

⁸ Although in the expressions stemming from (4) the quanta are
virtual, this is not actually a theoretical limitation. One way to
deduce the correct rules for real quanta from (4) is to note that
in a closed system all quanta can be considered as virtual (i.e.,
they have a known source and are eventually absorbed) so that
in such a system the present description is complete and equivalent
to the conventional one. In particular, the relation of the
Einstein A and B coefficients can be deduced. A more practical
direct deduction of the expressions for real quanta will be given
in the subsequent paper. It might be noted that (4) can be rewritten
as describing the action on a,
$K^{(1)}(3,1)=i\int K_+(3,5)\not{A}(5)K_+(5,1)\,d\tau_5$, of the potential
$A_\mu(5)=e^2\int K_+(4,6)\,\delta_+(s_{56}^2)\,\gamma_\mu K_+(6,2)\,d\tau_6$
arising from Maxwell's equations $-\Box^2A_\mu=4\pi j_\mu$
from a “current” $j_\mu(6)=e^2K_+(4,6)\gamma_\mu K_+(6,2)$ produced by
particle b in going from 2 to 4. This is by virtue of the fact that $\delta_+$
satisfies

$$-\Box_2^2\,\delta_+(s_{21}^2) = 4\pi\,\delta(2,1). \tag{5}$$


3. THE SELF-ENERGY PROBLEM 


Having a term representing the mutual interaction 
of a pair of charges, we must include similar terms to 
represent the interaction of a charge with itself. For 
under some circumstances what appears to be two dis- 
tinct electrons may, according to I, be viewed also as 
a single electron (namely in case one electron was 
created in a pair with a positron destined to annihilate 
the other electron). Thus to the interaction between 
such electrons must correspond the possibility of the 
action of an electron on itself.⁹

This interaction is the heart of the self-energy problem.
Consider to first order in $e^2$ the action of an electron
on itself in an otherwise force free region. The amplitude
$K(2,1)$ for a single particle to get from 1 to 2 differs
from $K_+(2,1)$ to first order in $e^2$ by a term


$$K^{(1)}(2,1) = -ie^2\iint K_+(2,4)\gamma_\mu K_+(4,3)\gamma_\mu K_+(3,1)\,d\tau_3\,d\tau_4\,\delta_+(s_{43}^2). \tag{6}$$


It arises because the electron instead of going from 1
directly to 2, may go (Fig. 2) first to 3, ($K_+(3,1)$), emit
a quantum ($\gamma_\mu$), proceed to 4, ($K_+(4,3)$), absorb it
($\gamma_\mu$), and finally arrive at 2 ($K_+(2,4)$). The quantum
must go from 3 to 4 ($\delta_+(s_{43}^2)$).

Fig. 2. Interaction of an electron with itself, Eq. (6).

⁹ These considerations make it appear unlikely that the contention
of J. A. Wheeler and R. P. Feynman, Rev. Mod. Phys.
17, 157 (1945), that electrons do not act on themselves, will be a
successful concept in quantum electrodynamics.

This is related to the self-energy of a free electron in
the following manner. Suppose initially, time $t_1$, we have
an electron in state $f(1)$ which we imagine to be a positive
energy solution of Dirac's equation for a free particle.
After a long time $t_2-t_1$ the perturbation will alter
the wave function, which can then be looked upon as
a superposition of free particle solutions (actually it
only contains $f$). The amplitude that $g(2)$ is contained
is calculated as in (I, Eq. (21)). The diagonal element
($g=f$) is therefore

$$\iint \bar f(2)\,\beta K^{(1)}(2,1)\beta\,f(1)\,d^3x_1\,d^3x_2. \tag{7}$$


The time interval $T=t_2-t_1$ (and the spatial volume $V$
over which one integrates) must be taken very large,
for the expressions are only approximate (analogous to
the situation for two interacting charges). This is
because, for example, we are dealing incorrectly with
quanta emitted just before $t_2$ which would normally be
reabsorbed at times after $t_2$.

If $K^{(1)}(2,1)$ from (6) is actually substituted into (7)
the surface integrals can be performed as was done in
obtaining I, Eq. (22) resulting in


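$$-ie^2\iint \bar f(4)\gamma_\mu K_+(4,3)\gamma_\mu f(3)\,\delta_+(s_{43}^2)\,d\tau_3\,d\tau_4. \tag{8}$$

Putting for $f(1)$ the plane wave $u\exp(-ip\cdot x_1)$ where
$p_\mu$ is the energy ($p_4$) and momentum of the electron
($p^2=m^2$), and $u$ is a constant 4-index symbol, (8)
becomes

$$-ie^2\iint (\bar u\gamma_\mu K_+(4,3)\gamma_\mu u)\exp(ip\cdot(x_4-x_3))\,\delta_+(s_{43}^2)\,d\tau_3\,d\tau_4,$$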


the integrals extending over the volume $V$ and time
interval $T$. Since $K_+(4,3)$ depends only on the difference
of the coordinates of 4 and 3, $x_{43\mu}$, the integral on 4
gives a result (except near the surfaces of the region)
independent of 3. When integrated on 3, therefore, the
result is of order $VT$. The effect is proportional to $V$,
for the wave functions have been normalized to unit
volume. If normalized to volume $V$, the result would
simply be proportional to $T$. This is expected, for if the
effect were equivalent to a change in energy $\Delta E$, the
amplitude for arrival in $f$ at $t_2$ is altered by a factor
$\exp(-i\Delta E(t_2-t_1))$, or to first order by the difference
$-i(\Delta E)T$. Hence, we have

$$\Delta E = e^2\int (\bar u\gamma_\mu K_+(4,3)\gamma_\mu u)\exp(ip\cdot x_{43})\,\delta_+(s_{43}^2)\,d\tau_4, \tag{9}$$

integrated over all space-time $d\tau_4$. This expression will
be simplified presently. In interpreting (9) we have
tacitly assumed that the wave functions are normalized
so that $(u^*u)=(\bar u\gamma_4u)=1$. The equation may therefore
be made independent of the normalization by writing
the left side as $(\Delta E)(\bar u\gamma_4u)$, or since $(\bar u\gamma_4u)=(E/m)(\bar uu)$
and $m\Delta m=E\Delta E$, as $\Delta m(\bar uu)$ where $\Delta m$ is an equivalent
change in mass of the electron. In this form invariance
is obvious.

Fig. 3. Interaction of an electron with itself. Momentum space, Eq. (11).
(Diagram labels: electron momentum $p$; intermediate electron momentum $p-k$,
factor $(\not{p}-\not{k}-m)^{-1}$; quantum momentum $k$, factor $k^{-2}$; interaction $\gamma_\mu$.)

¹⁰ This is discussed in reference 5 in which it is pointed out that
the concept of a wave function loses accuracy if there are delayed
self-actions.
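
The relation $m\Delta m=E\Delta E$ used in passing from $\Delta E$ to $\Delta m$ is stated without derivation. A short sketch of where it comes from, assuming the three-momentum $\mathbf{p}$ is held fixed while the mass is varied (an assumption made here for illustration):

```latex
% Assumption: the three-momentum \mathbf{p} is held fixed while m is varied.
\begin{align*}
E^2 = \mathbf{p}^2+m^2
 \quad&\Longrightarrow\quad 2E\,\Delta E = 2m\,\Delta m
 \quad\Longrightarrow\quad \Delta E = (m/E)\,\Delta m, \\
(\Delta E)(\bar u\gamma_4 u)
 &= \frac{m}{E}\,\Delta m\cdot\frac{E}{m}\,(\bar u u) = \Delta m\,(\bar u u).
\end{align*}
```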

One can likewise obtain an expression for the energy
shift for an electron in a hydrogen atom. Simply replace
$K_+$ in (8), by $K_+^{(V)}$, the exact kernel for an electron in
the potential, $V=\beta e^2/r$, of the atom, and $f$ by a wave
function (of space and time) for an atomic state. In
general the $\Delta E$ which results is not real. The imaginary
part is negative and in $\exp(-i\Delta ET)$ produces an exponentially
decreasing amplitude with time. This is
because we are asking for the amplitude that an atom
initially with no photon in the field, will still appear
after time $T$ with no photon. If the atom is in a state
which can radiate, this amplitude must decay with
time. The imaginary part of $\Delta E$ when calculated does
indeed give the correct rate of radiation from atomic
states. It is zero for the ground state and for a free
electron.
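
How a negative imaginary part of $\Delta E$ turns into exponential decay can be illustrated numerically. The sketch below is only an illustration: the function names and the sample value of the shift are hypothetical, and units with $\hbar=1$ are assumed.

```python
import cmath
import math


def survival_probability(delta_e: complex, t: float) -> float:
    """Return |exp(-i * delta_e * t)|**2 for a complex energy shift (hbar = 1)."""
    amplitude = cmath.exp(-1j * delta_e * t)
    return abs(amplitude) ** 2


def decay_rate(delta_e: complex) -> float:
    """Rate Gamma such that the probability falls as exp(-Gamma * t)."""
    return -2.0 * delta_e.imag


# Hypothetical shift, chosen only for illustration: the negative
# imaginary part makes the no-photon amplitude decay in time.
shift = complex(0.3, -0.05)
print(survival_probability(shift, 10.0), math.exp(-decay_rate(shift) * 10.0))
```

The two printed numbers agree because $|\exp(-i\Delta E\,T)|^2=\exp(2\,\mathrm{Im}\,\Delta E\;T)$; a real $\Delta E$ leaves the probability at unity.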

In the non-relativistic region the expression for $\Delta E$
can be worked out as has been done by Bethe.¹¹ In the
relativistic region (points 4 and 3 as close together as a
Compton wave-length) the $K_+^{(V)}$ which should appear
in (8) can be replaced to first order in $V$ by $K_+$ plus
$K_+^{(1)}(2,1)$ given in I, Eq. (13). The problem is then
very similar to the radiationless scattering problem
discussed below.


4. EXPRESSION IN MOMENTUM AND 
ENERGY SPACE 
The evaluation of (9), as well as all the other more 
complicated expressions arising in these problems, is 
very much simplified by working in the momentum and 
energy variables, rather than space and time. For this 
we shall need the Fourier transform of $\delta_+(s_{21}^2)$ which is

$$-\delta_+(s_{21}^2) = \pi^{-1}\int \exp(-ik\cdot x_{21})\,k^{-2}\,d^4k, \tag{10}$$

which can be obtained from (3) and (5) or from I,
Eq. (32) noting that $I_+(2,1)$ for $m^2=0$ is $\delta_+(s_{21}^2)$ from
I, Eq. (34). The $k^{-2}$ means $(k\cdot k)^{-1}$ or more precisely
the limit as $\delta\to 0$ of $(k\cdot k+i\delta)^{-1}$. Further $d^4k$ means
$(2\pi)^{-2}dk_1\,dk_2\,dk_3\,dk_4$. If we imagine that quanta are particles
of zero mass, then we can make the general rule
that all poles are to be resolved by considering the
masses of the particles and quanta to have infinitesimal
negative imaginary parts.

¹¹ H. A. Bethe, Phys. Rev. 72, 339 (1947).

Fig. 4. Radiative correction to scattering, momentum space.

Using these results we see that the self-energy (9) is
the matrix element between $\bar u$ and $u$ of the matrix

$$(e^2/\pi i)\int \gamma_\mu(\not{p}-\not{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k, \tag{11}$$

where we have used the expression (I, Eq. (31)) for the
Fourier transform of $K_+$. This form for the self-energy
is easier to work with than is (9).

The equation can be understood by imagining (Fig. 3)
that the electron of momentum $p$ emits ($\gamma_\mu$) a quantum
of momentum $k$, and makes its way now with momentum
$p-k$ to the next event (factor $(\not{p}-\not{k}-m)^{-1}$)
which is to absorb the quantum (another $\gamma_\mu$). The
amplitude of propagation of quanta is $k^{-2}$. (There is a
factor $e^2/\pi i$ for each virtual quantum.) One integrates
over all quanta. The reason an electron of momentum $p$
propagates as $1/(\not{p}-m)$ is that this operator is the reciprocal
of the Dirac equation operator, and we are
simply solving this equation. Likewise light goes as
$1/k^2$, for this is the reciprocal D'Alembertian operator
of the wave equation of light. The first $\gamma_\mu$ represents
the current which generates the vector potential, while
the second is the velocity operator by which this potential
is multiplied in the Dirac equation when an external
field acts on an electron.

Using the same line of reasoning, other problems may
be set up directly in momentum space. For example,
consider the scattering in a potential $\not{A}=A_\mu\gamma_\mu$ varying
in space and time as $\not{a}\exp(-iq\cdot x)$. An electron initially
in state of momentum $\not{p}_1=p_{1\mu}\gamma_\mu$ will be deflected to
state $\not{p}_2$ where $\not{p}_2=\not{p}_1+\not{q}$. The zero-order answer is
simply the matrix element of $\not{a}$ between states 1 and 2.
We next ask for the first order (in $e^2$) radiative correction
due to virtual radiation of one quantum. There are
several ways this can happen. First for the case illustrated
in Fig. 4(a), find the matrix:

$$(e^2/\pi i)\int \gamma_\mu(\not{p}_2-\not{k}-m)^{-1}\not{a}(\not{p}_1-\not{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k. \tag{12}$$

For in this case, first¹² a quantum of momentum $k$ is
emitted ($\gamma_\mu$), the electron then having momentum
$p_1-k$ and hence propagating with factor $(\not{p}_1-\not{k}-m)^{-1}$.
Next it is scattered by the potential (matrix $\not{a}$) receiving
additional momentum $q$, propagating on then (factor
$(\not{p}_2-\not{k}-m)^{-1}$) with the new momentum until the quantum
is reabsorbed ($\gamma_\mu$). The quantum propagates from
emission to absorption ($k^{-2}$) and we integrate over all
quanta ($d^4k$), and sum on polarization $\mu$. When this is
integrated on $k_4$, the result can be shown to be exactly
equal to the expressions (16) and (17) given in B for
the same process, the various terms coming from residues
of the poles of the integrand (12).

Fig. 5. Compton scattering, Eq. (15).

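Or again if the quantum is both emitted and reabsorbed
before the scattering takes place one finds
(Fig. 4(b))

$$(e^2/\pi i)\int \not{a}(\not{p}_1-m)^{-1}\gamma_\mu(\not{p}_1-\not{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k, \tag{13}$$

or if both emission and absorption occur after the
scattering, (Fig. 4(c))

$$(e^2/\pi i)\int \gamma_\mu(\not{p}_2-\not{k}-m)^{-1}\gamma_\mu(\not{p}_2-m)^{-1}\not{a}\,k^{-2}\,d^4k. \tag{14}$$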


These terms are discussed in detail below. 

We have now achieved our simplification of the form 
of writing matrix elements arising from virtual proc- 
esses. Processes in which a number of real quanta is 
given initially and finally offer no problem (assuming 
correct normalization). For example, consider the 
Compton effect (Fig. 5(a)) in which an electron in state
$p_1$ absorbs a quantum of momentum $q_1$, polarization
vector $e_{1\mu}$, so that its interaction is $e_{1\mu}\gamma_\mu=\not{e}_1$, and emits
a second quantum of momentum $-q_2$, polarization $e_2$
to arrive in final state of momentum $p_2$. The matrix for
this process is $\not{e}_2(\not{p}_1+\not{q}_1-m)^{-1}\not{e}_1$. The total matrix for
the Compton effect is, then,

$$\not{e}_2(\not{p}_1+\not{q}_1-m)^{-1}\not{e}_1+\not{e}_1(\not{p}_1+\not{q}_2-m)^{-1}\not{e}_2, \tag{15}$$

the second term arising because the emission of $e_2$ may
also precede the absorption of $e_1$ (Fig. 5(b)). One takes
matrix elements of this between initial and final electron
states ($p_1+q_1=p_2-q_2$), to obtain the Klein-Nishina
formula. Pair annihilation with emission of two quanta,
etc., are given by the same matrix, positron states being
those with negative time component of $p$. Whether
quanta are absorbed or emitted depends on whether the
time component of $q$ is positive or negative.

¹² First, next, etc., here refer not to the order in true time but to
the succession of events along the trajectory of the electron. That
is, more precisely, to the order of appearance of the matrices in
the expressions.


5. THE CONVERGENCE OF PROCESSES WITH 
VIRTUAL QUANTA 


These expressions are, as has been indicated, no more 
than a re-expression of conventional quantum electro- 
dynamics. As a consequence, many of them are meaningless.
For example, the self-energy expression (9) or
(11) gives an infinite result when evaluated. The infinity
arises, apparently, from the coincidence of the $\delta$-function
singularities in $K_+(4,3)$ and $\delta_+(s_{43}^2)$. Only at this point
is it necessary to make a real departure from conven- 
tional electrodynamics, a departure other than simply 
rewriting expressions in a simpler form. 

We desire to make a modification of quantum electro- 
dynamics analogous to the modification of classical 
electrodynamics described in a previous article, A. 
There the $\delta(s_{12}^2)$ appearing in the action of interaction
was replaced by $f(s_{12}^2)$ where $f(x)$ is a function of small
width and great height.

The obvious corresponding modification in the quantum
theory is to replace the $\delta_+(s^2)$ appearing in the
quantum mechanical interaction by a new function
$f_+(s^2)$. We can postulate that if the Fourier transform
of the classical $f(s_{12}^2)$ is the integral over all $k$ of
$F(k^2)\exp(-ik\cdot x_{12})\,d^4k$, then the Fourier transform of
$f_+(s^2)$ is the same integral taken over only positive frequencies
$k_4$ for $t_2>t_1$ and over only negative ones for
$t_2<t_1$ in analogy to the relation of $\delta_+(s^2)$ to $\delta(s^2)$. The
function $f(s^2)=f(x\cdot x)$ can be written* as

$$f(x\cdot x) = (2\pi)^{-2}\int_{k_4=0}^{\infty}\int \sin(k_4|x_4|)\cos(\mathbf{K}\cdot\mathbf{x})\,dk_4\,d^3\mathbf{K}\,g(k\cdot k),$$

where $g(k\cdot k)$ is $k_4^{-1}$ times the density of oscillators and
may be expressed for positive $k_4$ as (A, Eq. (16))

$$g(k^2) = \int_0^\infty \bigl(\delta(k^2)-\delta(k^2-\lambda^2)\bigr)G(\lambda)\,d\lambda,$$

where $\int_0^\infty G(\lambda)\,d\lambda=1$ and $G$ involves values of $\lambda$ large
compared to $m$. This simply means that the amplitude
for propagation of quanta of momentum $k$ is

$$-F_+(k^2) = \pi^{-1}\int_0^\infty \bigl(k^{-2}-(k^2-\lambda^2)^{-1}\bigr)G(\lambda)\,d\lambda,$$

rather than $k^{-2}$. That is, writing $F_+(k^2)=-\pi^{-1}k^{-2}C(k^2)$,

$$-f_+(s_{12}^2) = \pi^{-1}\int \exp(-ik\cdot x_{12})\,k^{-2}C(k^2)\,d^4k. \tag{16}$$

Every integral over an intermediate quantum which
previously involved a factor $d^4k/k^2$ is now supplied with
a convergence factor $C(k^2)$ where

$$C(k^2) = \int_0^\infty -\lambda^2(k^2-\lambda^2)^{-1}G(\lambda)\,d\lambda. \tag{17}$$

The poles are defined by replacing $k^2$ by $k^2+i\delta$ in the
limit $\delta\to 0$. That is $\lambda^2$ may be assumed to have an infinitesimal
negative imaginary part.

* This relation is given incorrectly in A, equation just preceding 16.
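
The step from the modified propagator to the convergence factor (17) rests on the identity $k^{-2}-(k^2-\lambda^2)^{-1}=k^{-2}\cdot\bigl(-\lambda^2/(k^2-\lambda^2)\bigr)$. The sketch below checks it in exact rational arithmetic. It assumes, purely for illustration, that $G(\lambda)$ is replaced by a few discrete weights; the function names and the sample weights are hypothetical.

```python
from fractions import Fraction


def convergence_factor(k2, weights):
    """Discrete analogue of Eq. (17): sum of G_i * (-lam_i^2) / (k^2 - lam_i^2)."""
    return sum(g * (-lam2) / (k2 - lam2) for lam2, g in weights)


def modified_propagator(k2, weights):
    """Discrete analogue of the integral of (1/k^2 - 1/(k^2 - lam^2)) G d(lam)."""
    return sum(g * (Fraction(1) / k2 - Fraction(1) / (k2 - lam2))
               for lam2, g in weights)


# Hypothetical weights as pairs (lambda^2, G). The G sum to 1, and the
# sum of lambda^2 * G vanishes, so k^2 * C(k^2) tends to zero at large k^2.
weights = [(Fraction(100), Fraction(3, 2)), (Fraction(300), Fraction(-1, 2))]

k2 = Fraction(-7)  # any value away from the poles k^2 = lambda^2
print(modified_propagator(k2, weights) == convergence_factor(k2, weights) / k2)

large_k2 = Fraction(10**9)
print(float(large_k2 * convergence_factor(large_k2, weights)))
```

With these sample weights the second printed number is small, showing how $k^2C(k^2)$ falls to zero at large $k^2$ when the weighted sum of $\lambda^2$ vanishes.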

The function $f_+(s_{12}^2)$ may still have a discontinuity
in value on the light cone. This is of no influence for the
Dirac electron. For a particle satisfying the Klein-Gordon
equation, however, the interaction involves
gradients of the potential which reinstates the $\delta$ function
if $f$ has discontinuities. The condition that $f$ is to
have no discontinuity in value on the light cone implies
$k^2C(k^2)$ approaches zero as $k^2$ approaches infinity. In
terms of $G(\lambda)$ the condition is

$$\int_0^\infty \lambda^2G(\lambda)\,d\lambda = 0. \tag{18}$$


This condition will also be used in discussing the con- 
vergence of vacuum polarization integrals. 
The expression for the self-energy matrix is now

$$(e^2/\pi i)\int \gamma_\mu(\not{p}-\not{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k\,C(k^2), \tag{19}$$

which, since $C(k^2)$ falls off at least as rapidly as $1/k^2$,
converges. For practical purposes we shall suppose
hereafter that $C(k^2)$ is simply $-\lambda^2/(k^2-\lambda^2)$ implying
that some average (with weight $G(\lambda)\,d\lambda$) over values of
$\lambda$ may be taken afterwards. Since in all processes the
quantum momentum will be contained in at least one
extra factor of the form $(\not{p}-\not{k}-m)^{-1}$ representing
propagation of an electron while that quantum is in
the field, we can expect all such integrals with their
convergence factors to converge and that the result of
all such processes will now be finite and definite (excepting
the processes with closed loops, discussed below,
in which the diverging integrals are over the momenta
of the electrons rather than the quanta).

The integral of (19) with $C(k^2)=-\lambda^2(k^2-\lambda^2)^{-1}$ noting
that $p^2=m^2$, $\lambda\gg m$ and dropping terms of order $m/\lambda$,
is (see Appendix A)

$$(e^2/2\pi)\bigl[4m(\ln(\lambda/m)+\tfrac12)-\not{p}(\ln(\lambda/m)+\tfrac54)\bigr]. \tag{20}$$

When applied to a state of an electron of momentum $p$
satisfying $\not{p}u=mu$, it gives for the change in mass (as
in B, Eq. (9))

$$\Delta m = m(e^2/2\pi)\bigl(3\ln(\lambda/m)+\tfrac34\bigr). \tag{21}$$
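
The passage from (20) to (21) is a one-line substitution of $m$ for the matrix $\not{p}$ acting on the state $u$. The sketch below checks that arithmetic exactly; the function names are hypothetical, `log_term` stands for $\ln(\lambda/m)$, and the common factor $m\,e^2/2\pi$ is left out.

```python
from fractions import Fraction


def self_energy_on_shell(log_term):
    """Bracket of Eq. (20) in units of m, with p-slash replaced by m."""
    return 4 * (log_term + Fraction(1, 2)) - (log_term + Fraction(5, 4))


def mass_shift(log_term):
    """Bracket of Eq. (21) in units of m: 3 ln(lambda/m) + 3/4."""
    return 3 * log_term + Fraction(3, 4)


# Sample values of ln(lambda/m), chosen only for illustration.
for log_term in (Fraction(0), Fraction(5), Fraction(23, 7)):
    print(log_term, self_energy_on_shell(log_term) == mass_shift(log_term))
```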


6. RADIATIVE CORRECTIONS TO SCATTERING 


We can now complete the discussion of the radiative 
corrections to scattering. In the integrals we include the 
convergence factor $C(k^2)$, so that they converge for
large $k$. Integral (12) is also not convergent because of
the well-known infra-red catastrophe. For this reason
we calculate (as discussed in B) the value of the integral
assuming the photons to have a small mass $\lambda_{\min}\ll m\ll\lambda$.
The integral (12) becomes

$$(e^2/\pi i)\int \gamma_\mu(\not{p}_2-\not{k}-m)^{-1}\not{a}(\not{p}_1-\not{k}-m)^{-1}\gamma_\mu\,(k^2-\lambda_{\min}^2)^{-1}\,d^4k\,C(k^2-\lambda_{\min}^2),$$

which when integrated (see Appendix B) gives $(e^2/2\pi)$
times


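$$\Bigl[2\Bigl(\ln\frac{m}{\lambda_{\min}}-1\Bigr)\Bigl(1-\frac{2\theta}{\tan 2\theta}\Bigr)+\theta\tan\theta+\frac{4}{\tan 2\theta}\int_0^\theta \alpha\tan\alpha\,d\alpha\Bigr]\not{a}+\frac{1}{4m}(\not{q}\not{a}-\not{a}\not{q})\frac{2\theta}{\sin 2\theta}+r\not{a}, \tag{22}$$

where $(q^2)^{1/2}=2m\sin\theta$ and we have assumed the matrix to
operate between states of momentum $p_1$ and $p_2=p_1+q$
and have neglected terms of order $\lambda_{\min}/m$, $m/\lambda$, and
$q^2/\lambda^2$. Here the only dependence on the convergence
factor is in the term $r\not{a}$, where

$$r = \ln(\lambda/m)+9/4-2\ln(m/\lambda_{\min}). \tag{23}$$

As we shall see in a moment, the other terms (13),
(14) give contributions which just cancel the $r\not{a}$ term.
The remaining terms give for small $q$,

$$(e^2/4\pi)\Bigl(\frac{1}{2m}(\not{q}\not{a}-\not{a}\not{q})+\frac{4q^2}{3m^2}\,\not{a}\Bigl(\ln\frac{m}{\lambda_{\min}}-\frac38\Bigr)\Bigr), \tag{24}$$

which shows the change in magnetic moment and the
Lamb shift as interpreted in more detail in B.¹³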
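
The small-$q$ form (24) can be compared numerically with the square bracket of (22). With $q^2=4m^2\sin^2\theta$, the $\not{a}$ term of (24) corresponds to the bracket approaching $(8/3)\,\theta^2\,(\ln(m/\lambda_{\min})-3/8)$ for small $\theta$. The following is a hedged sketch, not taken from the paper: the function names are hypothetical, `log_ir` stands for $\ln(m/\lambda_{\min})$, and the integral is estimated with a composite Simpson rule.

```python
import math


def integral_alpha_tan(theta, steps=2000):
    """Composite Simpson estimate of the integral of a*tan(a) from 0 to theta."""
    h = theta / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        alpha = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * alpha * math.tan(alpha)
    return total * h / 3.0


def bracket_eq22(theta, log_ir):
    """Coefficient of a-slash inside the square bracket of Eq. (22)."""
    ratio = 2.0 * theta / math.tan(2.0 * theta)
    return (2.0 * (log_ir - 1.0) * (1.0 - ratio)
            + theta * math.tan(theta)
            + 4.0 / math.tan(2.0 * theta) * integral_alpha_tan(theta))


def small_q_limit(theta, log_ir):
    """Leading small-theta form implied by Eq. (24)."""
    return (8.0 / 3.0) * theta ** 2 * (log_ir - 3.0 / 8.0)


# Hypothetical sample values: a small angle and ln(m/lambda_min) = 5.
print(bracket_eq22(0.01, 5.0), small_q_limit(0.01, 5.0))
```

The two numbers agree up to corrections of relative order $\theta^2$, which is the accuracy to be expected of a leading-order expansion.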


¹³ That the result given in B in Eq. (19) was in error was repeatedly
pointed out to the author, in private communication,
by V. F. Weisskopf and J. B. French, as their calculation, completed
simultaneously with the author's early in 1948, gave a
different result. French has finally shown that although the expression
for the radiationless scattering B, Eq. (18) or (24) above
is correct, it was incorrectly joined onto Bethe's non-relativistic
result. He shows that the relation $\ln 2k_{\max}-1=\ln\lambda_{\min}$ used by the
author should have been $\ln 2k_{\max}-5/6=\ln\lambda_{\min}$. This results in
adding a term $-(1/6)$ to the logarithm in B, Eq. (19) so that the
result now agrees with that of J. B. French and V. F. Weisskopf,

We must now study the remaining terms (13) and
(14). The integral on $k$ in (13) can be performed (after
multiplication by $C(k^2)$) since it involves nothing but
the integral (19) for the self-energy and the result is
allowed to operate on the initial state $u_1$, (so that
$\not{p}_1u_1=mu_1$). Hence the factor following $\not{a}(\not{p}_1-m)^{-1}$ will
be just $\Delta m$. But, if one now tries to expand
$1/(\not{p}_1-m)=(\not{p}_1+m)/(p_1^2-m^2)$ one obtains an infinite result,
since $p_1^2=m^2$. This is, however, just what is expected
physically. For the quantum can be emitted and absorbed
at any time previous to the scattering. Such a
process has the effect of a change in mass of the electron
in the state 1. It therefore changes the energy by $\Delta E$
and the amplitude to first order in $\Delta E$ by $-i\Delta E\cdot t$ where
$t$ is the time it is acting, which is infinite. That is, the
major effect of this term would be canceled by the effect
of change of mass $\Delta m$.

The situation can be analyzed in the following 
manner. We suppose that the electron approaching the
scattering potential $\not{a}$ has not been free for an infinite
time, but at some time far past suffered a scattering by
a potential $\not{b}$. If we limit our discussion to the effects
of $\Delta m$ and of the virtual radiation of one quantum between
two such scatterings each of the effects will be
finite, though large, and their difference is determinate.
The propagation from $\not{b}$ to $\not{a}$ is represented by a matrix

$$\not{a}(\not{p}'-m)^{-1}\not{b}, \tag{25}$$

in which one is to integrate possibly over $p'$ (depending
on details of the situation). (If the time is long between
$\not{b}$ and $\not{a}$, the energy is very nearly determined so that
$p'^2$ is very nearly $m^2$.)

We shall compare the effect on the matrix (25) of the
virtual quanta and of the change of mass $\Delta m$. The effect
of a virtual quantum is

$$(e^2/\pi i)\int \not{a}(\not{p}'-m)^{-1}\gamma_\mu(\not{p}'-\not{k}-m)^{-1}\gamma_\mu(\not{p}'-m)^{-1}\not{b}\,k^{-2}\,d^4k\,C(k^2), \tag{26}$$

while that of a change of mass can be written

$$\not{a}(\not{p}'-m)^{-1}\Delta m\,(\not{p}'-m)^{-1}\not{b}, \tag{27}$$

and we are interested in the difference (26)-(27). A
simple and direct method of making this comparison is
just to evaluate the integral on $k$ in (26) and subtract
from the result the expression (27) where $\Delta m$ is given
in (21). The remainder can be expressed as a multiple
$-r(p'^2)$ of the unperturbed amplitude (25);

$$-r(p'^2)\,\not{a}(\not{p}'-m)^{-1}\not{b}. \tag{28}$$

This has the same result (to this order) as replacing
the potentials $\not{a}$ and $\not{b}$ in (25) by $(1-\tfrac12r(p'^2))\not{a}$ and
$(1-\tfrac12r(p'^2))\not{b}$. In the limit, then, as $p'^2\to m^2$ the net
effect on the scattering is $-\tfrac12r\not{a}$ where $r$, the limit of
$r(p'^2)$ as $p'^2\to m^2$ (assuming the integrals have an infra-red
cut-off), turns out to be just equal to that given in
(23). An equal term $-\tfrac12r\not{a}$ arises from virtual transitions
after the scattering (14) so that the entire $r\not{a}$ term in
(22) is canceled.

¹³ (continued) Phys. Rev. 75, 1240 (1949) and N. H. Kroll and W. E. Lamb,
Phys. Rev. 75, 388 (1949). The author feels unhappily responsible
for the very considerable delay in the publication of French's
result occasioned by this error. This footnote is appropriately
numbered.

The reason that $r$ is just the value of (12) when $q^2=0$
can also be seen without a direct calculation as follows:
Let us call $p$ the vector of length $m$ in the direction of
$p'$ so that if $p'^2=m^2(1+\epsilon)^2$ we have $p'=(1+\epsilon)p$ and we
take $\epsilon$ as very small, being of order $T^{-1}$ where $T$ is the
time between the scatterings $\not{b}$ and $\not{a}$. Since
$(\not{p}'-m)^{-1}=(\not{p}'+m)/(p'^2-m^2)\approx(\not{p}+m)/2m^2\epsilon$, the quantity (25)
is of order $\epsilon^{-1}$ or $T$. We shall compute corrections to it
only to its own order ($\epsilon^{-1}$) in the limit $\epsilon\to 0$. The term
(27) can be written approximately¹⁴ as

$$(e^2/\pi i)\int \not{a}(\not{p}'-m)^{-1}\gamma_\mu(\not{p}-\not{k}-m)^{-1}\gamma_\mu(\not{p}'-m)^{-1}\not{b}\,k^{-2}\,d^4k\,C(k^2),$$

using the expression (19) for $\Delta m$. The net of the two
effects is therefore approximately¹⁵

$$-(e^2/\pi i)\int \not{a}(\not{p}'-m)^{-1}\gamma_\mu(\not{p}-\not{k}-m)^{-1}\epsilon\not{p}\,(\not{p}-\not{k}-m)^{-1}\gamma_\mu(\not{p}'-m)^{-1}\not{b}\,k^{-2}\,d^4k\,C(k^2),$$

a term now of order $1/\epsilon$ (since $(\not{p}'-m)^{-1}\approx(\not{p}+m)(2m^2\epsilon)^{-1}$)
and therefore the one desired in the limit.
Comparison to (28) gives for $r$ the expression

$$\frac{\not{p}_1+m}{2m}\int \gamma_\mu(\not{p}_1-\not{k}-m)^{-1}(\not{p}_1m^{-1})(\not{p}_1-\not{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k\,C(k^2). \tag{29}$$

The integral can be immediately evaluated, since it
is the same as the integral (12), but with $q=0$, for $\not{a}$
replaced by $\not{p}_1/m$. The result is therefore $r\cdot(\not{p}_1/m)$
which when acting on the state $u_1$ is just $r$, as $\not{p}_1u_1=mu_1$.
For the same reason the term $(\not{p}_1+m)/2m$ in (29) is
effectively 1 and we are left with $-r$ of (23).¹⁶
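
The argument above depends on the first-order operator expansion $(A+B)^{-1}\approx A^{-1}-A^{-1}BA^{-1}$ for small $B$, which holds for non-commuting operators. The sketch below checks it numerically. It is only an illustration: small 2x2 real matrices stand in for the Dirac operators, and the matrices and function names are hypothetical.

```python
def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(x):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    return [[x[1][1] / det, -x[0][1] / det],
            [-x[1][0] / det, x[0][0] / det]]


def add(x, y, scale=1.0):
    """Entrywise x + scale * y."""
    return [[x[i][j] + scale * y[i][j] for j in range(2)] for i in range(2)]


def max_abs(x):
    """Largest absolute entry of a 2x2 matrix."""
    return max(abs(entry) for row in x for entry in row)


# Hypothetical non-commuting matrices; B is multiplied by a small eps.
A = [[2.0, 1.0], [0.5, 3.0]]
B = [[0.3, -1.0], [2.0, 0.7]]


def first_order_error(eps):
    """Size of (A + eps*B)^-1 minus (A^-1 - eps * A^-1 B A^-1)."""
    exact = inverse(add(A, B, eps))
    a_inv = inverse(A)
    first_order = add(a_inv, matmul(matmul(a_inv, B), a_inv), -eps)
    return max_abs(add(exact, first_order, -1.0))


print(first_order_error(1e-2), first_order_error(1e-3))
```

Reducing `eps` by a factor of ten reduces the error by roughly a factor of a hundred, as expected when the neglected terms begin at second order in $B$.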

In more complex problems starting with a free electron
the same type of term arises from the effects of a
virtual emission and absorption both previous to the
other processes. They, therefore, simply lead to the
same factor $r$ so that the expression (23) may be used
directly and these renormalization integrals need not
be computed afresh for each problem.

¹⁴ The expression is not exact because the substitution of $\Delta m$
by the integral in (19) is valid only if $\not{p}$ operates on a state such
that $\not{p}$ can be replaced by $m$. The error, however, is of order
$\not{a}(\not{p}'-m)^{-1}(\not{p}-m)(\not{p}'-m)^{-1}\not{b}$ which is
$\not{a}((1+\epsilon)\not{p}+m)(\not{p}-m)((1+\epsilon)\not{p}+m)\not{b}\,(2\epsilon+\epsilon^2)^{-2}m^{-4}$.
But since $p^2=m^2$, we have $\not{p}(\not{p}-m)=-m(\not{p}-m)=(\not{p}-m)\not{p}$
so the net result is approximately
$\not{a}(\not{p}-m)\not{b}/4m^2$ and is not of order $1/\epsilon$ but smaller, so that its effect
drops out in the limit.

¹⁵ We have used, to first order, the general expansion (valid for
any operators $A$, $B$)

$$(A+B)^{-1} = A^{-1}-A^{-1}BA^{-1}+A^{-1}BA^{-1}BA^{-1}-\cdots$$

with $A=\not{p}-\not{k}-m$ and $B=\not{p}'-\not{p}=\epsilon\not{p}$ to expand the difference of
$(\not{p}'-\not{k}-m)^{-1}$ and $(\not{p}-\not{k}-m)^{-1}$.

¹⁶ The renormalization terms appearing in B, Eqs. (14), (15) when
translated directly into the present notation do not give twice
(29) but give this expression with the central $\not{p}_1m^{-1}$ factor replaced
by $m\gamma_4/E_1$ where $E_1=p_{1\mu}$ for $\mu=4$. When integrated it therefore
gives $r\not{a}((\not{p}_1+m)/2m)(m\gamma_4/E_1)$ or $r\not{a}-r\not{a}(m\gamma_4/E_1)(\not{p}_1-m)/2m$
(since $\not{p}_1\gamma_4+\gamma_4\not{p}_1=2E_1$), which gives just $r\not{a}$, since $\not{p}_1u_1=mu_1$.

In this problem of the radiative corrections to scattering
the net result is insensitive to the cut-off. This
means, of course, that by a simple rearrangement of
terms previous to the integration we could have avoided
the use of the convergence factors completely (see for
example Lewis¹⁷). The problem was solved in the
manner here in order to illustrate how the use of such
convergence factors, even when they are actually unnecessary,
may facilitate analysis somewhat by removing
the effort and ambiguities that may be involved in
trying to rearrange the otherwise divergent terms.

The replacement of $\delta_+$ by $f_+$ given in (16), (17) is
not determined by the analogy with the classical problem.
In the classical limit only the real part of $\delta_+$ (i.e.,
just $\delta$) is easy to interpret. But by what should the
imaginary part, $1/(\pi is^2)$, of $\delta_+$ be replaced? The choice
we have made here (in defining, as we have, the location
of the poles of (17)) is arbitrary and almost certainly
incorrect. If the radiation resistance is calculated for
an atom, as the imaginary part of (8), the result depends
slightly on the function $f_+$. On the other hand the
light radiated at very large distances from a source is
independent of $f_+$. The total energy absorbed by distant
absorbers will not check with the energy loss of the
source. We are in a situation analogous to that in the
classical theory if the entire $f$ function is made to
contain only retarded contributions (see A, Appendix).
One desires instead the analogue of $\langle F\rangle_{\mathrm{ret}}$ of A. This
problem is being studied.

One can say therefore, that this attempt to find a
consistent modification of quantum electrodynamics is
incomplete (see also the question of closed loops, below).
For it could turn out that any correct form of $f_+$ which
will guarantee energy conservation may at the same
time not be able to make the self-energy integral finite.
The desire to make the methods of simplifying the
calculation of quantum electrodynamic processes more
widely available has prompted this publication before
an analysis of the correct form for $f_+$ is complete. One
might try to take the position that, since the energy
discrepancies discussed vanish in the limit $\lambda\to\infty$, the
correct physics might be considered to be that obtained
by letting $\lambda\to\infty$ after mass renormalization. I have no
proof of the mathematical consistency of this procedure,
but the presumption is very strong that it is satisfactory.
(It is also strong that a satisfactory form for $f_+$
can be found.)


7. THE PROBLEM OF VACUUM POLARIZATION 
In the analysis of the radiative corrections to scattering
one type of term was not considered. The potential
which we can assume to vary as $a_\mu\exp(-iq\cdot x)$ creates
a pair of electrons (see Fig. 6), momenta $p_a$, $-p_b$. This
pair then reannihilates, emitting a quantum $q=p_b-p_a$,
which quantum scatters the original electron from state
1 to state 2. The matrix element for this process (and
the others which can be obtained by rearranging the
order in time of the various events) is

$$-(e^2/\pi i)(\bar u_2\gamma_\mu u_1)\int \mathrm{Sp}\bigl[(\not{p}_a+\not{q}-m)^{-1}\gamma_\nu(\not{p}_a-m)^{-1}\gamma_\mu\bigr]\,d^4p_a\,q^{-2}C(q^2)\,a_\nu. \tag{30}$$

This is because the potential produces the pair with
amplitude proportional to $a_\nu\gamma_\nu$, the electrons of momenta
$p_a$ and $-(p_a+q)$ proceed from there to annihilate,
producing a quantum (factor $\gamma_\mu$) which propagates
(factor $q^{-2}C(q^2)$) over to the other electron, by which
it is absorbed (matrix element of $\gamma_\mu$ between states 1
and 2 of the original electron ($\bar u_2\gamma_\mu u_1$)). All momenta $p_a$
and spin states of the virtual electron are admitted,
which means the spur and the integral on $d^4p_a$ are
calculated.

¹⁷ H. W. Lewis, Phys. Rev. 73, 173 (1948).

One can imagine that the closed loop path of the
positron-electron produces a current

$$4\pi j_\mu = J_{\mu\nu}a_\nu, \tag{31}$$


which is the source of the quanta which act on the 
second electron. The quantity 


$$J_{\mu\nu} = -(e^2/\pi i)\int \mathrm{Sp}\bigl[(\not{p}+\not{q}-m)^{-1}\gamma_\nu(\not{p}-m)^{-1}\gamma_\mu\bigr]\,d^4p, \tag{32}$$
is then characteristic for this problem of polarization 
of the vacuum. 

One sees at once that $J_{\mu\nu}$ diverges badly. The modification
of $\delta$ to $f$ alters the amplitude with which the
current $j_\mu$ will affect the scattered electron, but it can
do nothing to prevent the divergence of the integral (32)
and of its effects.

One way to avoid such difficulties is apparent. From
one point of view we are considering all routes by which 
a given electron can get from one region of space-time 
to another, i.e., from the source of electrons to the 
apparatus which measures them. From this point of 
view the closed loop path leading to (32) is unnatural. 
It might be assumed that the only paths of meaning are 
those which start from the source and work their way 
in a continuous path (possibly containing many time 
reversals) to the detector. Closed loops would be
excluded. We have already found that this may be done
for electrons moving in a fixed potential. 

Such a suggestion must meet several questions, how- 
ever. The closed loops are a consequence of the usual 
hole theory in electrodynamics. Among other things, 
they are required to keep probability conserved. The 
probability that no pair is produced by a potential is
not unity and its deviation from unity arises from the
imaginary part of $J_{\mu\nu}$. Again, with closed loops excluded,
a pair of electrons once created cannot annihilate
one another again, the scattering of light by light
would be zero, etc. Although we are not experimentally
sure of these phenomena, this does seem to indicate
that the closed loops are necessary. To be sure, it is
always possible that these matters of probability
conservation, etc., will work themselves out as simply in
the case of interacting particles as for those in a fixed
potential. Lacking such a demonstration the presumption
is that the difficulties of vacuum polarization are
not so easily circumvented.

FIG. 6. Vacuum polarization effect on scattering, Eq. (30).

An alternative procedure discussed in B is to assume
that the function $K_+(2, 1)$ used above is incorrect and
is to be replaced by a modified function $K_+'$ having no
singularity on the light cone. The effect of this is to
provide a convergence factor $C(p^2 - m^2)$ for every integral
over electron momenta.¹⁹ This will multiply the
integrand of (32) by $C(p^2 - m^2)\,C((p+q)^2 - m^2)$, since the
integral was originally $\delta(p_a - p_b + q)\,d^4p_a\,d^4p_b$ and both
$p_a$ and $p_b$ get convergence factors. The integral now
converges but the result is unsatisfactory.²⁰

One expects the current (31) to be conserved, that is
$q_\mu j_\mu = 0$ or $q_\mu J_{\mu\nu} = 0$. Also one expects no current if $a_\nu$
is a gradient, or $a_\nu = q_\nu$ times a constant. This leads to
the condition $J_{\mu\nu}q_\nu = 0$ which is equivalent to $q_\mu J_{\mu\nu} = 0$
since $J_{\mu\nu}$ is symmetrical. But when the expression (32)
is integrated with such convergence factors it does not
satisfy this condition. By altering the kernel from $K$ to
another, $K'$, which does not satisfy the Dirac equation
we have lost the gauge invariance, its consequent current
conservation and the general consistency of the
theory.

One can see this best by calculating $J_{\mu\nu}q_\nu$ directly
from (32). The expression within the spur becomes
$(\not{p} + \not{q} - m)^{-1}\not{q}(\not{p} - m)^{-1}\gamma_\mu$ which can be written as the
difference of two terms: $(\not{p} - m)^{-1}\gamma_\mu - (\not{p} + \not{q} - m)^{-1}\gamma_\mu$.
Each of these terms would give the same result if the
integration $d^4p$ were without a convergence factor, for
the first can be converted into the second by a shift of
the origin of $p$, namely $p' = p + q$. This does not result
in cancelation in (32) however, for the convergence
factor is altered by the substitution.


18 It would be very interesting to calculate the Lamb shift
accurately enough to be sure that the 20 megacycles expected
from vacuum polarization are actually present.

19 This technique also makes self-energy and radiationless scattering
integrals finite even without the modification of $\delta_+$ to $f_+$ for
the radiation (and the consequent convergence factor $C(k^2)$ for
the quanta). See B.

20 Added to the terms given below (33) there is a term
$\tfrac14(\lambda^2 - 2m^2 + \tfrac13 q^2)\delta_{\mu\nu}$ for $C(k^2) = -\lambda^2(k^2 - \lambda^2)^{-1}$, which is not gauge
invariant. (In addition the charge renormalization has $-7/6$ added
to the logarithm.)

A method of making (32) convergent without spoiling
the gauge invariance has been found by Bethe and by
Pauli. The convergence factor for light can be looked
upon as the result of superposition of the effects of
quanta of various masses (some contributing negatively).
Likewise if we take the factor
$C(p^2 - m^2) = -\lambda^2(p^2 - m^2 - \lambda^2)^{-1}$ so that
$(p^2 - m^2)^{-1}C(p^2 - m^2) = (p^2 - m^2)^{-1} - (p^2 - m^2 - \lambda^2)^{-1}$
we are taking the difference of the result for electrons of
mass $m$ and mass $(\lambda^2 + m^2)^{1/2}$. But we have taken this
difference for each propagation between interactions with
photons. They suggest instead that once created with a
certain mass the electron should continue to propagate with
this mass through all the potential interactions until it
closes its loop. That is if the quantity (32), integrated
over some finite range of $p$, is called $J_{\mu\nu}(m^2)$ and the
corresponding quantity over the same range of $p$, but
with $m$ replaced by $(m^2 + \lambda^2)^{1/2}$ is $J_{\mu\nu}(m^2 + \lambda^2)$ we should
calculate

$$J_{\mu\nu}^P = \int_0^\infty [J_{\mu\nu}(m^2) - J_{\mu\nu}(m^2 + \lambda^2)]\,G(\lambda)\,d\lambda, \tag{32'}$$

the function $G(\lambda)$ satisfying $\int_0^\infty G(\lambda)\,d\lambda = 1$ and
$\int_0^\infty G(\lambda)\lambda^2\,d\lambda = 0$. Then in the expression for $J_{\mu\nu}^P$ the
range of $p$ integration can be extended to infinity as the
integral now converges. The result of the integration
using this method is the integral on $d\lambda$ over $G(\lambda)$ of
(see Appendix C)

$$J_{\mu\nu}^P = -\frac{e^2}{\pi}(q_\mu q_\nu - \delta_{\mu\nu}q^2)\left(-\frac{1}{3}\ln\frac{\lambda^2}{m^2} - \left[\frac{4m^2 + 2q^2}{3q^2}\left(1 - \frac{\theta}{\tan\theta}\right) - \frac{1}{9}\right]\right), \tag{33}$$

with $q^2 = 4m^2\sin^2\theta$.
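The two conditions on $G(\lambda)$ can be made concrete with a small sketch. It assumes, purely for illustration, that $G(\lambda)$ is concentrated at two hypothetical cut-off values `lam1` and `lam2` with weights `c1` and `c2`; this discrete form is an assumption made here and is not specified in the text. It also assumes that the divergent parts of $J_{\mu\nu}(m^2)$ consist of a mass-independent term and a term proportional to the squared mass, and shows that both net coefficients in (32') then vanish.

```python
def two_point_weights(lam1, lam2):
    """Weights of a hypothetical G(lam) = c1*delta(lam - lam1) + c2*delta(lam - lam2).

    They solve the two conditions
        c1 + c2 = 1                      (integral of G equals 1)
        c1*lam1**2 + c2*lam2**2 = 0      (integral of G*lam^2 equals 0)
    """
    if lam1 == lam2:
        raise ValueError("the two cut-off values must differ")
    denom = lam2 ** 2 - lam1 ** 2
    c1 = lam2 ** 2 / denom
    c2 = -lam1 ** 2 / denom
    return c1, c2


def residual_coefficients(m, lam1, lam2):
    """Net coefficients left in (32') for the two assumed divergent pieces.

    (32') becomes J(m^2) - c1*J(m^2 + lam1^2) - c2*J(m^2 + lam2^2), so
    a mass-independent piece carries 1 - (c1 + c2), and a piece
    proportional to the squared mass carries m^2 - sum(c_i * M_i^2).
    """
    c1, c2 = two_point_weights(lam1, lam2)
    heavy_sq = (m ** 2 + lam1 ** 2, m ** 2 + lam2 ** 2)
    constant_piece = 1.0 - (c1 + c2)
    mass_sq_piece = m ** 2 - (c1 * heavy_sq[0] + c2 * heavy_sq[1])
    return constant_piece, mass_sq_piece


if __name__ == "__main__":
    print(two_point_weights(10.0, 20.0))
    print(residual_coefficients(1.0, 10.0, 20.0))
```

One of the two weights is necessarily negative, which is the discrete counterpart of the remark that some masses contribute negatively.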

The gauge invariance is clear, since $q_\mu(q_\mu q_\nu - q^2\delta_{\mu\nu}) = 0$.
Operating (as it always will) on a potential of zero
divergence the $(q_\mu q_\nu - \delta_{\mu\nu}q^2)a_\nu$ is simply $-q^2a_\mu$, the
D'Alembertian of the potential, that is, the current
producing the potential. The term
$-\tfrac13(\ln(\lambda^2/m^2))(q_\mu q_\nu - q^2\delta_{\mu\nu})$ therefore gives a current
proportional to the current producing the potential. This
would have the same effect as a change in charge, so that
we would have a difference $\Delta(e^2)$ between $e^2$ and the
experimentally observed charge, $e^2 + \Delta(e^2)$, analogous to the
difference between $m$ and the observed mass. This charge
depends logarithmically on the cut-off,
$\Delta(e^2)/e^2 = -(2e^2/3\pi)\ln(\lambda/m)$. After this renormalization of charge
is made, no effects will be sensitive to the cut-off.
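To give a feeling for the size of this logarithmic dependence, the following sketch evaluates $\Delta(e^2)/e^2 = -(2e^2/3\pi)\ln(\lambda/m)$. The numerical value $e^2 \approx 1/137.036$ (units with $\hbar = c = 1$) and the sample cut-off ratios are assumptions chosen here for illustration; the function name is hypothetical.

```python
import math

# Assumed value of e^2 in units with hbar = c = 1 (the fine-structure constant).
E_SQUARED = 1.0 / 137.036


def relative_charge_shift(cutoff_over_mass, e_squared=E_SQUARED):
    """Return Delta(e^2)/e^2 = -(2 e^2 / (3 pi)) * ln(lambda/m)."""
    if cutoff_over_mass <= 0.0:
        raise ValueError("lambda/m must be positive")
    return -(2.0 * e_squared / (3.0 * math.pi)) * math.log(cutoff_over_mass)


if __name__ == "__main__":
    # Illustrative cut-off ratios only; the text does not fix lambda.
    for ratio in (10.0, 1.0e3, 1.0e6):
        print(ratio, relative_charge_shift(ratio))
```

Because the dependence is only logarithmic, even very large assumed values of $\lambda/m$ change the charge by a modest fraction.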

After this is done the final term remaining in (33)
contains the usual effects²¹ of polarization of the vacuum.
It is zero for a free light quantum ($q^2 = 0$). For small $q^2$
it behaves as $(2/15)q^2$ (adding $-\tfrac15$ to the logarithm in
the Lamb effect). For $q^2 > (2m)^2$ it is complex, the
imaginary part representing the loss in amplitude required
by the fact that the probability that no quanta
are produced by a potential able to produce pairs
($(q^2)^{1/2} > 2m$) decreases with time. (To make the necessary
analytic continuation, imagine $m$ to have a small
negative imaginary part, so that $(1 - q^2/4m^2)^{1/2}$ becomes
$-i(q^2/4m^2 - 1)^{1/2}$ as $q^2$ goes from below to above $4m^2$.
Then $\theta = \pi/2 + iu$ where $\sinh u = +(q^2/4m^2 - 1)^{1/2}$, and
$-1/\tan\theta = i\tanh u = +i(q^2 - 4m^2)^{1/2}(q^2)^{-1/2}$.)

21 E. A. Uehling, Phys. Rev. 48, 55 (1935); R. Serber, Phys.
Rev. 48, 49 (1935).
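The analytic continuation can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the cut-off independent bracket of (33) as a function of $\theta$ (with $q^2 = 4m^2\sin^2\theta$, so that $m$ drops out), and the function names and the sample value $q^2 = 9m^2$ are choices made here, not taken from the text.

```python
import cmath
import math


def theta_above_threshold(q2, m=1.0):
    """Return theta = pi/2 + i*u with sinh(u) = sqrt(q2/(4 m^2) - 1), for q2 > 4 m^2."""
    ratio = q2 / (4.0 * m * m)
    if ratio <= 1.0:
        raise ValueError("this continuation applies only for q2 > 4 m^2")
    u = math.asinh(math.sqrt(ratio - 1.0))
    return complex(math.pi / 2.0, u)


def finite_part(theta):
    """Bracket of (33): (4m^2 + 2q^2)/(3q^2) * (1 - theta/tan(theta)) - 1/9.

    With q^2 = 4 m^2 sin^2(theta) the prefactor is (1 + 2 sin^2)/(3 sin^2).
    """
    s2 = cmath.sin(theta) ** 2
    return (1.0 + 2.0 * s2) / (3.0 * s2) * (1.0 - theta / cmath.tan(theta)) - 1.0 / 9.0


if __name__ == "__main__":
    q2, m = 9.0, 1.0                      # sample point above threshold
    theta = theta_above_threshold(q2, m)
    print(cmath.sin(theta) ** 2)          # should reproduce q2 / (4 m^2)
    print(-1.0 / cmath.tan(theta))        # should equal i*sqrt(q2 - 4m^2)/sqrt(q2)
    print(finite_part(theta))             # complex above threshold
    print(finite_part(1.0e-3))            # close to zero for small q2
```

The imaginary part appears only once $\theta$ leaves the real axis, that is, only for $q^2 > 4m^2$.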

Closed loops containing a number of quanta or potential
interactions larger than two produce no trouble.
Any loop with an odd number of interactions gives zero
(I, reference 9). Four or more potential interactions give
integrals which are convergent even without a convergence
factor as is well known. The situation is
analogous to that for self-energy. Once the simple
problem of a single closed loop is solved there are
no further divergence difficulties for more complex
processes.²²


8. LONGITUDINAL WAVES 


In the usual form of quantum electrodynamics the
longitudinal and transverse waves are given separate
treatment. Alternately the condition $(\partial A_\mu/\partial x_\mu)\Psi = 0$ is
carried along as a supplementary condition. In the
present form no such special considerations are necessary
for we are dealing with the solutions of the equation
$-\Box^2 A_\mu = 4\pi j_\mu$ with a current $j_\mu$ which is conserved,
$\partial j_\mu/\partial x_\mu = 0$. That means at least $\Box^2(\partial A_\mu/\partial x_\mu) = 0$ and
in fact our solution also satisfies $\partial A_\mu/\partial x_\mu = 0$.

To show that this is the case we consider the amplitude
for emission (real or virtual) of a photon and show
that the divergence of this amplitude vanishes. The
amplitude for emission for photons polarized in the $\mu$
direction involves matrix elements of $\gamma_\mu$. Therefore
what we have to show is that the corresponding matrix
elements of $q_\mu\gamma_\mu = \not{q}$ vanish. For example, for a first
order effect we would require the matrix element of $\not{q}$
between two states $p_1$ and $p_2 = p_1 + q$. But since
$\not{q} = \not{p}_2 - \not{p}_1$ and $(\bar{u}_2\not{p}_1u_1) = m(\bar{u}_2u_1) = (\bar{u}_2\not{p}_2u_1)$ the matrix
element vanishes, which proves the contention in this
case. It also vanishes in more complex situations
(essentially because of relation (34), below) (for example, try
putting $e_2 = q_2$ in the matrix (15) for the Compton
Effect).

22 There are loops completely without external interactions. For
example, a pair is created virtually along with a photon. Next they
annihilate, absorbing this photon. Such loops are disregarded on
the grounds that they do not interact with anything and are
thereby completely unobservable. Any indirect effects they may
have via the exclusion principle have already been included.

To prove this in general, suppose $a_i$, $i = 1$ to $N$, are a
set of plane wave disturbing potentials carrying momenta
$q_i$ (e.g., some may be emissions or absorptions of
the same or different quanta) and consider a matrix for
the transition from a state of momentum $p_0$ to $p_N$ such
as $\not{a}_N\prod_{i=1}^{N-1}(\not{p}_i - m)^{-1}\not{a}_i$ where $p_i = p_{i-1} + q_i$ (and in the
product, terms with larger $i$ are written to the left).
The most general matrix element is simply a linear
combination of these. Next consider the matrix between
states $p_0$ and $p_N + q$ in a situation in which not
only are the $a_i$ acting but also another potential
$a\exp(-iq\cdot x)$ where $a = q$. This may act previous to all $a_i$,
in which case it gives $\not{a}_N\prod(\not{p}_i + \not{q} - m)^{-1}\not{a}_i\,(\not{p}_0 + \not{q} - m)^{-1}\not{q}$
which is equivalent to $+\not{a}_N\prod(\not{p}_i + \not{q} - m)^{-1}\not{a}_i$ since
$+(\not{p}_0 + \not{q} - m)^{-1}\not{q}$ is equivalent to
$(\not{p}_0 + \not{q} - m)^{-1}(\not{p}_0 + \not{q} - m)$ as $\not{p}_0$ is equivalent to $m$ acting on the
initial state. Likewise if it acts after all the potentials
it gives $\not{q}(\not{p}_N - m)^{-1}\not{a}_N\prod(\not{p}_i - m)^{-1}\not{a}_i$ which is equivalent
to $-\not{a}_N\prod(\not{p}_i - m)^{-1}\not{a}_i$ since $\not{p}_N + \not{q} - m$ gives zero on the
final state. Or again it may act between the potential
$a_k$ and $a_{k+1}$ for each $k$. This gives

$$\sum_{k=1}^{N-1}\not{a}_N\prod_{i=k+1}^{N-1}(\not{p}_i + \not{q} - m)^{-1}\not{a}_i\,(\not{p}_k + \not{q} - m)^{-1}\not{q}(\not{p}_k - m)^{-1}\not{a}_k\prod_{j=1}^{k-1}(\not{p}_j - m)^{-1}\not{a}_j.$$

However,

$$(\not{p}_k + \not{q} - m)^{-1}\not{q}(\not{p}_k - m)^{-1} = (\not{p}_k - m)^{-1} - (\not{p}_k + \not{q} - m)^{-1}, \tag{34}$$

so that the sum breaks into the difference of two sums,
the first of which may be converted to the other by the
replacement of $k$ by $k - 1$. There remain only the terms
from the ends of the range of summation,

$$+\not{a}_N\prod_{i=1}^{N-1}(\not{p}_i - m)^{-1}\not{a}_i - \not{a}_N\prod_{i=1}^{N-1}(\not{p}_i + \not{q} - m)^{-1}\not{a}_i.$$

These cancel the two terms originally discussed so that
the entire effect is zero. Hence any wave emitted will
satisfy $\partial A_\mu/\partial x_\mu = 0$. Likewise longitudinal waves (that
is, waves for which $A_\mu = \partial\varphi/\partial x_\mu$ or $a = q$) cannot be
absorbed and will have no effect, for the matrix elements
for emission and absorption are similar. (We
have said little more than that a potential $A_\mu = \partial\varphi/\partial x_\mu$
has no effect on a Dirac electron since a transformation
$\psi' = \exp(-i\varphi)\psi$ removes it. It is also easy to see in
coordinate representation using integrations by parts.)
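Relation (34) is a purely algebraic identity, and it can be checked numerically. The following is a minimal sketch under stated assumptions: the Dirac representation of the $\gamma$ matrices, the metric signature $(+,-,-,-)$, the sample momenta, and all helper names are choices made here for illustration. The inverse $(\not{p} - m)^{-1}$ is formed by rationalization as $(\not{p} + m)/(p\cdot p - m^2)$, so no matrix inversion routine is needed.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # components ordered (t, x, y, z)


def zeros():
    return [[0j] * 4 for _ in range(4)]


def identity():
    return [[1 + 0j if r == c else 0j for c in range(4)] for r in range(4)]


def add(a, b, scale_b=1.0):
    """Return a + scale_b * b for 4x4 matrices."""
    return [[a[r][c] + scale_b * b[r][c] for c in range(4)] for r in range(4)]


def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def gamma_matrices():
    """Gamma matrices in the Dirac representation, ordered (t, x, y, z)."""
    pauli = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
    g_t = [[(1 if r < 2 else -1) * (1 + 0j) if r == c else 0j
            for c in range(4)] for r in range(4)]
    gammas = [g_t]
    for sigma in pauli:
        g = zeros()
        for r in range(2):
            for c in range(2):
                g[r][c + 2] = complex(sigma[r][c])
                g[r + 2][c] = -complex(sigma[r][c])
        gammas.append(g)
    return gammas


GAMMAS = gamma_matrices()


def dot(a, b):
    return sum(s * x * y for s, x, y in zip(METRIC, a, b))


def slash(p):
    """p_mu gamma_mu = p_t*gamma_t - p_x*gamma_x - p_y*gamma_y - p_z*gamma_z."""
    out = zeros()
    for sign, comp, g in zip(METRIC, p, GAMMAS):
        out = add(out, g, sign * comp)
    return out


def propagator(p, m):
    """(pslash - m)^(-1), rationalized as (pslash + m) / (p.p - m^2)."""
    numerator = add(slash(p), identity(), m)
    factor = 1.0 / (dot(p, p) - m * m)
    return [[factor * numerator[r][c] for c in range(4)] for r in range(4)]


def identity_34_residual(p, q, m):
    """Largest entry of |lhs - rhs| in relation (34)."""
    p_plus_q = [a + b for a, b in zip(p, q)]
    lhs = mul(mul(propagator(p_plus_q, m), slash(q)), propagator(p, m))
    rhs = add(propagator(p, m), propagator(p_plus_q, m), -1.0)
    return max(abs(lhs[r][c] - rhs[r][c]) for r in range(4) for c in range(4))


if __name__ == "__main__":
    print(identity_34_residual((0.3, 0.1, -0.2, 0.4), (0.5, 0.2, 0.1, -0.3), 1.0))
```

The identity holds for arbitrary off-shell $p$ and $q$, since it only uses $\not{q} = (\not{p} + \not{q} - m) - (\not{p} - m)$.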

This has a useful practical consequence in that in
computing probabilities for transition for unpolarized
light one can sum the squared matrix over all four
directions rather than just the two special polarization
vectors. Thus suppose the matrix element for some
process for light polarized in direction $e_\mu$ is $e_\mu M_\mu$. If the
light has wave vector $q_\mu$ we know from the argument
above that $q_\mu M_\mu = 0$. For unpolarized light progressing
in the $z$ direction we would ordinarily calculate
$M_x^2 + M_y^2$. But we can as well sum $M_x^2 + M_y^2 + M_z^2 - M_t^2$
for $q_\mu M_\mu = 0$ implies $M_t = M_z$ since $q_t = q_z$ for free quanta.
This shows that unpolarized light is a relativistically
invariant concept, and permits some simplification in
computing cross sections for such light.
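The equality of the two sums can be illustrated with a few numbers. In this sketch the components of $M_\mu$ are arbitrary sample values invented for the example, squared moduli are used in place of $M^2$ so that complex amplitudes are allowed (an assumption made here), and the function names are hypothetical.

```python
def divergence(q, amp):
    """q_mu M_mu = q_t*M_t - q_x*M_x - q_y*M_y - q_z*M_z, components ordered (t, x, y, z)."""
    return q[0] * amp[0] - q[1] * amp[1] - q[2] * amp[2] - q[3] * amp[3]


def transverse_sum(amp):
    """|M_x|^2 + |M_y|^2 for light travelling along z."""
    return abs(amp[1]) ** 2 + abs(amp[2]) ** 2


def four_direction_sum(amp):
    """|M_x|^2 + |M_y|^2 + |M_z|^2 - |M_t|^2."""
    return abs(amp[1]) ** 2 + abs(amp[2]) ** 2 + abs(amp[3]) ** 2 - abs(amp[0]) ** 2


if __name__ == "__main__":
    omega = 2.0
    q = (omega, 0.0, 0.0, omega)          # free quantum along z: q_t = q_z
    shared = 0.7 - 0.2j                   # M_t = M_z, as forced by q_mu M_mu = 0
    amp = (shared, 1.0 + 2.0j, -0.5j, shared)
    print(divergence(q, amp))
    print(transverse_sum(amp), four_direction_sum(amp))
```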


Incidentally, the virtual quanta interact through
terms like $\gamma_\mu\cdots\gamma_\mu\,k^{-2}d^4k$. Real processes correspond to
poles in the formulae for virtual processes. The pole
occurs when $k^2 = 0$, but it looks at first as though in the
sum on all four values of $\mu$, of $\gamma_\mu\cdots\gamma_\mu$ we would have
four kinds of polarization instead of two. Now it is clear
that only two perpendicular to $k$ are effective.

The usual elimination of longitudinal and scalar virtual
photons (leading to an instantaneous Coulomb
potential) can of course be performed here too (although
it is not particularly useful). A typical term in a virtual
transition is $\gamma_\mu\cdots\gamma_\mu\,k^{-2}d^4k$ where the $\cdots$ represent
some intervening matrices. Let us choose for the values
of $\mu$, the time $t$, the direction of vector part $\mathbf{K}$, of $k$,
and two perpendicular directions 1, 2. We shall not
change the expression for these two 1, 2 for these are
represented by transverse quanta. But we must find
$(\gamma_t\cdots\gamma_t) - (\gamma_K\cdots\gamma_K)$. Now $\not{k} = k_4\gamma_t - K\gamma_K$, where
$K = (\mathbf{K}\cdot\mathbf{K})^{1/2}$, and we have shown above that $\not{k}$ replacing
the $\gamma_\mu$ gives zero.²³ Hence $K\gamma_K$ is equivalent to $k_4\gamma_t$ and

$$(\gamma_t\cdots\gamma_t) - (\gamma_K\cdots\gamma_K) = ((K^2 - k_4^2)/K^2)(\gamma_t\cdots\gamma_t),$$

so that on multiplying by $k^{-2}d^4k = d^4k\,(k_4^2 - K^2)^{-1}$ the net
effect is $-(\gamma_t\cdots\gamma_t)\,d^4k/K^2$. The $\gamma_t$ means just scalar
waves, that is, potentials produced by charge density.
The fact that $1/K^2$ does not contain $k_4$ means that $k_4$
can be integrated first, resulting in an instantaneous
interaction, and the $d^3\mathbf{K}/K^2$ is just the momentum
representation of the Coulomb potential, $1/r$.


9. KLEIN GORDON EQUATION 


The methods may be readily extended to particles of
spin zero satisfying the Klein Gordon equation,²⁴

$$\Box^2\psi - m^2\psi = i\,\partial(A_\mu\psi)/\partial x_\mu + iA_\mu\,\partial\psi/\partial x_\mu - A_\mu A_\mu\psi. \tag{35}$$


23 A little more care is required when both $\gamma_\mu$'s act on the same
particle. Define $\not{x} = k_4\gamma_t + K\gamma_K$, and consider $(\not{k}\cdots\not{x}) + (\not{x}\cdots\not{k})$.
Exactly this term would arise if a system, acted on by potential $x$
carrying momentum $-k$, is disturbed by an added potential $k$ of
momentum $+k$ (the reversed sign of the momenta in the intermediate
factors in the second term $\not{x}\cdots\not{k}$ has no effect since we
will later integrate over all $k$). Hence as shown above the result is
zero, but since $(\not{k}\cdots\not{x}) + (\not{x}\cdots\not{k}) = 2k_4^2(\gamma_t\cdots\gamma_t) - 2K^2(\gamma_K\cdots\gamma_K)$
we can still conclude $(\gamma_K\cdots\gamma_K) = k_4^2K^{-2}(\gamma_t\cdots\gamma_t)$.

24 The equations discussed in this section were deduced from the
formulation of the Klein Gordon equation given in reference 5,
Section 14. The function $\psi$ in this section has only one component
and is not a spinor. An alternative formal method of making the
equations valid for spin zero and also for spin 1 is (presumably)
by use of the Kemmer-Duffin matrices $\beta_\mu$, satisfying the
commutation relation

$$\beta_\mu\beta_\nu\beta_\sigma + \beta_\sigma\beta_\nu\beta_\mu = \delta_{\mu\nu}\beta_\sigma + \delta_{\sigma\nu}\beta_\mu.$$

If we interpret $\not{a}$ to mean $a_\mu\beta_\mu$, rather than $a_\mu\gamma_\mu$, for any $a_\mu$, all
of the equations in momentum space will remain formally identical
to those for the spin 1/2; with the exception of those in which a
denominator $(\not{p} - m)^{-1}$ has been rationalized to $(\not{p} + m)(p^2 - m^2)^{-1}$
since $\not{p}^2$ is no longer equal to a number, $p\cdot p$. But $\not{p}^3$ does equal
$(p\cdot p)\not{p}$ so that $(\not{p} - m)^{-1}$ may now be interpreted as
$(m\not{p} + m^2 + \not{p}^2 - p\cdot p)(p\cdot p - m^2)^{-1}m^{-1}$. This implies that equations in
coordinate space will be valid if the function $K_+(2, 1)$ is given as
$K_+(2, 1) = [(i\nabla_2 + m) - m^{-1}(\nabla_2^2 + \Box_2^2)]\,iI_+(2, 1)$ with $\nabla_2 = \beta_\mu\,\partial/\partial x_{2\mu}$.
This is all in virtue of the fact that the many component wave
function $\psi$ (5 components for spin 0, 10 for spin 1) satisfies
$(i\nabla - m)\psi = \not{A}\psi$ which is formally identical to the Dirac Equation.
See W. Pauli, Rev. Mod. Phys. 13, 203 (1940).

The important kernel is now $I_+(2, 1)$ defined in (I, Eq.
(32)). For a free particle, the wave function $\psi(2)$ satisfies
$\Box^2\psi - m^2\psi = 0$. At a point, 2, inside a space time region
it is given by

$$\psi(2) = \int[\psi(1)\,\partial I_+(2, 1)/\partial x_{1\mu} - (\partial\psi/\partial x_{1\mu})\,I_+(2, 1)]\,N_\mu(1)\,d^3V_1,$$

(as is readily shown by the usual method of demonstrating
Green's theorem) the integral being over an
entire 3-surface boundary of the region (with normal
vector $N_\mu$). Only the positive frequency components of
$\psi$ contribute from the surface preceding the time
corresponding to 2, and only negative frequencies from the
surface future to 2. These can be interpreted as electrons
and positrons in direct analogy to the Dirac case.

The right-hand side of (35) can be considered as a
source of new waves and a series of terms written down
to represent matrix elements for processes of increasing
order. There is only one new point here, the term in
$A_\mu A_\mu$ by which two quanta can act at the same time.
As an example, suppose three quanta or potentials,
$a_\mu\exp(-iq_a\cdot x)$, $b_\mu\exp(-iq_b\cdot x)$, and $c_\mu\exp(-iq_c\cdot x)$ are
to act in that order on a particle of original momentum
$p_{0\mu}$ so that $p_a = p_0 + q_a$ and $p_b = p_a + q_b$; the final
momentum being $p_c = p_b + q_c$. The matrix element is the
sum of three terms ($p^2 = p_\mu p_\mu$) (illustrated in Fig. 7)

$$(p_c\cdot c + p_b\cdot c)(p_b^2 - m^2)^{-1}(p_b\cdot b + p_a\cdot b)(p_a^2 - m^2)^{-1}(p_a\cdot a + p_0\cdot a)$$
$$-\,(p_c\cdot c + p_b\cdot c)(p_b^2 - m^2)^{-1}(b\cdot a) \tag{36}$$
$$-\,(c\cdot b)(p_a^2 - m^2)^{-1}(p_a\cdot a + p_0\cdot a).$$

The first comes when each potential acts through the
perturbation $i\,\partial(A_\mu\psi)/\partial x_\mu + iA_\mu\,\partial\psi/\partial x_\mu$. These gradient
operators in momentum space mean respectively the
momentum after and before the potential $A_\mu$ operates.
The second term comes from $b_\mu$ and $a_\mu$ acting at the
same instant and arises from the $A_\mu A_\mu$ term in (35).
Together $b_\mu$ and $a_\mu$ carry momentum $q_{b\mu} + q_{a\mu}$ so that
after $b\cdot a$ operates the momentum is $p_0 + q_a + q_b$ or $p_b$.
The final term comes from $c_\mu$ and $b_\mu$ operating together
in a similar manner. The term $A_\mu A_\mu$ thus permits a new
type of process in which two quanta can be emitted (or
absorbed, or one absorbed, one emitted) at the same
time. There is no $a\cdot c$ term for the order $a$, $b$, $c$ we have
assumed. In an actual problem there would be other
terms like (36) but with alterations in the order in
which the quanta $a$, $b$, $c$ act. In these terms $a\cdot c$ would
appear.
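The bookkeeping behind (36) can be sketched in a few lines. This is an illustrative transcription under stated assumptions: four-vectors are tuples ordered $(t, x, y, z)$ with signature $(+,-,-,-)$, the potentials are taken real, the sample numbers are invented for the example, and the function names are hypothetical. Only the single ordering $a$, $b$, $c$ is included, as in the text.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # components ordered (t, x, y, z)


def dot(u, v):
    """Four-vector product u.v = u_t*v_t - u_x*v_x - u_y*v_y - u_z*v_z."""
    return sum(s * x * y for s, x, y in zip(METRIC, u, v))


def add(u, v):
    return tuple(x + y for x, y in zip(u, v))


def amplitude_36(p0, m, a, qa, b, qb, c, qc):
    """Sum of the three terms of (36) for potentials acting in the order a, b, c."""
    pa = add(p0, qa)
    pb = add(pa, qb)
    pc = add(pb, qc)
    prop_a = 1.0 / (dot(pa, pa) - m * m)        # propagation with momentum p_a
    prop_b = 1.0 / (dot(pb, pb) - m * m)        # propagation with momentum p_b
    vert_a = dot(pa, a) + dot(p0, a)            # momentum after + before a
    vert_b = dot(pb, b) + dot(pa, b)
    vert_c = dot(pc, c) + dot(pb, c)
    one_at_a_time = vert_c * prop_b * vert_b * prop_a * vert_a
    b_and_a_together = -vert_c * prop_b * dot(b, a)
    c_and_b_together = -dot(c, b) * prop_a * vert_a
    return one_at_a_time + b_and_a_together + c_and_b_together


if __name__ == "__main__":
    p0 = (1.5, 0.2, 0.1, 0.3)
    a, qa = (0.0, 1.0, 0.0, 0.0), (0.4, 0.1, 0.0, -0.2)
    b, qb = (0.0, 0.0, 1.0, 0.0), (-0.3, 0.2, 0.1, 0.0)
    c, qc = (0.0, 1.0, 1.0, 0.0), (0.2, -0.1, 0.0, 0.1)
    print(amplitude_36(p0, 1.0, a, qa, b, qb, c, qc))
```

The expression is linear in each of the three potentials, which gives a simple consistency check on any implementation.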

As a further example the self-energy of a particle of
momentum $p_\mu$ is

$$(e^2/2\pi im)\int[(2p - k)_\mu((p - k)^2 - m^2)^{-1}(2p - k)_\mu - \delta_{\mu\mu}]\,d^4k\,k^{-2}C(k^2),$$

where the $\delta_{\mu\mu} = 4$ comes from the $A_\mu A_\mu$ term and represents
the possibility of the simultaneous emission and
absorption of the same virtual quantum. This integral
without the $C(k^2)$ diverges quadratically and would not
converge if $C(k^2) = -\lambda^2/(k^2 - \lambda^2)$. Since the interaction
occurs through the gradients of the potential, we must
use a stronger convergence factor, for example
$C(k^2) = \lambda^4(k^2 - \lambda^2)^{-2}$, or in general (17) with $\int_0^\infty\lambda^2G(\lambda)\,d\lambda = 0$.
In this case the self-energy converges but depends
quadratically on the cut-off $\lambda$ and is not necessarily
small compared to $m$. The radiative corrections to
scattering after mass renormalization are insensitive to
the cut-off just as for the Dirac equation.

When there are several particles one can obtain Bose
statistics by the rule that if two processes lead to the
same state but with two electrons exchanged, their
amplitudes are to be added (rather than subtracted as
for Fermi statistics). In this case equivalence to the
second quantization treatment of Pauli and Weisskopf
should be demonstrable in a way very much like that
given in I (appendix) for Dirac electrons. The Bose
statistics mean that the sign of contribution of a closed
loop to the vacuum polarization is the opposite of what
it is for the Fermi case (see I). It is ($p_b = p_a + q$)

$$J_{\mu\nu} = \frac{e^2}{2\pi im}\int[(p_{b\mu} + p_{a\mu})(p_{b\nu} + p_{a\nu})(p_a^2 - m^2)^{-1}(p_b^2 - m^2)^{-1} - \delta_{\mu\nu}(p_a^2 - m^2)^{-1} - \delta_{\mu\nu}(p_b^2 - m^2)^{-1}]\,d^4p_a$$

giving,

$$J_{\mu\nu}^P = \frac{e^2}{\pi}(q_\mu q_\nu - \delta_{\mu\nu}q^2)\left[\frac{1}{6}\ln\frac{\lambda^2}{m^2} + \frac{1}{9} - \frac{4m^2 - q^2}{3q^2}\left(1 - \frac{\theta}{\tan\theta}\right)\right],$$

the notation as in (33). The imaginary part for $(q^2)^{1/2} > 2m$
is again positive representing the loss in the probability
of finding the final state to be a vacuum, associated with
the possibilities of pair production. Fermi statistics
would give a gain in probability (and also a charge
renormalization of opposite sign to that expected).


FIG. 7. Klein-Gordon particle in three potentials, Eq. (36).
The coupling to the electromagnetic field is now, for example,
$p_0\cdot a + p_a\cdot a$, and a new possibility arises, (b), of simultaneous
interaction with two quanta $a\cdot b$. The propagation factor is now
$(p\cdot p - m^2)^{-1}$ for a particle of momentum $p_\mu$.


10. APPLICATION TO MESON THEORIES


The theories which have been developed to describe
mesons and the interaction of nucleons can be easily
expressed in the language used here. Calculations to
lowest order in the interactions can be made very easily
for the various theories, but agreement with experimental
results is not obtained. Most likely all of our
present formulations are quantitatively unsatisfactory.
We shall content ourselves therefore with a brief summary
of the methods which can be used.

The nucleons are usually assumed to satisfy Dirac's
equation so that the factor for propagation of a nucleon
of momentum $p$ is $(\not{p} - M)^{-1}$ where $M$ is the mass of the
nucleon (which implies that nucleons can be created in
pairs). The nucleon is then assumed to interact with
mesons, the various theories differing in the form assumed
for this interaction.

First, we consider the case of neutral mesons. The
theory closest to electrodynamics is the theory of vector
mesons with vector coupling. Here the factor for emission
or absorption of a meson is $g\gamma_\mu$ when this meson is
"polarized" in the $\mu$ direction. The factor $g$, the
"mesonic charge," replaces the electric charge $e$. The
amplitude for propagation of a meson of momentum $q$
in intermediate states is $(q^2 - \mu^2)^{-1}$ (rather than $q^{-2}$ as it
is for light) where $\mu$ is the mass of the meson. The necessary
integrals are made finite by convergence factors
$C(q^2 - \mu^2)$ as in electrodynamics. For scalar mesons with
scalar coupling the only change is that one replaces the
$\gamma_\mu$ by 1 in emission and absorption. There is no longer
a direction of polarization, $\mu$, to sum upon. For
pseudoscalar mesons, pseudoscalar coupling replace $\gamma_\mu$ by
$\gamma_5 = i\gamma_x\gamma_y\gamma_z\gamma_t$. For example, the self-energy matrix of
a nucleon of momentum $p$ in this theory is

$$(g^2/\pi i)\int\gamma_5(\not{p} - \not{k} - M)^{-1}\gamma_5\,d^4k\,(k^2 - \mu^2)^{-1}C(k^2 - \mu^2).$$

Other types of meson theory result from the replacement
of $\gamma_\mu$ by other expressions (for example by
$\tfrac12(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu)$ with a subsequent sum over all $\mu$ and $\nu$
for virtual mesons). Scalar mesons with vector coupling
result from the replacement of $\gamma_\mu$ by $\mu^{-1}\not{q}$ where $q$ is the
final momentum of the nucleon minus its initial momentum,
that is, it is the momentum of the meson if
absorbed, or the negative of the momentum of a meson
emitted. As is well known, this theory with neutral
mesons gives zero for all processes, as is proved by our
discussion on longitudinal waves in electrodynamics.
Pseudoscalar mesons with pseudo-vector coupling corresponds
to $\gamma_\mu$ being replaced by $\mu^{-1}\gamma_5\not{q}$ while vector
mesons with tensor coupling correspond to using
$(2\mu)^{-1}(\gamma_\mu\not{q} - \not{q}\gamma_\mu)$. These extra gradients involve the
danger of producing higher divergencies for real processes.
For example, $\gamma_5\not{q}$ gives a logarithmically divergent
interaction of neutron and electron.²⁵ Although these
divergencies can be held by strong enough convergence
factors, the results then are sensitive to the method used
for convergence and the size of the cut-off values of $\lambda$.
For low order processes $\mu^{-1}\gamma_5\not{q}$ is equivalent to the
pseudoscalar interaction $2M\mu^{-1}\gamma_5$ because if taken between
free particle wave functions of the nucleon of
momenta $p_1$ and $p_2 = p_1 + q$, we have

$$(\bar{u}_2\gamma_5\not{q}u_1) = (\bar{u}_2\gamma_5(\not{p}_2 - \not{p}_1)u_1) = -(\bar{u}_2\not{p}_2\gamma_5u_1) - (\bar{u}_2\gamma_5\not{p}_1u_1) = -2M(\bar{u}_2\gamma_5u_1)$$

since $\gamma_5$ anticommutes with $\not{p}_2$ and $\not{p}_2$ operating on the
state 2 is equivalent to $M$ as is $\not{p}_1$ on the state 1. This
shows that the $\gamma_5$ interaction is unusually weak in the
non-relativistic limit (for example the expected value
of $\gamma_5$ for a free nucleon is zero), but since $\gamma_5^2 = 1$ is not
small, pseudoscalar theory gives a more important interaction
in second order than it does in first. Thus the
pseudoscalar coupling constant should be chosen to fit
nuclear forces including these important second order
processes.²⁶ The equivalence of pseudoscalar and
pseudovector coupling which holds for low order processes
therefore does not hold when the pseudoscalar theory
is giving its most important effects. These theories will
therefore give quite different results in the majority of
practical problems.

25 M. Slotnick and W. Heitler, Phys. Rev. 75, 1645 (1949).
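The first-order equivalence $(\bar{u}_2\gamma_5\not{q}u_1) = -2M(\bar{u}_2\gamma_5u_1)$ can be checked numerically for free-particle spinors. The sketch below is an illustration under stated assumptions: the Dirac representation, the signature $(+,-,-,-)$, spinors built as $u = (\not{p} + M)w$ from an arbitrary seed column $w$, the sample momenta, and all helper names are choices made here and do not come from the text.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # components ordered (t, x, y, z)


def zeros():
    return [[0j] * 4 for _ in range(4)]


def add(a, b, scale_b=1.0):
    return [[a[r][c] + scale_b * b[r][c] for c in range(4)] for r in range(4)]


def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def gamma_matrices():
    """Gamma matrices in the Dirac representation, ordered (t, x, y, z)."""
    pauli = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
    g_t = [[(1 if r < 2 else -1) * (1 + 0j) if r == c else 0j
            for c in range(4)] for r in range(4)]
    gammas = [g_t]
    for sigma in pauli:
        g = zeros()
        for r in range(2):
            for c in range(2):
                g[r][c + 2] = complex(sigma[r][c])
                g[r + 2][c] = -complex(sigma[r][c])
        gammas.append(g)
    return gammas


GAMMAS = gamma_matrices()
IDENTITY = [[1 + 0j if r == c else 0j for c in range(4)] for r in range(4)]


def slash(p):
    out = zeros()
    for sign, comp, g in zip(METRIC, p, GAMMAS):
        out = add(out, g, sign * comp)
    return out


def gamma5():
    """gamma_5 = i * gamma_x * gamma_y * gamma_z * gamma_t."""
    g_t, g_x, g_y, g_z = GAMMAS
    prod = mul(mul(mul(g_x, g_y), g_z), g_t)
    return [[1j * prod[r][c] for c in range(4)] for r in range(4)]


def on_shell(three_momentum, mass):
    px, py, pz = three_momentum
    return (math.sqrt(mass ** 2 + px ** 2 + py ** 2 + pz ** 2), px, py, pz)


def spinor(p, mass, seed):
    """u = (pslash + M) w, which satisfies pslash u = M u for on-shell p."""
    matrix = add(slash(p), IDENTITY, mass)
    return [sum(matrix[r][k] * seed[k] for k in range(4)) for r in range(4)]


def bar(u):
    """Adjoint spinor u-bar = u^dagger gamma_t, as a row."""
    g_t = GAMMAS[0]
    return [sum(u[k].conjugate() * g_t[k][c] for k in range(4)) for c in range(4)]


def sandwich(ubar, matrix, u):
    return sum(ubar[r] * matrix[r][c] * u[c] for r in range(4) for c in range(4))


def both_sides(mass=1.0):
    p1 = on_shell((0.3, -0.1, 0.2), mass)
    p2 = on_shell((-0.4, 0.5, 0.1), mass)
    q = tuple(x - y for x, y in zip(p2, p1))
    u1 = spinor(p1, mass, [1 + 0j, 0j, 0j, 0j])
    u2bar = bar(spinor(p2, mass, [0j, 1 + 0j, 0j, 0j]))
    lhs = sandwich(u2bar, mul(gamma5(), slash(q)), u1)
    rhs = -2.0 * mass * sandwich(u2bar, gamma5(), u1)
    return lhs, rhs


if __name__ == "__main__":
    print(both_sides())
```

The check relies on both momenta being on the mass shell; for virtual nucleons the two couplings differ, which is the point made in the paragraph above.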

In calculating the corrections to scattering of a nucleon
by a neutral vector meson field ($\gamma_\mu$) due to the
effects of virtual mesons, the situation is just as in
electrodynamics, in that the result converges without
need for a cut-off and depends only on gradients of the
meson potential. With scalar (1) or pseudoscalar ($\gamma_5$)
neutral mesons the result diverges logarithmically and
so must be cut off. The part sensitive to the cut-off,
however, is directly proportional to the meson potential.
It may thereby be removed by a renormalization
of mesonic charge $g$. After this renormalization the results
depend only on gradients of the meson potential
and are essentially independent of cut-off. This is in
addition to the mesonic charge renormalization coming
from the production of virtual nucleon pairs by a meson,
analogous to the vacuum polarization in electrodynamics.
But here there is a further difference from
electrodynamics for scalar or pseudoscalar mesons in
that the polarization also gives a term in the induced
current proportional to the meson potential representing
therefore an additional renormalization of the mass of
the meson which usually depends quadratically on the
cut-off.

Next consider charged mesons in the absence of an
electromagnetic field. One can introduce isotopic spin
operators in an obvious way. (Specifically replace the
neutral $\gamma_5$, say, by $\tau_i\gamma_5$ and sum over $i = 1, 2$ where
$\tau_1 = \tau_+ + \tau_-$, $\tau_2 = i(\tau_+ - \tau_-)$ and $\tau_+$ changes neutron to
proton ($\tau_+$ on proton $= 0$) and $\tau_-$ changes proton to
neutron.) It is just as easy for practical problems simply
to keep track of whether the particle is a proton or a
neutron on a diagram drawn to help write down the
matrix element. This excludes certain processes. For
example in the scattering of a negative meson from $q_1$
to $q_2$ by a neutron, the meson $q_2$ must be emitted first
(in order of operators, not time) for the neutron cannot
absorb the negative meson $q_1$ until it becomes a proton.
That is, in comparison to the Klein Nishina formula (15),
only the analogue of second term (see Fig. 5(b)) would
appear in the scattering of negative mesons by neutrons,
and only the first term (Fig. 5(a)) in the neutron
scattering of positive mesons.

26 H. A. Bethe, Bull. Am. Phys. Soc. 24, 3, Z3 (Washington,
1949).

The source of mesons of a given charge is not conserved,
for a neutron capable of emitting negative mesons
may (on emitting one, say) become a proton no
longer able to do so. The proof that a perturbation $\not{q}$
gives zero, discussed for longitudinal electromagnetic
waves, fails. This has the consequence that vector mesons,
if represented by the interaction $\gamma_\mu$ would not
satisfy the condition that the divergence of the potential
is zero. The interaction is to be taken²⁷ as $\gamma_\mu - \mu^{-2}q_\mu\not{q}$
in emission and as $\gamma_\mu$ in absorption if the real emission
of mesons with a non-zero divergence of potential is to
be avoided. (The correction term $\mu^{-2}q_\mu\not{q}$ gives zero in
the neutral case.) The asymmetry in emission and absorption
is only apparent, as this is clearly the same
thing as subtracting from the original $\gamma_\mu\cdots\gamma_\mu$, a term
$\mu^{-2}\not{q}\cdots\not{q}$. That is, if the term $-\mu^{-2}q_\mu\not{q}$ is omitted the
resulting theory describes a combination of mesons of
spin one and spin zero. The spin zero mesons, coupled
by vector coupling $\not{q}$, are removed by subtracting the
term $\mu^{-2}\not{q}\cdots\not{q}$.

The two extra gradients $\not{q}\cdots\not{q}$ make the problem of
diverging integrals still more serious (for example the
interaction between two protons corresponding to the
exchange of two charged vector mesons depends quadratically
on the cut-off if calculated in a straightforward
way). One is tempted in this formulation to choose
simply $\gamma_\mu\cdots\gamma_\mu$ and accept the admixture of spin zero
mesons. But it appears that this leads in the conventional
formalism to negative energies for the spin zero
component. This shows one of the advantages of the
method of second quantization of meson fields over the
present formulation. There such errors of sign are obvious
while here we seem to be able to write seemingly
innocent expressions which can give absurd results.
Pseudovector mesons with pseudovector coupling correspond
to using $\gamma_5(\gamma_\mu - \mu^{-2}q_\mu\not{q})$ for absorption and $\gamma_5\gamma_\mu$
for emission for both charged and neutral mesons.

27 The vector meson field potentials $\varphi_\mu$ satisfy

$$-\partial/\partial x_\nu(\partial\varphi_\mu/\partial x_\nu - \partial\varphi_\nu/\partial x_\mu) - \mu^2\varphi_\mu = -4\pi s_\mu,$$

where $s_\mu$, the source for such mesons, is the matrix element of
$\gamma_\mu$ between states of neutron and proton. By taking the divergence
$\partial/\partial x_\mu$ of both sides, conclude that $\partial\varphi_\nu/\partial x_\nu = 4\pi\mu^{-2}\,\partial s_\nu/\partial x_\nu$ so that
the original equation can be rewritten as

$$\Box^2\varphi_\mu - \mu^2\varphi_\mu = -4\pi(s_\mu + \mu^{-2}\,\partial/\partial x_\mu(\partial s_\nu/\partial x_\nu)).$$

The right hand side gives in momentum representation
$\gamma_\mu - \mu^{-2}q_\mu q_\nu\gamma_\nu$, the left yields the $(q^2 - \mu^2)^{-1}$ and finally the interaction
$s_\mu\varphi_\mu$ in the Lagrangian gives the $\gamma_\mu$ on absorption.

Proceeding in this way find generally that particles of spin one
can be represented by a four-vector $u_\mu$ (which, for a free particle
of momentum $q$ satisfies $q\cdot u = 0$). The propagation of virtual
particles of momentum $q$ from state $\nu$ to $\mu$ is represented by
multiplication by the 4-4 matrix (or tensor)
$P_{\mu\nu} = (\delta_{\mu\nu} - \mu^{-2}q_\mu q_\nu)(q^2 - \mu^2)^{-1}$. The first-order interaction (from the Proca equation)
with an electromagnetic potential $a\exp(-ik\cdot x)$ corresponds to
multiplication by the matrix $E_{\mu\nu} = (q_2\cdot a + q_1\cdot a)\delta_{\mu\nu} - q_{2\nu}a_\mu - q_{1\mu}a_\nu$
where $q_1$ and $q_2 = q_1 + k$ are the momenta before and after the
interaction. Finally, two potentials $a$, $b$ may act simultaneously,
with matrix $E'_{\mu\nu} = -(a\cdot b)\delta_{\mu\nu} + b_\mu a_\nu$.

In the presence of an electromagnetic field, whenever
the nucleon is a proton it interacts with the field in the
way described for electrons. The meson interacts in the
scalar or pseudoscalar case as a particle obeying the
Klein-Gordon equation. It is important here to use the
method of calculation of Bethe and Pauli, that is, a
virtual meson is assumed to have the same "mass" during
all its interactions with the electromagnetic field.
The result for mass $\mu$ and for $(\mu^2 + \lambda^2)^{1/2}$ are subtracted
and the difference integrated over the function $G(\lambda)\,d\lambda$.
A separate convergence factor is not provided for each
meson propagation between electromagnetic interactions,
otherwise gauge invariance is not insured. When
the coupling involves a gradient, such as $\gamma_5\not{q}$ where $q$ is
the final minus the initial momentum of the nucleon,
the vector potential $A$ must be subtracted from the
momentum of the proton. That is, there is an additional
coupling $\pm\gamma_5\not{A}$ (plus when going from proton to neutron,
minus for the reverse) representing the new possibility
of a simultaneous emission (or absorption) of
meson and photon.

Emission of positive or absorption of negative virtual 
mesons are represented in the same term, the sign of the 
charge being determined by temporal relations as for 
electrons and positrons. 

Calculations are very easily carried out in this way
to lowest order in $g^2$ for the various theories for nucleon
interaction, scattering of mesons by nucleons, meson
production by nuclear collisions and by gamma-rays,
nuclear magnetic moments, neutron electron scattering,
etc. However, no good agreement with experimental
results, when these are available, is obtained. Probably
all of the formulations are incorrect. An uncertainty
arises since the calculations are only to first order in $g^2$,
and are not valid if $g^2/\hbar c$ is large.

The author is particularly indebted to Professor H.
A. Bethe for his explanation of a method of obtaining
finite and gauge invariant results for the problem of
vacuum polarization. He is also grateful for Professor
Bethe’s criticisms of the manuscript, and for innumer- 
able discussions during the development of this work. 
He wishes to thank Professor J. Ashkin for his careful 
reading of the manuscript. 


APPENDIX 


In this appendix a method will be illustrated by which the
simpler integrals appearing in problems in electrodynamics can
be directly evaluated. The integrals arising in more complex
processes lead to rather complicated functions, but the study of 
the relations of one integral to another and their expression in 
terms of simpler integrals may be facilitated by the methods 
given here.
As a typical problem consider the integral (12) appearing in
the first order radiationless scattering problem:

$$\int \gamma_\mu(\not{p}_2-\not{k}-m)^{-1}\,\not{a}\,(\not{p}_1-\not{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k\,C(k^2), \tag{1a}$$

where we shall take $C(k^2)$ to be typically $-\lambda^2(k^2-\lambda^2)^{-1}$ and
$d^4k$ means $(2\pi)^{-2}dk_1dk_2dk_3dk_4$. We first rationalize the factors
$(\not{p}-\not{k}-m)^{-1} = (\not{p}-\not{k}+m)((p-k)^2-m^2)^{-1}$ obtaining,

$$\int \gamma_\mu(\not{p}_2-\not{k}+m)\,\not{a}\,(\not{p}_1-\not{k}+m)\gamma_\mu\,k^{-2}\,d^4k\,C(k^2)\,((p_1-k)^2-m^2)^{-1}((p_2-k)^2-m^2)^{-1}. \tag{2a}$$


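The matrix expression may be simplified. It appears to be best to
do so after the integrations are performed. Since $\not{A}\not{B} = 2A\cdot B - \not{B}\not{A}$
where $A\cdot B = A_\mu B_\mu$ is a number commuting with all matrices, find,
if R is any expression, and A a vector, since $\gamma_\mu\not{A} = -\not{A}\gamma_\mu + 2A_\mu$,

$$\gamma_\mu\not{A}R\gamma_\mu = -\not{A}\gamma_\mu R\gamma_\mu + 2R\not{A}. \tag{3a}$$

Expressions between two $\gamma_\mu$'s can be thereby reduced by induction.
Particularly useful are

$$\begin{aligned}
\gamma_\mu\gamma_\mu &= 4\\
\gamma_\mu\not{A}\gamma_\mu &= -2\not{A}\\
\gamma_\mu\not{A}\not{B}\gamma_\mu &= 2(\not{A}\not{B}+\not{B}\not{A}) = 4A\cdot B\\
\gamma_\mu\not{A}\not{B}\not{C}\gamma_\mu &= -2\not{C}\not{B}\not{A}
\end{aligned} \tag{4a}$$

where A, B, C are any three vector-matrices (i.e., linear combinations
of the four $\gamma$'s).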
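
The identities (4a) can be spot-checked numerically. The following is only a sketch under stated assumptions: it uses the Dirac representation of the $\gamma$ matrices and the sign convention $a\cdot b = a_tb_t - a_xb_x - a_yb_y - a_zb_z$; the helper names (`slash`, `sandwich`, `check_4a`) are illustrative and do not come from the paper.

```python
import random

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(4)] for i in range(4)]

def scale(c, X):
    return [[c * X[i][j] for j in range(4)] for i in range(4)]

def close(X, Y, tol=1e-9):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(4) for j in range(4))

IDENT = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

# Assumed Dirac representation; index 0 is the time component.
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA = [G0, G1, G2, G3]
SIGN = [1, -1, -1, -1]  # signs in a.b = a_t b_t - a_x b_x - a_y b_y - a_z b_z

def slash(a):
    """Vector-matrix a_mu gamma_mu built from the four components of a."""
    out = ZERO
    for mu in range(4):
        out = add(out, scale(SIGN[mu] * a[mu], GAMMA[mu]))
    return out

def dot(a, b):
    return sum(SIGN[mu] * a[mu] * b[mu] for mu in range(4))

def sandwich(X):
    """gamma_mu X gamma_mu, summed over mu with the metric signs."""
    out = ZERO
    for mu in range(4):
        out = add(out, scale(SIGN[mu], matmul(matmul(GAMMA[mu], X), GAMMA[mu])))
    return out

def check_4a(seed=0):
    """Return one boolean per line of (4a) for random real vectors A, B, C."""
    rng = random.Random(seed)
    A, B, C = ([rng.uniform(-1, 1) for _ in range(4)] for _ in range(3))
    As, Bs, Cs = slash(A), slash(B), slash(C)
    return (
        close(sandwich(IDENT), scale(4, IDENT)),
        close(sandwich(As), scale(-2, As)),
        close(sandwich(matmul(As, Bs)), scale(4 * dot(A, B), IDENT)),
        close(sandwich(matmul(matmul(As, Bs), Cs)),
              scale(-2, matmul(matmul(Cs, Bs), As))),
    )
```

Calling `check_4a()` with a few different seeds exercises all four lines of (4a) on random vectors.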

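In order to calculate the integral in (2a) the integral may be
written as the sum of three terms (since $\not{k} = k_\sigma\gamma_\sigma$),

$$\gamma_\mu(\not{p}_2+m)\not{a}(\not{p}_1+m)\gamma_\mu J_1 - [\gamma_\mu\gamma_\sigma\not{a}(\not{p}_1+m)\gamma_\mu + \gamma_\mu(\not{p}_2+m)\not{a}\gamma_\sigma\gamma_\mu]J_2 + \gamma_\mu\gamma_\sigma\not{a}\gamma_\tau\gamma_\mu J_3, \tag{5a}$$

where

$$J_{(1;2;3)} = \int (1;\,k_\sigma;\,k_\sigma k_\tau)\,k^{-2}\,d^4k\,C(k^2)\,((p_2-k)^2-m^2)^{-1}((p_1-k)^2-m^2)^{-1}. \tag{6a}$$

That is for $J_1$ the $(1;\,k_\sigma;\,k_\sigma k_\tau)$ is replaced by 1, for $J_2$ by $k_\sigma$, and
for $J_3$ by $k_\sigma k_\tau$.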

More complex processes of the first order involve more factors
like $((p_3-k)^2-m^2)^{-1}$ and a corresponding increase in the number
of k's which may appear in the numerator, as $k_\sigma k_\tau k_\nu\cdots$. Higher
order processes involving two or more virtual quanta involve
similar integrals but with factors possibly involving $k+k'$ instead
of just k, and the integral extending on $k^{-2}d^4k\,C(k^2)\,k'^{-2}d^4k'\,C(k'^2)$.
They can be simplified by methods analogous to those used on
the first order integrals.

The factors $(p-k)^2-m^2$ may be written

$$(p-k)^2-m^2 = k^2-2p\cdot k-\Delta, \tag{7a}$$

where $\Delta = m^2-p^2$, $\Delta_1 = m_1^2-p_1^2$, etc., and we can consider dealing
with cases of greater generality in that the different denominators
need not have the same value of the mass m. In our specific problem
(6a), $p_1^2 = m^2$ so that $\Delta_1 = 0$, but we desire to work with greater
generality.

Now for the factor $C(k^2)/k^2$ we shall use $-\lambda^2(k^2-\lambda^2)^{-1}k^{-2}$.
This can be written as

$$-\lambda^2/((k^2-\lambda^2)k^2) = k^{-2}C(k^2) = -\int_0^{\lambda^2}dL\,(k^2-L)^{-2}. \tag{8a}$$

Thus we can replace $k^{-2}C(k^2)$ by $(k^2-L)^{-2}$ and at the end integrate
the result with respect to L from zero to $\lambda^2$. We can for
many practical purposes consider $\lambda^2$ very large relative to $m^2$ or $p^2$.
When the original integral converges even without the convergence
factor, it will be obvious since the L integration will then
be convergent to infinity. If an infra-red catastrophe exists in the
integral one can simply assume quanta have a small mass $\lambda_{\min}$
and extend the integral on L from $\lambda_{\min}^2$ to $\lambda^2$, rather than from
zero to $\lambda^2$.
We then have to do integrals of the form

$$\int (1;\,k_\sigma;\,k_\sigma k_\tau)\,d^4k\,(k^2-L)^{-2}(k^2-2p_1\cdot k-\Delta_1)^{-1}(k^2-2p_2\cdot k-\Delta_2)^{-1}, \tag{9a}$$

where by $(1;\,k_\sigma;\,k_\sigma k_\tau)$ we mean that in the place of this symbol
either 1, or $k_\sigma$, or $k_\sigma k_\tau$ may stand in different cases. In more
complicated problems there may be more factors $(k^2-2p_i\cdot k-\Delta_i)^{-1}$
or other powers of these factors (the $(k^2-L)^{-2}$ may be considered
as a special case of such a factor with $p_i=0$, $\Delta_i=L$) and further
factors like $k_\sigma k_\tau k_\rho\cdots$ in the numerator. The poles in all the factors
are made definite by the assumption that L, and the $\Delta$'s have
infinitesimal negative imaginary parts.
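
The replacement (8a), which leads to the form (9a), can be checked numerically away from the poles. This is a minimal sketch, assuming a real negative value of $k^2$ so that the integrand has no singularity on the integration range; the function names and the simple midpoint rule are illustrative choices, not part of the paper.

```python
def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for a smooth integrand on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def lhs_8a(k2, lam2):
    """k^-2 C(k^2) with C(k^2) = -lam2 / (k2 - lam2)."""
    return -lam2 / ((k2 - lam2) * k2)

def rhs_8a(k2, lam2):
    """Minus the integral of (k^2 - L)^-2 over L from 0 to lam2."""
    return -midpoint(lambda L: (k2 - L) ** -2, 0.0, lam2)
```

For example, `lhs_8a(-3.0, 5.0)` and `rhs_8a(-3.0, 5.0)` should agree to within the quadrature error.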
We shall do the integrals of successive complexity by induction.
We start with the simplest convergent one, and show

$$\int d^4k\,(k^2-L)^{-3} = (8iL)^{-1}. \tag{10a}$$

For this integral is $\int(2\pi)^{-2}dk_4\,d^3K\,(k_4^2-\mathbf{K}\cdot\mathbf{K}-L)^{-3}$ where the
vector $\mathbf{K}$, of magnitude $K=(\mathbf{K}\cdot\mathbf{K})^{1/2}$ is $k_1, k_2, k_3$. The integral on
$k_4$ shows third order poles at $k_4 = +(K^2+L)^{1/2}$ and $k_4 = -(K^2+L)^{1/2}$.
Imagining, in accordance with our definitions, that L has a small
negative imaginary part only the first is below the real axis. The
contour can be closed by an infinite semi-circle below this axis,
without change of the value of the integral since the contribution
from the semi-circle vanishes in the limit. Thus the contour can
be shrunk about the pole $k_4 = +(K^2+L)^{1/2}$ and the resulting $k_4$ integral
is $-2\pi i$ times the residue at this pole. Writing $k_4 = (K^2+L)^{1/2}+\epsilon$
and expanding $(k_4^2-K^2-L)^{-3} = \epsilon^{-3}(\epsilon+2(K^2+L)^{1/2})^{-3}$ in powers of
$\epsilon$, the residue, being the coefficient of the term $\epsilon^{-1}$, is seen to be
$6(2(K^2+L)^{1/2})^{-5}$ so our integral is

$$-(3i/32\pi)\int_0^\infty 4\pi K^2\,dK\,(K^2+L)^{-5/2} = (3/8i)(1/3L)$$

establishing (10a).
We also have $\int k_\sigma\,d^4k\,(k^2-L)^{-3}=0$ from the symmetry in the
k space. We write these results as

$$(8i)\int (1;\,k_\sigma)\,d^4k\,(k^2-L)^{-3} = (1;\,0)\,L^{-1}, \tag{11a}$$

where in the brackets $(1;\,k_\sigma)$ and $(1;\,0)$ corresponding entries are
to be used.
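
The radial integral quoted in the derivation of (10a) can be done in closed form. The lines below are a sketch of that one step, assuming L is first taken real and positive (the small negative imaginary part being restored afterwards); the substitution variable $\varphi$ is introduced here only for illustration.

```latex
% Assumption: L real and positive for the substitution K = sqrt(L) tan(phi).
\begin{align*}
\int_0^\infty \frac{4\pi K^2\,dK}{(K^2+L)^{5/2}}
  &= \frac{4\pi}{L}\int_0^{\pi/2}\sin^2\varphi\,\cos\varphi\,d\varphi
   = \frac{4\pi}{3L},
  \qquad K=\sqrt{L}\,\tan\varphi, \\[4pt]
-\frac{3i}{32\pi}\cdot\frac{4\pi}{3L}
  &= -\frac{i}{8L} = \frac{1}{8iL}.
\end{align*}
```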
Substituting $k = k'-p$ in (11a), and calling $L-p^2=\Delta$ shows that

$$(8i)\int (1;\,k_\sigma)\,d^4k\,(k^2-2p\cdot k-\Delta)^{-3} = (1;\,p_\sigma)(p^2+\Delta)^{-1}. \tag{12a}$$

By differentiating both sides of (12a) with respect to $\Delta$, or with
respect to $p_\tau$, there follows directly

$$(24i)\int (1;\,k_\sigma;\,k_\sigma k_\tau)\,d^4k\,(k^2-2p\cdot k-\Delta)^{-4} = -(1;\,p_\sigma;\,p_\sigma p_\tau-\tfrac12\delta_{\sigma\tau}(p^2+\Delta))(p^2+\Delta)^{-2}. \tag{13a}$$

Further differentiations give directly successive integrals including
more k factors in the numerator and higher powers of
$(k^2-2p\cdot k-\Delta)$ in the denominator.
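
The step from (12a) to (13a) can be written out as a short worked derivation. This is a sketch of the differentiations only, using the paper's convention that $\partial(p\cdot k)/\partial p_\tau = k_\tau$; the abbreviation $X$ for the denominator is introduced here for brevity.

```latex
% X = k^2 - 2 p.k - Delta is an abbreviation introduced for this sketch.
\begin{align*}
\frac{\partial}{\partial\Delta}X^{-3} &= 3X^{-4}, &
\frac{\partial}{\partial\Delta}(p^2+\Delta)^{-1} &= -(p^2+\Delta)^{-2}, \\
\frac{\partial}{\partial p_\tau}X^{-3} &= 6k_\tau X^{-4}, &
\frac{\partial}{\partial p_\tau}\bigl[p_\sigma(p^2+\Delta)^{-1}\bigr]
  &= \delta_{\sigma\tau}(p^2+\Delta)^{-1}-2p_\sigma p_\tau(p^2+\Delta)^{-2}.
\end{align*}
% Differentiating both entries of (12a) with respect to Delta:
\[
(24i)\int(1;\,k_\sigma)\,d^4k\,X^{-4} = -(1;\,p_\sigma)(p^2+\Delta)^{-2}.
\]
% Differentiating the k_sigma entry of (12a) with respect to p_tau and halving:
\[
(24i)\int k_\sigma k_\tau\,d^4k\,X^{-4}
  = -\bigl(p_\sigma p_\tau-\tfrac12\delta_{\sigma\tau}(p^2+\Delta)\bigr)(p^2+\Delta)^{-2}.
\]
```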

The integrals so far only contain one factor in the denominator.
To obtain results for two factors we make use of the identity

$$a^{-1}b^{-1} = \int_0^1 dx\,(ax+b(1-x))^{-2}, \tag{14a}$$

(suggested by some work of Schwinger's involving Gaussian integrals).
This represents the product of two reciprocals as a parametric
integral over one and will therefore permit integrals with
two factors to be expressed in terms of one. For other powers of
a, b, we make use of all of the identities, such as

$$a^{-2}b^{-1} = \int_0^1 2x\,dx\,(ax+b(1-x))^{-3}, \tag{15a}$$

deducible from (14a) by successive differentiations with respect
to a or b.
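
The parametric identities (14a) and (15a) are easy to confirm numerically for positive a and b. The following is a small sketch using a plain midpoint rule; the function names are illustrative, and positive real a, b are assumed so that the integrand is smooth on [0, 1].

```python
def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for a smooth integrand on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def param_14a(a, b):
    """Right-hand side of (14a); should approximate 1 / (a * b)."""
    return midpoint(lambda x: (a * x + b * (1 - x)) ** -2, 0.0, 1.0)

def param_15a(a, b):
    """Right-hand side of (15a); should approximate 1 / (a**2 * b)."""
    return midpoint(lambda x: 2 * x * (a * x + b * (1 - x)) ** -3, 0.0, 1.0)
```

With a = 2 and b = 5, the two functions should return values close to 1/10 and 1/20 respectively.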
To perform an integral, such as

$$(8i)\int (1;\,k_\sigma)\,d^4k\,(k^2-2p_1\cdot k-\Delta_1)^{-2}(k^2-2p_2\cdot k-\Delta_2)^{-1}, \tag{16a}$$

write, using (15a),

$$(k^2-2p_1\cdot k-\Delta_1)^{-2}(k^2-2p_2\cdot k-\Delta_2)^{-1} = \int_0^1 2x\,dx\,(k^2-2p_x\cdot k-\Delta_x)^{-3},$$

where

$$p_x = xp_1+(1-x)p_2 \quad\text{and}\quad \Delta_x = x\Delta_1+(1-x)\Delta_2, \tag{17a}$$

(note that $\Delta_x$ is not equal to $m^2-p_x^2$) so that the expression (16a)
is $(8i)\int_0^1 2x\,dx\int (1;\,k_\sigma)\,d^4k\,(k^2-2p_x\cdot k-\Delta_x)^{-3}$ which may now be
evaluated by (12a) and is

$$(16a) = \int_0^1 (1;\,p_{x\sigma})\,2x\,dx\,(p_x^2+\Delta_x)^{-1}, \tag{18a}$$

where $p_x$, $\Delta_x$ are given in (17a). The integral in (18a) is elementary,
being the integral of ratio of polynomials, the denominator of
second degree in x. The general expression although readily obtained
is a rather complicated combination of roots and logarithms.

Other integrals can be obtained again by parametric differentiation.
For example differentiation of (16a), (18a) with respect to
$\Delta_2$ or $p_{2\tau}$ gives

$$(8i)\int (1;\,k_\sigma;\,k_\sigma k_\tau)\,d^4k\,(k^2-2p_1\cdot k-\Delta_1)^{-2}(k^2-2p_2\cdot k-\Delta_2)^{-2}
= -\int_0^1 (1;\,p_{x\sigma};\,p_{x\sigma}p_{x\tau}-\tfrac12\delta_{\sigma\tau}(p_x^2+\Delta_x))\,2x(1-x)\,dx\,(p_x^2+\Delta_x)^{-2}, \tag{19a}$$

again leading to elementary integrals.

As an example, consider the case that the second factor is just
$(k^2-L)^{-2}$ and in the first put $p_1=p$, $\Delta_1=\Delta$. Then $p_x=xp$,
$\Delta_x=x\Delta+(1-x)L$. There results

$$(8i)\int (1;\,k_\sigma;\,k_\sigma k_\tau)\,d^4k\,(k^2-L)^{-2}(k^2-2p\cdot k-\Delta)^{-2}
= -\int_0^1 (1;\,xp_\sigma;\,x^2p_\sigma p_\tau-\tfrac12\delta_{\sigma\tau}(x^2p^2+\Delta_x))\,2x(1-x)\,dx\,(x^2p^2+\Delta_x)^{-2}. \tag{20a}$$

Integrals with three factors can be reduced to those involving
two by using (14a) again. They, therefore, lead to integrals with
two parameters (e.g., see application to radiative correction to
scattering below).

The methods of calculation given in this paper are deceptively 
simple when applied to the lower order processes. For processes 
of increasingly higher orders the complexity and difficulty in- 
creases rapidly, and these methods soon become impractical in 
their present form. 




A. Self-Energy 
The self-energy integral (19) is

$$(e^2/\pi i)\int \gamma_\mu(\not{p}-\not{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k\,C(k^2), \tag{19}$$

so that it requires that we find (using the principle of (8a)) the
integral on L from 0 to $\lambda^2$ of

$$\int \gamma_\mu(\not{p}-\not{k}+m)\gamma_\mu\,d^4k\,(k^2-L)^{-2}(k^2-2p\cdot k)^{-1},$$

since $(p-k)^2-m^2 = k^2-2p\cdot k$, as $p^2=m^2$. This is of the form (16a)
with $\Delta_1=L$, $p_1=0$, $\Delta_2=0$, $p_2=p$ so that (18a) gives, since
$p_x=(1-x)p$, $\Delta_x=xL$,

$$(8i)\int (1;\,k_\sigma)\,d^4k\,(k^2-L)^{-2}(k^2-2p\cdot k)^{-1} = \int_0^1 (1;\,(1-x)p_\sigma)\,2x\,dx\,((1-x)^2m^2+xL)^{-1},$$

or performing the integral on L, as in (8a),

$$(8i)\int (1;\,k_\sigma)\,d^4k\,k^{-2}C(k^2)(k^2-2p\cdot k)^{-1} = \int_0^1 (1;\,(1-x)p_\sigma)\,2\,dx\,\ln\frac{x\lambda^2+(1-x)^2m^2}{(1-x)^2m^2}.$$

Assuming now that $\lambda^2\gg m^2$ we neglect $(1-x)^2m^2$ relative to
$x\lambda^2$ in the argument of the logarithm, which then becomes
$(\lambda^2/m^2)(x/(1-x)^2)$. Then since $\int_0^1 dx\,\ln(x(1-x)^{-2})=1$ and
$\int_0^1 (1-x)\,dx\,\ln(x(1-x)^{-2}) = -(1/4)$ find

$$(8i)\int (1;\,k_\sigma)\,k^{-2}C(k^2)\,d^4k\,(k^2-2p\cdot k)^{-1} = \left(2\ln\frac{\lambda^2}{m^2}+2;\;p_\sigma\left(\ln\frac{\lambda^2}{m^2}-\frac12\right)\right),$$

so that substitution into (19) (after the $(\not{p}-\not{k}-m)^{-1}$ in (19) is
replaced by $(\not{p}-\not{k}+m)(k^2-2p\cdot k)^{-1}$) gives

$$\begin{aligned}
(19) &= (e^2/8\pi)[\gamma_\mu(\not{p}+m)\gamma_\mu(2\ln(\lambda^2/m^2)+2) - \gamma_\mu\not{p}\gamma_\mu(\ln(\lambda^2/m^2)-\tfrac12)]\\
&= (e^2/8\pi)[8m(\ln(\lambda^2/m^2)+1) - \not{p}(2\ln(\lambda^2/m^2)+5)],
\end{aligned} \tag{20}$$

using (4a) to remove the $\gamma_\mu$'s. This agrees with Eq. (20) of the text,
and gives the self-energy (21) when $\not{p}$ is replaced by m.
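
The two logarithmic x integrals quoted in the self-energy calculation, and the entries $2\ln(\lambda^2/m^2)+2$ and $\ln(\lambda^2/m^2)-\tfrac12$ built from them, can be checked numerically. This is a sketch only: the midpoint rule is used because it never samples the integrable endpoint singularities, and the function names and the sample value of the logarithm are hypothetical.

```python
import math

def midpoint(f, a, b, n=200000):
    """Composite midpoint rule; the endpoints themselves are never evaluated."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def log_weight(x):
    """ln( x (1 - x)^-2 ), the x dependence left after taking lambda^2 >> m^2."""
    return math.log(x / (1.0 - x) ** 2)

def log_integrals():
    """Numerical values of the two integrals quoted as 1 and -1/4."""
    i1 = midpoint(log_weight, 0.0, 1.0)
    i2 = midpoint(lambda x: (1.0 - x) * log_weight(x), 0.0, 1.0)
    return i1, i2

def self_energy_entries(log_ratio):
    """Entries (1; p_sigma coefficient) for a given value of ln(lambda^2/m^2)."""
    i1, i2 = log_integrals()
    return 2.0 * (log_ratio + i1), log_ratio + 2.0 * i2
```

For a hypothetical value $\ln(\lambda^2/m^2)=3$, `self_energy_entries(3.0)` should come out close to 8 and 2.5.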


B. Corrections to Scattering 


The term (12) in the radiationless scattering, after rationalizing
the matrix denominators and using $p_1^2=p_2^2=m^2$ requires the
integrals (9a), as we have discussed. This is an integral with
three denominators which we do in two stages. First the factors
$(k^2-2p_1\cdot k)$ and $(k^2-2p_2\cdot k)$ are combined by a parameter y;

$$(k^2-2p_1\cdot k)^{-1}(k^2-2p_2\cdot k)^{-1} = \int_0^1 dy\,(k^2-2p_y\cdot k)^{-2},$$

from (14a) where

$$p_y = yp_1+(1-y)p_2. \tag{21a}$$

We therefore need the integrals

$$(8i)\int (1;\,k_\sigma;\,k_\sigma k_\tau)\,d^4k\,(k^2-L)^{-2}(k^2-2p_y\cdot k)^{-2}, \tag{22a}$$

which we will then integrate with respect to y from 0 to 1. Next
we do the integrals (22a) immediately from (20a) with $p=p_y$, $\Delta=0$:

$$(22a) = -\int_0^1\!\!\int_0^1 (1;\,xp_{y\sigma};\,x^2p_{y\sigma}p_{y\tau}-\tfrac12\delta_{\sigma\tau}(x^2p_y^2+(1-x)L))\,2x(1-x)\,dx\,(x^2p_y^2+L(1-x))^{-2}\,dy.$$

We now turn to the integrals on L as required in (8a). The first
term, (1), in $(1;\,k_\sigma;\,k_\sigma k_\tau)$ gives no trouble for large L, but if L
is put equal to zero there results $x^{-2}p_y^{-2}$ which leads to a diverging
integral on x as $x\to0$. This infra-red catastrophe is analyzed by
using $\lambda_{\min}^2$ for the lower limit of the L integral. For the last term
the upper limit of L must be kept as $\lambda^2$. Assuming $\lambda_{\min}^2\ll p_y^2\ll\lambda^2$
the x integrals which remain are trivial, as in the self-energy case.
One finds

$$-(8i)\int (k^2-\lambda_{\min}^2)^{-1}\,d^4k\,C(k^2-\lambda_{\min}^2)(k^2-2p_1\cdot k)^{-1}(k^2-2p_2\cdot k)^{-1} = \int_0^1 p_y^{-2}\,dy\,\ln(p_y^2/\lambda_{\min}^2), \tag{23a}$$

$$-(8i)\int k_\sigma k^{-2}\,d^4k\,C(k^2)(k^2-2p_1\cdot k)^{-1}(k^2-2p_2\cdot k)^{-1} = 2\int_0^1 p_{y\sigma}p_y^{-2}\,dy, \tag{24a}$$

$$-(8i)\int k_\sigma k_\tau k^{-2}\,d^4k\,C(k^2)(k^2-2p_1\cdot k)^{-1}(k^2-2p_2\cdot k)^{-1} = \int_0^1 p_{y\sigma}p_{y\tau}p_y^{-2}\,dy - \tfrac12\delta_{\sigma\tau}\int_0^1 dy\,\ln(\lambda^2p_y^{-2}) + \tfrac14\delta_{\sigma\tau}. \tag{25a}$$

The integrals on y give,

$$\int_0^1 p_y^{-2}\,dy\,\ln(p_y^2\lambda_{\min}^{-2}) = 4(m^2\sin2\theta)^{-1}\left[\theta\ln(m\lambda_{\min}^{-1}) - \int_0^\theta \alpha\tan\alpha\,d\alpha\right], \tag{26a}$$

$$\int_0^1 p_{y\sigma}p_y^{-2}\,dy = \theta(m^2\sin2\theta)^{-1}(p_{1\sigma}+p_{2\sigma}), \tag{27a}$$

$$\int_0^1 p_{y\sigma}p_{y\tau}p_y^{-2}\,dy = \theta(2m^2\sin2\theta)^{-1}(p_{1\sigma}+p_{2\sigma})(p_{1\tau}+p_{2\tau}) + q^{-2}q_\sigma q_\tau(1-\theta\,\mathrm{ctn}\,\theta), \tag{28a}$$

$$\int_0^1 dy\,\ln(\lambda^2p_y^{-2}) = \ln(\lambda^2/m^2) + 2(1-\theta\,\mathrm{ctn}\,\theta). \tag{29a}$$

These integrals on y were performed as follows. Since $p_2=p_1+q$
where q is the momentum carried by the potential, it follows from
$p_2^2=p_1^2=m^2$ that $2p_1\cdot q=-q^2$ so that since $p_y=p_1+q(1-y)$,
$p_y^2=m^2-q^2y(1-y)$. The substitution $2y-1=\tan\alpha/\tan\theta$ where $\theta$
is defined by $4m^2\sin^2\theta=q^2$ is useful for it means $p_y^2=m^2\sec^2\alpha/\sec^2\theta$
and $p_y^{-2}\,dy=(m^2\sin2\theta)^{-1}d\alpha$ where $\alpha$ goes from $-\theta$ to $+\theta$.
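
The closed forms obtained from the substitution $2y-1=\tan\alpha/\tan\theta$ can be compared with direct numerical integration over y. This is a sketch under the assumption of real $\theta$ (that is, $0<q^2<4m^2$) and arbitrary illustrative values of m and $\lambda^2$; it checks $\int_0^1 p_y^{-2}dy = 2\theta/(m^2\sin2\theta)$, the $q_\sigma q_\tau$ coefficient of (28a), and (29a). The function and key names are hypothetical.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for a smooth integrand on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def y_integrals(theta, m=1.0, lam2=1.0e4):
    """Return {name: (numerical value, closed form)} for three y integrals."""
    q2 = 4.0 * m * m * math.sin(theta) ** 2        # definition of theta
    py2 = lambda y: m * m - q2 * y * (1.0 - y)     # p_y^2 = m^2 - q^2 y(1-y)
    ctn = 1.0 / math.tan(theta)
    return {
        # integral of p_y^-2 dy, from p_y^-2 dy = d(alpha) / (m^2 sin 2 theta)
        "inverse": (midpoint(lambda y: 1.0 / py2(y), 0.0, 1.0),
                    2.0 * theta / (m * m * math.sin(2.0 * theta))),
        # coefficient of q_sigma q_tau in (28a)
        "q_part_28a": (midpoint(lambda y: (y - 0.5) ** 2 / py2(y), 0.0, 1.0),
                       (1.0 - theta * ctn) / q2),
        # equation (29a)
        "log_29a": (midpoint(lambda y: math.log(lam2 / py2(y)), 0.0, 1.0),
                    math.log(lam2 / (m * m)) + 2.0 * (1.0 - theta * ctn)),
    }
```

Each pair returned by `y_integrals(0.6)` should agree to within the quadrature error.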

These results are substituted into the original scattering formula
(2a), giving (22). It has been simplified by frequent use of the
fact that $\not{p}_1$ operating on the initial state is m, and likewise $\not{p}_2$
when it appears at the left is replaceable by m. (Thus, to simplify:

$$\begin{aligned}
\gamma_\mu\not{p}_2\not{a}\not{p}_1\gamma_\mu &= -2\not{p}_1\not{a}\not{p}_2 \quad\text{by (4a)},\\
&= -2(\not{p}_2-\not{q})\not{a}(\not{p}_1+\not{q}) = -2(m-\not{q})\not{a}(m+\not{q}).
\end{aligned}$$

A term like $\not{q}\not{a}\not{q} = -q^2\not{a}+2(a\cdot q)\not{q}$ is equivalent to just $-q^2\not{a}$ since
$\not{q}=\not{p}_2-\not{p}_1=m-m$ has zero matrix element.) The renormalization
term requires the corresponding integrals for the special case
$q=0$.


C. Vacuum Polarization 


The expressions (32) and (32') for $J_{\mu\nu}$ in the vacuum polarization
problem require the calculation of the integral

$$J_{\mu\nu}(m^2) = -\frac{e^2}{\pi i}\int \mathrm{Sp}[\gamma_\mu(\not{p}-\tfrac12\not{q}+m)\gamma_\nu(\not{p}+\tfrac12\not{q}+m)]\,d^4p\,((p-\tfrac12q)^2-m^2)^{-1}((p+\tfrac12q)^2-m^2)^{-1}, \tag{32}$$

where we have replaced p by $p-\tfrac12q$ to simplify the calculation
somewhat. We shall indicate the method of calculation by studying
the integral,

$$I(m^2) = \int p_\sigma p_\tau\,d^4p\,((p-\tfrac12q)^2-m^2)^{-1}((p+\tfrac12q)^2-m^2)^{-1}.$$

The factors in the denominator, $p^2-p\cdot q-m^2+\tfrac14q^2$ and $p^2+p\cdot q-m^2+\tfrac14q^2$
are combined as usual by (14a) but for symmetry we
substitute $x=\tfrac12(1+\eta)$, $(1-x)=\tfrac12(1-\eta)$ and integrate $\eta$ from
$-1$ to $+1$:

$$I(m^2) = \int_{-1}^{+1}\!\int p_\sigma p_\tau\,d^4p\,(p^2-\eta p\cdot q-m^2+\tfrac14q^2)^{-2}\,d\eta/2. \tag{30a}$$

But the integral on p will not be found in our list for it is badly
divergent. However, as discussed in Section 7, Eq. (32') we do not
wish $I(m^2)$ but rather $\int_0^\infty[I(m^2)-I(m^2+\lambda^2)]G(\lambda)\,d\lambda$. We can
calculate the difference $I(m^2)-I(m^2+\lambda^2)$ by first calculating the
derivative $I'(m^2+L)$ of I with respect to $m^2$ at $m^2+L$ and later
integrating L from zero to $\lambda^2$. By differentiating (30a), with
respect to $m^2$ find,

$$I'(m^2+L) = \int_{-1}^{+1}\!\int p_\sigma p_\tau\,d^4p\,(p^2-\eta p\cdot q-m^2-L+\tfrac14q^2)^{-3}\,d\eta.$$

This still diverges, but we can differentiate again to get

$$\begin{aligned}
I''(m^2+L) &= 3\int_{-1}^{+1}\!\int p_\sigma p_\tau\,d^4p\,(p^2-\eta p\cdot q-m^2-L+\tfrac14q^2)^{-4}\,d\eta\\
&= -(8i)^{-1}\int_{-1}^{+1}(\tfrac14\eta^2q_\sigma q_\tau D^{-2}-\tfrac12\delta_{\sigma\tau}D^{-1})\,d\eta
\end{aligned} \tag{31a}$$

(where $D=\tfrac14(\eta^2-1)q^2+m^2+L$), which now converges and has been
evaluated by (13a) with $p=\tfrac12\eta q$ and $\Delta=m^2+L-\tfrac14q^2$. Now to get
$I'$ we may integrate $I''$ with respect to L as an indefinite integral
and we may choose any convenient arbitrary constant. This is because
a constant C in $I'$ will mean a term $-C\lambda^2$ in $I(m^2)-I(m^2+\lambda^2)$
which vanishes since we will integrate the results times $G(\lambda)\,d\lambda$
and $\int_0^\infty\lambda^2G(\lambda)\,d\lambda=0$. This means that the logarithm appearing on
integrating L in (31a) presents no problem. We may take

$$I'(m^2+L) = (8i)^{-1}\int_{-1}^{+1}[\tfrac14\eta^2q_\sigma q_\tau D^{-1}+\tfrac12\delta_{\sigma\tau}\ln D]\,d\eta + C\delta_{\sigma\tau},$$

a subsequent integral on L and finally on $\eta$ presents no new
problems. There results


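$$-(8i)\int p_\sigma p_\tau\,d^4p\,((p-\tfrac12q)^2-m^2)^{-1}((p+\tfrac12q)^2-m^2)^{-1}
= (q_\sigma q_\tau-\delta_{\sigma\tau}q^2)\left[\frac19-\frac{4m^2-q^2}{3q^2}\left(1-\frac{\theta}{\tan\theta}\right)+\frac16\ln\frac{\lambda^2}{m^2}\right]
+ \delta_{\sigma\tau}[(\lambda^2+m^2)\ln(\lambda^2m^{-2}+1)-C'\lambda^2], \tag{32a}$$

where we assume $\lambda^2\gg m^2$ and have put some terms into the arbitrary
constant $C'$ which is independent of $\lambda^2$ (but in principle could
depend on $q^2$) and which drops out in the integral on $G(\lambda)\,d\lambda$. We
have set $q^2=4m^2\sin^2\theta$.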
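
The origin of the quantity D in (31a) can be spelled out in two lines. This is a sketch of the algebra only, combining the two denominators with the symmetric parameter $\eta$ and then reading off the arguments of (13a).

```latex
% Combining the two denominators with x = (1 + eta)/2, so that 2x - 1 = eta:
\begin{align*}
x\,(p^2-p\cdot q-m^2+\tfrac14 q^2)+(1-x)\,(p^2+p\cdot q-m^2+\tfrac14 q^2)
  &= p^2-\eta\,p\cdot q-m^2+\tfrac14 q^2 .
\end{align*}
% Matching to k^2 - 2 p.k - Delta in (13a), with m^2 replaced by m^2 + L:
\begin{align*}
p \to \tfrac12\eta\,q,\qquad \Delta = m^2+L-\tfrac14 q^2
\quad\Longrightarrow\quad
p^2+\Delta \to \tfrac14\eta^2q^2+m^2+L-\tfrac14 q^2
  = \tfrac14(\eta^2-1)q^2+m^2+L = D .
\end{align*}
```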

In a very similar way the integral with $m^2$ in the numerator can
be worked out. It is, of course, necessary to differentiate this $m^2$
also when calculating $I'$ and $I''$. There results

$$-(8i)\int m^2\,d^4p\,((p-\tfrac12q)^2-m^2)^{-1}((p+\tfrac12q)^2-m^2)^{-1}
= 4m^2(1-\theta\,\mathrm{ctn}\,\theta)-q^2/3+2(\lambda^2+m^2)\ln(\lambda^2m^{-2}+1)-C''\lambda^2, \tag{33a}$$

with another unimportant constant $C''$. The complete problem requires
the further integral,

$$-(8i)\int (1;\,p_\sigma)\,d^4p\,((p-\tfrac12q)^2-m^2)^{-1}((p+\tfrac12q)^2-m^2)^{-1}
= (1,\,0)(4(1-\theta\,\mathrm{ctn}\,\theta)+2\ln(\lambda^2m^{-2})). \tag{34a}$$

The value of the integral (34a) times $m^2$ differs from (33a), of
course, because the results on the right are not actually the integrals
on the left, but rather equal their actual value minus their
value for $m^2=m^2+\lambda^2$.
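
As an illustration of the method, the first entry of (34a) can be obtained in a few lines from (12a) and (29a). The following is a sketch, with $K(m^2)$ a name introduced here for the integral with numerator 1 and $D_0$ denoting D at $L=0$; terms that vanish for $\lambda^2\gg m^2$ are dropped.

```latex
% K(m^2) denotes the integral with numerator 1 (a label introduced for this sketch).
\begin{align*}
K(m^2) &= \int_{-1}^{+1}\frac{d\eta}{2}\int d^4p\,(p^2-\eta\,p\cdot q-m^2+\tfrac14q^2)^{-2},\\
K'(m^2+L) &= \int_{-1}^{+1}d\eta\int d^4p\,(p^2-\eta\,p\cdot q-m^2-L+\tfrac14q^2)^{-3}
           = (8i)^{-1}\int_{-1}^{+1}D^{-1}\,d\eta \quad\text{by (12a)},\\
-(8i)\,[K(m^2)-K(m^2+\lambda^2)] &= \int_{-1}^{+1}d\eta\int_0^{\lambda^2}\frac{dL}{D}
   = \int_{-1}^{+1}d\eta\,\ln\frac{D_0+\lambda^2}{D_0}
   \approx 2\ln\frac{\lambda^2}{m^2}-\int_{-1}^{+1}d\eta\,\ln\bigl(1-(1-\eta^2)\sin^2\theta\bigr)\\
  &= 2\ln\frac{\lambda^2}{m^2}+4(1-\theta\,\mathrm{ctn}\,\theta),
\end{align*}
% where D_0 = m^2 - (1-eta^2) q^2 / 4, q^2 = 4 m^2 sin^2(theta), and the last step
% is (29a) with eta = 2y - 1. The p_sigma entry gives (1/2) eta q_sigma, odd in eta,
% and therefore integrates to zero.
```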

Combining these quantities, as required by (32), dropping the
constants $C'$, $C''$ and evaluating the spur gives (33). The spurs are
evaluated in the usual way, noting that the spur of any odd
number of $\gamma$ matrices vanishes and Sp(AB)=Sp(BA) for arbitrary
A, B. The Sp(1)=4 and we also have

$$\tfrac14\mathrm{Sp}[(\not{p}_1+m_1)(\not{p}_2-m_2)] = p_1\cdot p_2-m_1m_2, \tag{35a}$$

$$\begin{aligned}
\tfrac14\mathrm{Sp}[(\not{p}_1+m_1)(\not{p}_2-m_2)(\not{p}_3+m_3)(\not{p}_4-m_4)]
&= (p_1\cdot p_2-m_1m_2)(p_3\cdot p_4-m_3m_4)\\
&\quad-(p_1\cdot p_3-m_1m_3)(p_2\cdot p_4-m_2m_4)\\
&\quad+(p_1\cdot p_4-m_1m_4)(p_2\cdot p_3-m_2m_3),
\end{aligned} \tag{36a}$$

where $p_i$, $m_i$ are arbitrary four-vectors and constants.
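
The spur formulas (35a) and (36a) can be verified on random inputs. This is a minimal sketch, assuming the Dirac representation and the sign convention $a\cdot b=a_tb_t-a_xb_x-a_yb_y-a_zb_z$; all helper names are illustrative.

```python
import random

N = 4
IDENT = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
# Assumed Dirac representation; index 0 is the time component.
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA = [G0, G1, G2, G3]
SIGN = [1, -1, -1, -1]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def slash_plus_mass(p, mass):
    """Matrix p-slash + mass * identity."""
    return [[sum(SIGN[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(N))
             + mass * IDENT[i][j] for j in range(N)] for i in range(N)]

def dot(a, b):
    return sum(SIGN[mu] * a[mu] * b[mu] for mu in range(N))

def quarter_spur(*factors):
    """One quarter of the trace of the ordered matrix product."""
    prod = IDENT
    for f in factors:
        prod = matmul(prod, f)
    return sum(prod[i][i] for i in range(N)) / 4

def spur_checks(seed=1):
    """Absolute differences between both sides of (35a) and (36a)."""
    rng = random.Random(seed)
    p = [[rng.uniform(-1, 1) for _ in range(N)] for _ in range(4)]
    m = [rng.uniform(0.5, 2.0) for _ in range(4)]
    # Mass signs alternate +, -, +, - as in (36a).
    f = [slash_plus_mass(p[i], m[i] if i % 2 == 0 else -m[i]) for i in range(4)]
    pair = lambda i, j: dot(p[i], p[j]) - m[i] * m[j]
    d35 = abs(quarter_spur(f[0], f[1]) - pair(0, 1))
    rhs36 = (pair(0, 1) * pair(2, 3) - pair(0, 2) * pair(1, 3)
             + pair(0, 3) * pair(1, 2))
    d36 = abs(quarter_spur(*f) - rhs36)
    return d35, d36
```

Both differences returned by `spur_checks()` should be at the level of floating-point rounding.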

It is interesting that the terms of order $\lambda^2\ln\lambda^2$ go out, so that
the charge renormalization depends only logarithmically on $\lambda^2$.
This is not true for some of the meson theories. Electrodynamics
is suspiciously unique in the mildness of its divergence.


D. More Complex Problems 


Matrix elements for complex problems can be set up in a
manner analogous to that used for the simpler cases. We give
three illustrations; higher order corrections to the Møller scattering,
to the Compton scattering, and the interaction of a neutron
with an electromagnetic field.

FIG. 8. The interaction between two electrons to order $(e^2/\hbar c)^2$.
One adds the contribution of every figure involving two virtual
quanta, Appendix D.

For the Møller scattering, consider two electrons, one in state
$u_1$ of momentum $p_1$ and the other in state $u_2$ of momentum $p_2$.
Later they are found in states $u_3$, $p_3$ and $u_4$, $p_4$. This may happen
(first order in $e^2/\hbar c$) because they exchange a quantum of momentum
$q=p_1-p_3=p_4-p_2$ in the manner of Eq. (4) and Fig. 1. The
matrix element for this process is proportional to (translating (4)
to momentum space)

$$(\bar{u}_4\gamma_\mu u_2)(\bar{u}_3\gamma_\mu u_1)\,q^{-2}. \tag{37a}$$

We shall discuss corrections to (37a) to the next order in $e^2/\hbar c$.
(There is also the possibility that it is the electron at 2 which
finally arrives at 3, the electron at 1 going to 4 through the exchange
of quantum of momentum $p_3-p_2$. The amplitude for this
process, $(\bar{u}_4\gamma_\mu u_1)(\bar{u}_3\gamma_\mu u_2)(p_3-p_2)^{-2}$, must be subtracted from
(37a) in accordance with the exclusion principle. A similar situation
exists to each order so that we need consider in detail only
the corrections to (37a), reserving to the last the subtraction of
the same terms with 3, 4 exchanged.)

One reason that (37a) is modified is that two quanta may be
exchanged, in the manner of Fig. 8a. The total matrix element
for all exchanges of this type is

$$(e^2/\pi i)\int (\bar{u}_3\gamma_\nu(\not{p}_1-\not{k}-m)^{-1}\gamma_\mu u_1)(\bar{u}_4\gamma_\nu(\not{p}_2+\not{k}-m)^{-1}\gamma_\mu u_2)\,k^{-2}(q-k)^{-2}\,d^4k, \tag{38a}$$

as is clear from the figure and the general rule that electrons of
momentum p contribute in amplitude $(\not{p}-m)^{-1}$ between interactions
$\gamma_\mu$, and that quanta of momentum k contribute $k^{-2}$. In
integrating on $d^4k$ and summing over $\mu$ and $\nu$, we add all alternatives
of the type of Fig. 8a. If the time of absorption, $\gamma_\mu$, of the
quantum k by electron 2 is later than the absorption, $\gamma_\nu$, of $q-k$,
this corresponds to the virtual state $p_2+k$ being a positron (so
that (38a) contains over thirty terms of the conventional method
of analysis).

In integrating over all these alternatives we have considered all
possible distortions of Fig. 8a which preserve the order of events
along the trajectories. We have not included the possibilities
corresponding to Fig. 8b, however. Their contribution is

$$(e^2/\pi i)\int (\bar{u}_3\gamma_\nu(\not{p}_1-\not{k}-m)^{-1}\gamma_\mu u_1)(\bar{u}_4\gamma_\mu(\not{p}_2+\not{q}-\not{k}-m)^{-1}\gamma_\nu u_2)\,k^{-2}(q-k)^{-2}\,d^4k, \tag{39a}$$

as is readily verified by labeling the diagram. The contributions of
all possible ways that an event can occur are to be added. This
means that one adds with equal weight the integrals corresponding
to each topologically distinct figure.

FIG. 9. Radiative correction to the Compton scattering term
(a) of Fig. 5. Appendix D.

To this same order there are also the possibilities of Fig. 8d
which give

$$(e^2/\pi i)\int (\bar{u}_3\gamma_\nu(\not{p}_3-\not{k}-m)^{-1}\gamma_\mu(\not{p}_1-\not{k}-m)^{-1}\gamma_\nu u_1)(\bar{u}_4\gamma_\mu u_2)\,k^{-2}q^{-2}\,d^4k.$$

This integral on k will be seen to be precisely the integral (12) for
the radiative corrections to scattering, which we have worked out.
The term may be combined with the renormalization terms resulting
from the difference of the effects of mass change and the terms,
Figs. 8f and 8g. Figures 8e, 8h, and 8i are similarly analyzed.

Finally the term Fig. 8c is clearly related to our vacuum
polarization problem, and when integrated gives a term proportional
to $(\bar{u}_4\gamma_\mu u_2)(\bar{u}_3\gamma_\nu u_1)J_{\mu\nu}q^{-4}$. If the charge is renormalized the
term $\ln(\lambda/m)$ in $J_{\mu\nu}$ in (33) is omitted so there is no remaining
dependence on the cut-off.

The only new integrals we require are the convergent integrals
(38a) and (39a). They can be simplified by rationalizing the denominators
and combining them by (14a). For example (38a) involves
the factors $(k^2-2p_1\cdot k)^{-1}(k^2+2p_2\cdot k)^{-1}k^{-2}(q^2+k^2-2q\cdot k)^{-1}$.
The first two may be combined by (14a) with a parameter x, and
the second pair by an expression obtained by differentiation (15a)
with respect to b and calling the parameter y. There results a
factor $(k^2-2p_x\cdot k)^{-2}(k^2+yq^2-2yq\cdot k)^{-2}$ so that the integrals on
$d^4k$ now involve two factors and can be performed by the methods
given earlier in the appendix. The subsequent integrals on the
parameters x and y are complicated and have not been worked out
in detail.

Working with charged mesons there is often a considerable reduction
of the number of terms. For example, for the interaction
between protons resulting from the exchange of two mesons only
the term corresponding to Fig. 8b remains. Term 8a, for example,
is impossible, for if the first proton emits a positive meson the
second cannot absorb it directly for only neutrons can absorb
positive mesons.

As a second example, consider the radiative correction to the
Compton scattering. As seen from Eq. (15) and Fig. 5 this scattering
is represented by two terms, so that we can consider the corrections
to each one separately. Figure 9 shows the types of terms
arising from corrections to the term of Fig. 5a. Calling k the
momentum of the virtual quantum, Fig. 9a gives an integral

$$\int \gamma_\mu(\not{p}_2-\not{k}-m)^{-1}\not{e}_2(\not{p}_1+\not{q}-\not{k}-m)^{-1}\not{e}_1(\not{p}_1-\not{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k,$$

convergent without cut-off and reducible by the methods outlined
in this appendix.

The other terms are relatively easy to evaluate. Terms b and c
of Fig. 9 are closely related to radiative corrections (although
somewhat more difficult to evaluate, for one of the states is not
that of a free electron, $(p_1+q)^2\neq m^2$). Terms e, f are renormalization
terms. From term d must be subtracted explicitly the effect
of mass $\Delta m$, as analyzed in Eqs. (26) and (27) leading to (28)
with $p'=p_1+q$, $a=e_2$, $b=e_1$. Terms g, h give zero since the
vacuum polarization has zero effect on free light quanta, $q_1^2=0$,
$q_2^2=0$. The total is insensitive to the cut-off $\lambda$.

The result shows an infra-red catastrophe, the largest part
of the effect. When cut-off at $\lambda_{\min}$, the effect proportional to
$\ln(m/\lambda_{\min})$ goes as

$$(e^2/\pi)\ln(m/\lambda_{\min})(1-2\theta\,\mathrm{ctn}\,2\theta), \tag{40a}$$

times the uncorrected amplitude, where $(p_2-p_1)^2=4m^2\sin^2\theta$. This
is the same as for the radiative correction to scattering for a
deflection $p_2-p_1$. This is physically clear since the long wave
quanta are not affected by short-lived intermediate states. The
infra-red effects arise† from a final adjustment of the field from
the asymptotic coulomb field characteristic of the electron of
momentum $p_1$ before the collision to that characteristic of an
electron moving in a new direction $p_2$ after the collision.

† F. Bloch and A. Nordsieck, Phys. Rev. 52, 54 (1937).

The complete expression for the correction is a very complicated
expression involving transcendental integrals.

As a final example we consider the interaction of a neutron with
an electromagnetic field in virtue of the fact that the neutron may
emit a virtual negative meson. We choose the example of pseudoscalar
mesons with pseudovector coupling. The change in amplitude
due to an electromagnetic field $A = a\exp(-iq\cdot x)$ determines
the scattering of a neutron by such a field. In the limit of small q
it will vary as $\not{q}\not{a}-\not{a}\not{q}$ which represents the interaction of a particle
possessing a magnetic moment. The first-order interaction
between an electron and a neutron is given by the same calculation
by considering the exchange of a quantum between the electron
and the nucleon. In this case $a_\mu$ is $q^{-2}$ times the matrix element of
$\gamma_\mu$ between the initial and final states of the electron, the states
differing in momentum by q.

The interaction may occur because the neutron of momentum
$p_1$ emits a negative meson becoming a proton which proton interacts
with the field and then reabsorbs the meson (Fig. 10a). The
matrix for this process is ($p_2=p_1+q$),

$$\int (\gamma_5\not{k})(\not{p}_2-\not{k}-M)^{-1}\not{a}(\not{p}_1-\not{k}-M)^{-1}(\gamma_5\not{k})(k^2-\mu^2)^{-1}\,d^4k. \tag{41a}$$

Alternatively it may be the meson which interacts with the field.
We assume that it does this in the manner of a scalar potential
satisfying the Klein Gordon Eq. (35), (Fig. 10b)

$$-\int (\gamma_5\not{k}_2)(\not{p}_1-\not{k}_1-M)^{-1}(\gamma_5\not{k}_1)(k_2^2-\mu^2)^{-1}(k_2\cdot a+k_1\cdot a)(k_1^2-\mu^2)^{-1}\,d^4k_1, \tag{42a}$$

where we have put $k_2=k_1+q$. The change in sign arises because
the virtual meson is negative. Finally there are two terms arising
from the $\gamma_5\not{a}$ part of the pseudovector coupling (Figs. 10c, 10d)

$$\int (\gamma_5\not{k})(\not{p}_2-\not{k}-M)^{-1}(\gamma_5\not{a})(k^2-\mu^2)^{-1}\,d^4k, \tag{43a}$$

and

$$\int (\gamma_5\not{a})(\not{p}_1-\not{k}-M)^{-1}(\gamma_5\not{k})(k^2-\mu^2)^{-1}\,d^4k. \tag{44a}$$

Using convergence factors in the manner discussed in the section
on meson theories each integral can be evaluated and the results
combined. Expanded in powers of q the first term gives the magnetic
moment of the neutron and is insensitive to the cut-off, the
next gives the scattering amplitude of slow electrons on neutrons,
and depends logarithmically on the cut-off.

The expressions may be simplified and combined somewhat
before integration. This makes the integrals a little easier and also
shows the relation to the case of pseudoscalar coupling. For
example in (41a) the final $\gamma_5\not{k}$ can be written as $\gamma_5(\not{k}-\not{p}_1+M)$
since $\not{p}_1=M$ when operating on the initial neutron state. This is
$(\not{p}_1-\not{k}-M)\gamma_5+2M\gamma_5$ since $\gamma_5$ anticommutes with $\not{p}_1$ and $\not{k}$. The
first term cancels the $(\not{p}_1-\not{k}-M)^{-1}$ and gives a term which just
cancels (43a). In a like manner the leading factor $\gamma_5\not{k}$ in (41a) is
written as $-2M\gamma_5-\gamma_5(\not{p}_2-\not{k}-M)$, the second term leading to a
simpler term containing no $(\not{p}_2-\not{k}-M)^{-1}$ factor and combining
with a similar one from (44a). One simplifies the $\gamma_5\not{k}_1$ and $\gamma_5\not{k}_2$
in (42a) in an analogous way. There finally results terms like
(41a), (42a) but with pseudoscalar coupling $2M\gamma_5$ instead of
$\gamma_5\not{k}$, no terms like (43a) or (44a) and a remainder, representing
the difference in effects of pseudovector and pseudoscalar coupling.
The pseudoscalar terms do not depend sensitively on the cut-off,
but the difference term depends on it logarithmically. The difference
term affects the electron-neutron interaction but not the
magnetic moment of the neutron.

FIG. 10. According to the meson theory a neutron interacts with
an electromagnetic potential a by first emitting a virtual charged
meson. The figure illustrates the case for a pseudoscalar meson
with pseudovector coupling. Appendix D.

Interaction of a proton with an electromagnetic potential can
be similarly analyzed. There is an effect of virtual mesons on the
electromagnetic properties of the proton even in the case that the
mesons are neutral. It is analogous to the radiative corrections to
the scattering of electrons due to virtual photons. The sum of the
magnetic moments of neutron and proton for charged mesons is
the same as the proton moment calculated for the corresponding
neutral mesons. In fact it is readily seen by comparing diagrams,
that for arbitrary q, the scattering matrix to first order in the
electromagnetic potential for a proton according to neutral meson
theory is equal, if the mesons were charged, to the sum of the
matrix for a neutron and the matrix for a proton. This is true, for
any type or mixtures of meson coupling, to all orders in the
coupling (neglecting the mass difference of neutron and proton).

Corrections of order $e^6$ to the differential cross section for Compton scattering of unpolarized radiation
by electrons are computed. The results for corrections ascribable to virtual photons are finite, relativistically
invariant, and valid at all energies, but contain a term which depends logarithmically on an assumed small
photon mass $\lambda$. A cross section of the same order has also been obtained for double Compton scattering in
which one of the emitted photons has an energy small compared to the rest mass of the electron (with the
electron initially at rest). This contains a term depending on $\ln\lambda$ which exactly compensates the similar
term arising from virtual quanta in all observable cases. Approximations for low and high energies, as
well as numerical results, are given. These disagree with results obtained previously by Schafroth.


THE object of this paper is to obtain the correction
to the differential cross section for Compton
scattering (Klein-Nishina formula) arising from the
possibility that the electron may emit and reabsorb a
virtual photon in connection with the scattering process.
We shall apply the methods developed by one of us¹ to
obtain an explicit cross section to order $e^6$ for unpolarized
radiation, valid (in so far as the theory is valid) at
all energies.

Previous workers have shown that the high frequency 
divergences which enter in the straightforward appli- 
cation of perturbation theory to this problem can be 
removed by charge and mass renormalization. Schafroth²
has obtained a finite e-order matrix element in
relativistic and gauge invariant form. He also showed,
following the treatment of the analogous problem for
scalar particles by Corinaldesi and Jost,⁴ that the
infrared divergence which occurs can be removed by 
addition of the double Compton cross section in which 
the incoming photon produces two photons on inter- 
acting with the electron, and he made explicit evalua- 
tion of the cross section (but not of the double scat- 
tering) in the nonrelativistic and extreme relativistic 
approximations. His results, however, disagree with 
ours in both limits. 

Since the interpretation of any experiment to measure 
the radiative corrections requires a knowledge of the 
double Compton cross section, we have computed this 
also, for the case that one of the emitted photons has 
an energy in the laboratory system which is small 
compared to the electron rest energy. 

After a brief introduction, we shall in Sec. II write 
down and discuss the matrix element for the corrections. 


* Now at Northwestern University. Part of this work was done
in partial fulfillment of the requirements for the Ph.D. degree at
Cornell University.
† Now on leave at Centro Brasileiro de Pesquisas Fisicas,
Rio de Janeiro, Brazil.
¹ R. P. Feynman, Phys. Rev. 76, 749 (1949); and Phys. Rev.
76, 769 (1949).
² M. R. Schafroth, Helv. Phys. Acta 22, 501 (1949).
³ M. R. Schafroth, Helv. Phys. Acta 23, 542 (1950).
⁴ E. Corinaldesi and R. Jost, Helv. Phys. Acta 21, 183 (1948).


Section III will detail the evaluation of the differential 
cross section. Section IV will be concerned with the 
infrared catastrophe and the double Compton effect. 
Sections V and VI will discuss limiting cases and some 
numerical results. Mathematical details will be reserved 
for the appendices. 

The method of calculating this effect is given by 
Feynman,⁵ and for brevity we will not repeat the
discussion here but will simply carry out the explicit 
evaluation of the matrix elements involved. Our nota- 
tion is that of reference 1. 

Some improvement has been made in the method of 
computing matrix elements given in reference 1(b). 
This is described here in detail in Appendix Y. 


I. THE KLEIN-NISHINA FORMULA FOR
UNPOLARIZED RADIATION


The direct Compton effect, in which a photon of momentum $q_1$, polarization $e_1$, impinges on an electron of initial momentum $p_1$, to be scattered as a new photon of momentum $q_2$, polarization $e_2$, is represented by a matrix element
$$W = R + S \tag{1a}$$
with
$$R = \mathbf{e}_2(\mathbf{p}_1+\mathbf{q}_1-m)^{-1}\mathbf{e}_1,\qquad S = \mathbf{e}_1(\mathbf{p}_1-\mathbf{q}_2-m)^{-1}\mathbf{e}_2. \tag{1b}$$
The final momentum of the electron is, of course, $p_2 = p_1 + q_1 - q_2$. The terms correspond to the diagrams of Fig. 1.

We shall call
$$p_3 = p_1 + q_1 = p_2 + q_2,\qquad p_4 = p_1 - q_2 = p_2 - q_1, \tag{2}$$
and define the important invariants $\kappa$, $\tau$ by
$$m^2\kappa = m^2 - p_3^2 = -2p_1\cdot q_1 = -2p_2\cdot q_2,\qquad m^2\tau = m^2 - p_4^2 = 2p_1\cdot q_2 = 2p_2\cdot q_1. \tag{3}$$
In the laboratory system, with $\omega_1$ and $\omega_2$ the energies of the incoming and outgoing photons, $\kappa$ is $-2\omega_1/m$ and $\tau$ is $2\omega_2/m$. In terms of the quantities defined in


(2) and (3) we have
$$m^2\kappa\,R = -\mathbf{e}_2(\mathbf{p}_3+m)\mathbf{e}_1,\qquad m^2\tau\,S = -\mathbf{e}_1(\mathbf{p}_4+m)\mathbf{e}_2. \tag{4}$$
The differential cross section for the final photon to go into solid angle $d\Omega$, if the initial electron is at rest (laboratory system, $\mathbf{p}_1 = m\gamma_t$), is
$$d\sigma = e^4\,d\Omega\,(\omega_2^2/\omega_1^2)\,F \tag{5}$$
where $F$ is the square of the matrix element of $W$ (1a),
$$F = |(W)_{21}|^2. \tag{6}$$
If we are uninterested in the spin states of the electron, $F$ may be replaced by $(2m^2)^{-1}U$ where
$$U = \tfrac14\,\mathrm{Sp}[(\mathbf{p}_2+m)W(\mathbf{p}_1+m)\tilde W]. \tag{7}$$
If, in addition, unpolarized radiation is used and the sum over polarization directions is required, $\mathbf{e}_1$ can be replaced by $\gamma_\alpha$ and $\mathbf{e}_2$ by $\gamma_\beta$ in the spur and half the sum over $\alpha$, $\beta$ taken (reference 1(b), Sec. 8). Then the term in (7) which is second order in $R$ is
$$\tfrac12(2m^2\kappa)^{-2}\,\mathrm{Sp}[(\mathbf{p}_2+m)\gamma_\beta(\mathbf{p}_3+m)\gamma_\alpha(\mathbf{p}_1+m)\gamma_\alpha(\mathbf{p}_3+m)\gamma_\beta] = 4/\kappa^2 - \tau/\kappa - 2/\kappa. \tag{8}$$
The reduction can be accomplished by Eqs. (4a) and (36a) of reference 1(b). The term of second order in $S$ is (8) with $\kappa$, $\tau$ interchanged, since $S$ is obtained from $R$ by replacing $p_3$ by $p_4$ after the average is taken on photon polarization. The cross term is
$$(2m^2\kappa)^{-1}(2m^2\tau)^{-1}\,\mathrm{Sp}[(\mathbf{p}_2+m)\gamma_\beta(\mathbf{p}_3+m)\gamma_\alpha(\mathbf{p}_1+m)\gamma_\beta(\mathbf{p}_4+m)\gamma_\alpha] = 8/\kappa\tau - 2/\tau - 2/\kappa. \tag{9}$$
The sum gives for $U = m^2\sum_{\rm spin}\sum_{\rm pol}|(W)_{21}|^2$:
$$U = 4(\kappa^{-1}+\tau^{-1})^2 - 4(\kappa^{-1}+\tau^{-1}) - (\kappa/\tau+\tau/\kappa) \tag{10}$$
and for the Klein-Nishina formula in terms of $\kappa$, $\tau$ we have:
$$d\sigma = (2\pi e^4/m^2)\,(d\tau/\kappa^2)\,U. \tag{11}$$
In the laboratory system, in view of $\kappa = -2\omega_1/m$, $\tau = 2\omega_2/m$ and the Compton relation
$$\omega_1\omega_2(1-\cos\varphi) = m(\omega_1-\omega_2), \tag{12}$$
(11) can be written in the usual way
$$d\sigma = (e^4/2m^2)\,d\Omega\,(\omega_2^2/\omega_1^2)(\omega_1/\omega_2+\omega_2/\omega_1-\sin^2\varphi). \tag{13}$$

⁵ This problem is discussed in reference 1, Appendix D, p. 788.


II. THE $e^4$-ORDER MATRIX ELEMENT


The diagrams of the first radiative corrections to term $R$ of the Compton effect are given in Fig. 2. (See reference 1(b), Fig. 9.) The terms containing the analogous modifications of $S$ can be obtained throughout by the interchange of $e_1$ and $e_2$, of $q_1$ and $-q_2$, and of $p_3$ and $p_4$. In the final result this means simply an interchange of $\kappa$ and $\tau$. Hence we need study only $R$, the $S$ terms being obtained from the $R$ terms immediately.

Terms N′ and N″ give zero since there are no vacuum polarization effects for free photons.

Terms M′ and M″ together give a factor⁶ $r/2i$ times $R$, where
$$r = \ln(\Lambda/m) + 9/4 - 2\ln(m/\lambda) \tag{14}$$
as shown in reference 1(b), Sec. 6. The quantity $\Lambda$ is a temporary high frequency cutoff, introduced so that each diagram can be separately evaluated. The final result will become independent of $\Lambda$ as $\Lambda\to\infty$. The "infrared catastrophe" discussed in Sec. IV is treated, at this point, by assuming the photons to have a small rest mass $\lambda$.
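The term L is
$$L = \int \mathbf{e}_2(\mathbf{p}_3-m)^{-1}\gamma_\mu(\mathbf{p}_3-\mathbf{k}-m)^{-1}\gamma_\mu(\mathbf{p}_3-m)^{-1}\mathbf{e}_1\,k^{-2}\,d^4k\,C(k^2). \tag{15}$$
From this must be subtracted the mass correction for an electron travelling between the absorption and emission of the virtual quantum. Since (to order $\Delta m$)
$$(\mathbf{p}-m-\Delta m)^{-1} = (\mathbf{p}-m)^{-1} + (\mathbf{p}-m)^{-1}\,\Delta m\,(\mathbf{p}-m)^{-1},$$
this gives just the expression for L except that $\Delta m$ replaces
$$\int \gamma_\mu(\mathbf{p}_3-\mathbf{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k\,C(k^2),$$
where $\Delta m$ is the mass correction for cutoff $\Lambda$ [reference 1(b), Eq. (21)]:
$$\Delta m = im[\tfrac38 + \tfrac32\ln(\Lambda/m)]. \tag{16}$$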


Since this diagram occurs for problems other than the 
one we consider here, we give the result in a general 
way. Each (p—m)-' propagation factor has, as a 
consequence of diagrams like L, a correction to the 
first order in e given by 


(p—m)— | yu(p— R—m)"'y,RdtRC(R’) (p—m) 
—Am(p—m)~? 
= 4s (pm) n(a*/m 
5 4  4(2—n) 
+--—— Inn 
2 y—-1 (y—-1) 


—m(p~ m4 -7 im}, ey) 
n—1 (y—1)? 


where m?n=m?— p’. 


⁶ The factor obtained in reference 1(b) is $-(e^2/2\pi)r$, but we have reserved a factor $e^2/\pi i$ for later inclusion.

Terms K′ and K″ again possess a feature common to several problems, and we will therefore first discuss it in a general way. In all problems in which an electron interacts with a potential or a free or virtual photon there will be a piece of the diagram like Fig. 3. That is, there will be a partial factor in one of the matrix elements:
$$T = \int \gamma_\mu(\mathbf{p}+\mathbf{q}-\mathbf{k}-m)^{-1}\mathbf{e}(\mathbf{p}-\mathbf{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k\,C(k^2). \tag{18}$$
It would be most convenient to have this evaluated in the general case of arbitrary $p$ and $q$. However, we have evaluated it only in the special case that $q^2=0$, $p^2=m^2$, with the matrix operating on a state $u$ such that $\mathbf{p}u = mu$. Calling $m^2\kappa = -2p\cdot q$ it is (Appendix Z):


1 
8iT=4u-Lmbe-t 2x-H(e-p)q] f in(1—v)dv/s 
i-« 


+2[(2m’*-+ pq—gp)e 
+ 2n-"(e-p) (q+ mx)(3x—2)(x— 1)“ ](x— 1) Ink 
+[2 In(m?/A*)—1 me 
—4(e-p)L(q+m)(x—1) +e]. (19) 
If the final, rather than the initial, state is a free electron, the matrix required is $T'$, so the result is obtained directly from (19). For term K′, this $T$ for the case $p = p_1$, $q = q_1$, $e = e_1$ is to be multiplied on the left by $\mathbf{e}_2(\mathbf{p}_1+\mathbf{q}_1-m)^{-1} = \mathbf{e}_2(\mathbf{p}_3-m)^{-1}$. Therefore, K′ and the corresponding term K″ together give
$$K = K' + K'' = (8i)^{-1}[\mathbf{e}_2(\mathbf{p}_3-m)^{-1}T(p_1,q_1,e_1) + T'(p_2,q_2,e_2)(\mathbf{p}_3-m)^{-1}\mathbf{e}_1]. \tag{20}$$

If we now examine the coefficients of the term $\ln(m^2/\Lambda^2)$ in K, L, and M, that is in (20), (17), and


FIG. 2. Corrections to term R of Compton scattering.
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(14), we observe that K gives (4m?/8i)R, L gives 
(—2m?/8i)R, and M gives (—2m?/8i)R. Therefore the 
terms dependent on A vanish. Since we shall find that 
the J integral is finite without cutoff, we note that the 
complete result is insensitive to A. 

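The term J is given by
$$J = \int \gamma_\mu(\mathbf{p}_2-\mathbf{k}-m)^{-1}\mathbf{e}_2(\mathbf{p}_3-\mathbf{k}-m)^{-1}\mathbf{e}_1(\mathbf{p}_1-\mathbf{k}-m)^{-1}\gamma_\mu\,k^{-2}\,d^4k. \tag{21}$$
For large $k$ the factors in the integrand vary as $k^{-\nu}$ with $\nu=5$ and the integration over $k$-space therefore converges. If we had included the convergence factor $C(k^2)$, the result would be independent of $\Lambda$ as $\Lambda\to\infty$.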

When the reciprocals are rationalized (e.g., $(\mathbf{p}_2-\mathbf{k}-m)^{-1} = (\mathbf{p}_2-\mathbf{k}+m)[(p_2-k)^2-m^2]^{-1}$), powers of $k$ up to the third appear in the numerator of the integrand. Therefore we shall have to evaluate integrals of the form:
$$J_{(0;\sigma;\sigma\tau;\sigma\tau\rho)} = \int (1;\,k_\sigma;\,k_\sigma k_\tau;\,k_\sigma k_\tau k_\rho)\,[(p_2-k)^2-m^2]^{-1}[(p_3-k)^2-m^2]^{-1}[(p_1-k)^2-m^2]^{-1}\,k^{-2}\,d^4k. \tag{22}$$
That is, for $J_0$ the factor $(1; k_\sigma; \cdots)$ is replaced by unity, for $J_\sigma$ by $k_\sigma$, for $J_{\sigma\tau}$ by $k_\sigma k_\tau$, and for $J_{\sigma\tau\rho}$ by $k_\sigma k_\tau k_\rho$. The manner in which J can be expressed in terms of these integrals is illustrated, for the case of matrix $T$, in Appendix Z.

The J integrals can be worked out by the parametric methods described in reference 1(b) (Appendix). They involve integrals having four factors in the denominator and will lead, therefore, to integrals over three parameters. ($J_0$ is integrated in this manner in Appendix Y.) Generally these are very difficult to evaluate, although $J_0$ is particularly simple. This fact makes it possible to circumvent some of the difficulties of $J_\sigma$, $J_{\sigma\tau}$, and $J_{\sigma\tau\rho}$.

It is possible to express these other J integrals as linear combinations of the integral $J_0$ and of other integrals, all of which involve only three quadratic factors in the denominator. These latter, in parametric form, require only two parameters (and are much more easily evaluated than a direct attack on $J_\sigma$, say, would indicate). This technique is useful in other problems also⁷ and is described in detail in Appendix Y.


III. CROSS SECTION FOR UNPOLARIZED LIGHT 


If we call the sum $J+K+L+M = R^{(1)}$, then $(e^2/\pi i)R^{(1)}$ will be the correction to the matrix $R$ of the direct effect (1). If the corresponding correction to the term $S$ is called $(e^2/\pi i)S^{(1)}$, the corrected matrix for the Compton effect is
$$W' = W + (e^2/\pi i)W^{(1)} = R + S + (e^2/\pi i)(R^{(1)}+S^{(1)}). \tag{23}$$
The absolute square of the matrix element of $W'$, taken between the initial and final electron states, gives the probability of transition correct to one order in $e^2$ higher than (6). We shall calculate in this paper only the cross section averaged over spin directions of the electron and polarization directions of the photons.

We need the spur: 


i Spl(Pe+m)W'(pit-m)W"] 


as in (7). Considering terms up to the first order in e? 
(which are all that are valid), (24) is 


—3{(2/ni) Spl (bo+-m)W (pi+m)W] 
—(2/ni) Spl (pot+m)W(p,4+m)W)}. (25) 


In evaluating (25) for unpolarized light we have replaced $\mathbf{e}_1$ by $\gamma_\alpha$ and $\mathbf{e}_2$ by $\gamma_\beta$ and taken one-half of the resulting sum as discussed in connection with (8). Some algebraic details are discussed in Appendix Z.⁸

The last two spurs in (25) are complex conjugates, 
so that the correction to U is —e/z times the real part 
of 


(24) 


UW = — (41) Sp[(po-+m)W (pit-m)W], (26) 
That is, $U$ is to be replaced in (11) by
$$U' = U - (e^2/\pi)\,\mathrm{R.P.}\,U^{(1)}. \tag{27}$$


If we let 


P(x, 7) = — (4) Sp[(po+m)R(p,+-m)W] (28) 
then 


U®= P(x, 7) 4+ P(r, «) (29) 


since the $S^{(1)}$ diagrams are obtained from the $R^{(1)}$ diagrams (for unpolarized light) by the interchange of $p_3$ and $p_4$ and of $q_1$ and $-q_2$; hence the final result, simply by interchange of $\kappa$ and $\tau$.


⁷ It has been applied by G. R. Lomanitz to completely evaluate the $e^6$ corrections to the Møller scattering cross section of electrons in his thesis Second Order Effects in the Electron-Electron Interaction, Cornell, 1950. Again, in the problem of scattering of light by light, the integral with unit numerator is easily done, and the other integrals can be reduced to it and simpler integrals algebraically. But here the algebraic complexity makes the problem extremely tedious.

⁸ In actual evaluation it was found easier to take the spur first and perform the integrals later. Thus, in place of the expression $T$ (Eq. 18), the expression $T$ (Eq. A41) was substituted and the values of the integrals from Appendix X substituted after taking the spur. This has the advantage that some of the integrals do not appear, or appear only in simpler combinations.
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The final result obtained in this way is: 
P(x, r)=(1—2y ctnh2y) Ind-U 
—2y ctnh2y[2hk(y) —h(2y) JU 
+[—4y sinh2y(xr)-!(2—cosh2y) 
4 
+2y ctmhyi(o) +100) | Ay ctnhds| — cosh?y 


KT 


k—6 4 1737+ k 
+ sca ---— i| 


2r x 2K 7 


3r  3r 3 7 8 8 2x—P—Kr 
+-—4+—4+-4+1-—+--—+ 

2° 2k of Kr KK DK? r(x—1) 

1 =") pegs - 7 37 
== + yt esc] =e - | 

27 (x—1)? i kK 


4 4x 
11 1 1\? 
—4y tanhy(=—-) +4(-+-) 
2k K OT 
12 3« xk 1 sx 1 
ec 
ck 2r 7 x-1\7r 2 


ror K 1 2 3 
+60(0 +4 peter bao i| 
T KOT 2 =s«K 


zs 
+terms antisymmetric in x, 7, (30) 

where
$$4\sinh^2 y = -(\kappa+\tau) \tag{30a}$$
$$h(y) = y^{-1}\int_0^y u\,du\,\operatorname{ctnh}u \tag{30b}$$
$$G_0(\kappa) = -2\kappa^{-1}\int_{1-\kappa}^{1}\ln(1-u)\,du/u. \tag{30c}$$

This is to be added to the same expression with $\kappa$ and $\tau$ interchanged (29) and the real part taken to get the correction to the Klein-Nishina formula (11). We discuss this result in the following sections.

We might note here, however, that the real part of $P(\kappa,\tau)$ is obtained by writing $\ln|\kappa|$ for $\ln\kappa$ and by writing for $G_0(\kappa)$ expression (30c) with $\ln(1-u)$ replaced by $\ln(u-1)$. Since $\tau$ is always positive, on the other hand, $P(\tau,\kappa)$ is always real. This is discussed further in Appendix W.

The imaginary part of $P(\kappa,\tau)$ is not without interest, as we shall show. This is given by $\pi$ times the coefficient of $\ln\kappa$ in (30) plus $2\pi\kappa^{-1}\ln(1-\kappa)$ times the coefficient of $G_0(\kappa)$.

The loss of total intensity of a beam of photons is of course proportional to the total cross section for a photon to be scattered out of the beam. But this decrease in forward intensity is the result of an interference between the incident photon and a photon scattered exactly in the forward direction. Therefore, as is well known, the imaginary part of the forward scattering amplitude is proportional to the total cross section (formally this is referred to as the unitary property of the S-matrix). We can use this relation to check the imaginary part of $P(\kappa,\tau)$ for the case of zero scattering angle (for which, of course, $p_1 = p_2$, $q_1 = q_2$, $\kappa = -\tau$).
We write P(x, 7) again as a sum 


P(x, 7) = (4mi)EDspin Dot (R™)2 AW), (31) 


and can show easily that »(W),=i/m if there is no spin 
change and no polarization change, and zero otherwise. 
This can be seen, aside from the phase factor i, from 
the fact that for small scattering angles the Klein- 
Njshina formula (13) is de=r,?dQ in the laboratory 
system. Since (R): also vanishes when 2(W), does, 


P(x, T)= MQ spin 2 pol (R®)o, (32) 


But, including all factors, the complete e‘-order matrix 
clement of R® is (according to reference 1(b)): 


@ Ane? 1 


xX=—- -=Doapin Dpor (RO +S”) 
a Gee) gevnin Zoot 16 : 


(33) 


and from the unitary property referred to above, it 
follows that the éofal cross section for Compton scat- 
tering to order e4 is just twice the real part of X. 
Therefore, 


Stotat(to order e*) = 2R.P.X = (27¢°/7)1.P.P(x, 7) (34) 


since S® has no imaginary part. That (30) satisfies 
this identity can be readily verified.® 


IV. THE INFRARED CATASTROPHE AND THE
DOUBLE COMPTON EFFECT


In Sec. III we have derived the differential cross section for Compton scattering for unpolarized light, including radiative corrections, to order $e^6$. The cross section took the form
$$d\sigma = d\sigma_{K.N.}[1+(e^2/\pi)\delta] \tag{35}$$
with
$$\delta = -U^{(1)}/U. \tag{35a}$$
There are two reasons why this result cannot be compared directly with experiment. In the first place $U^{(1)}$ depends on the quantity $\lambda$ to which no experimental significance has been attached. In the second place, it is impossible in principle to design an experiment which will guarantee that one and only one photon is emitted by the electron in the scattering process. The best one can do in an experiment is to require that if a second photon is emitted, its energy is less than some value $k_{\max}$. This can be done, for example, by measuring the energies of the final electron and photon to some specified accuracy, the sum of the errors in the measurement being less than $k_{\max}$. In such an experiment one would be measuring the cross section (35) plus the cross section for the double Compton effect, $d\sigma_D$, integrated over all possible directions of the second quantum and over its energy up to $k_{\max}$.

⁹ W. Heitler, The Quantum Theory of Radiation (Oxford University Press, London, 1944), p. 157, Eq. (53). Our Eq. (33) agrees with this result with $\tau$ replacing Heitler's $2\gamma$.

These two difficulties, both related to quanta of low energy (if $k_{\max}$ is small), in one case virtual, in the other real, are actually related. That this should be so can be seen physically from the fact that it is difficult to distinguish between virtual and real quanta of extremely low energy since, by the uncertainty principle, a measurement made during a finite time interval will introduce an uncertainty in the energy of the quantum, which may enable a virtual quantum to be detected as a real one. It turns out in fact that $d\sigma_D$, integrated to $k_{\max}$, also contains an infrared divergence which just cancels the similar divergence in the radiative corrections. We are computing, of course, only to order $e^6$, but the multiple Compton scattering of a given higher order will also cancel all the radiative infrared catastrophes of the same order.

The problem is analogous to the perturbation theory treatment of the scattering of an electron by a potential, which has been considered by many workers, except that in our case the primary process is the Compton scattering considered in Sec. I. The cross section for emission of an additional photon $q$ of energy $\omega$ goes for small $\omega$ as $(d^3q/\omega)(p_2/p_2\cdot q - p_1/p_1\cdot q)^2$ times the Klein-Nishina formula. Since this diverges as $\omega$ approaches zero, the probability of a single Compton process unaccompanied by such emission is zero. What is experimentally measured, however, is the probability that a Compton process occurs and that no other free photon is emitted except for a class of photons inaccessible to the experiment. This is equal, to our order of calculation, to the probability of the single process plus the probability of a double process in which one of the photons emitted is in the inaccessible class. This class is, of course, determined by the design of the experiment and a single calculation cannot suffice for all experiments. However, one feature common to all experiments will be a finite energy resolution, so that a part of the excluded class must consist of photons whose energy is less than some energy $k_{\max}$.

We will first therefore find that part of the differential cross section for double Compton scattering which gives rise to an infrared divergence. This will be integrated over all the directions of one of the photons, and over its energy from zero to a value $k_{\max}$, which we shall assume is small compared to the electron mass, and added to the previously obtained corrected cross section for single Compton scattering. It has already been pointed out that Schafroth² has demonstrated that a cancellation of the infrared divergence occurs in order $e^6$ when the double Compton cross section is added to the single scattering cross section, but we must obtain at least the zero order term in $k_{\max}$ (for $k_{\max}\ll m$) to obtain a useful result. The completely differential cross section for the double scattering has been computed by Eliezer,¹⁰ but since we wish to make approximations and carry out an integration, it is simpler for us to obtain the desired cross section from the beginning.

FIG. 4. Diagrams for the double Compton effect. Here $p_3 = p_1+q_1$, $p_4 = p_1-q_2$; the momentum condition is $p_1+q_1 = p_2+q_2+q_3$.

Figure 4 gives the diagrams necessary for computing the cross section for double Compton scattering. The photon momentum $q_3 = (\omega_3, \mathbf{q}_3)$ is assumed to be small in the following ($\omega_3\ll m$). This, of course, implies a definite coordinate system. To obtain a finite result we assume the photon has a small rest mass $\lambda$, so that $q_3^2=\lambda^2$. Keeping terms only to order $\omega_3^{-1}$, we neglect $q_3$ occurring in the numerator of the (rationalized) matrix element terms, and terms of order $\omega_3$ compared to $p_3^2-m^2$ and $p_4^2-m^2$ in the denominators.

We find that terms III and IV are not of the desired order. In view of the fact that we are to take matrix elements between the free electron states $u_1$ and $u_2$, a factor $(\mathbf{p}_1+m)\mathbf{e}_3$ (with $e_3$ the polarization vector of $q_3$) operating on the left of $u_1$ is equivalent to $2p_1\cdot e_3$ and a factor $\mathbf{e}_3(\mathbf{p}_2+m)$ operating on the right of $u_2$ is equivalent to $2p_2\cdot e_3$. Thus, with $q_3$ small, we get
$$(\mathrm{I}) = -R\,\frac{p_1\cdot e_3}{p_1\cdot q_3},\quad (\mathrm{II}) = -S\,\frac{p_1\cdot e_3}{p_1\cdot q_3},\quad (\mathrm{V}) = R\,\frac{p_2\cdot e_3}{p_2\cdot q_3},\quad (\mathrm{VI}) = S\,\frac{p_2\cdot e_3}{p_2\cdot q_3}. \tag{36}$$

¹⁰ C. J. Eliezer, Proc. Roy. Soc. (London) A187, 210 (1946). When the class of photons inaccessible to the experiment does not consist simply of those below a given very small energy $k_{\max}$ (but consists, for example, of those in a given solid angle, or with a limited momentum component, or having energies too large to permit the approximations we have made) the contribution which these events make to the measured cross section can be obtained from Eliezer's formula. Explicitly, one must add to our result (39) the cross section for the double process given by Eliezer, integrated over all the photons in the class inaccessible to the experiment but which also exceed some arbitrary very small energy $k_{\max}$. The sum, of course, will not depend on $k_{\max}$.

Adding these we find the matrix for the double Compton process:
$$(R+S)\left(\frac{p_2\cdot e_3}{p_2\cdot q_3}-\frac{p_1\cdot e_3}{p_1\cdot q_3}\right),\qquad |\mathbf{q}_3|\ll m. \tag{37}$$
Taking the absolute square of (37) and averaging over polarizations and spins in the usual manner, it is clear that we obtain the Klein-Nishina cross section $d\sigma_{K.N.}$ (Eq. (11)) multiplied by the following factors: (a) $d^3q_3/(2\pi)^3$, the density of states for $q_3$ (neglecting its effect on the momentum balance, and therefore assuming it is emitted independently of $q_2$), (b) $e^2$, from the additional interaction vertex, (c) $2\pi/\omega_3$, the normalization factor for the photon $q_3$, (d)
$$\sum_{\rm pol}\left(\frac{p_2\cdot e_3}{p_2\cdot q_3}-\frac{p_1\cdot e_3}{p_1\cdot q_3}\right)^2.$$
We collect these factors and integrate the photon momentum over all angles and from $|\mathbf{q}_3|=0$ to the sphere $|\mathbf{q}_3|=k_{\max}$, where $k_{\max}\ll m$. Thus,¹¹
$$d\sigma_D = -\frac{e^2}{(2\pi)^2}\,d\sigma_{K.N.}\int_{|\mathbf{q}_3|=0}^{|\mathbf{q}_3|=k_{\max}}\left(\frac{p_2}{p_2\cdot q_3}-\frac{p_1}{p_1\cdot q_3}\right)^2\frac{d^3q_3}{(\mathbf{q}_3^2+\lambda^2)^{1/2}}. \tag{38}$$

Observe that if we replace $d\sigma_{K.N.}$ by the cross section $d\sigma_0$ for an arbitrary process of one electron, the result (38) is valid for that process with one additional photon of small energy in the final state. For if the "small" photon $q_3$ is emitted from an electron line of momentum $p$ where $p^2\neq m^2$, its effect will be negligible. The only diagrams contributing to the process with $q_3$ emitted will be those of the original process, modified by the emission of $q_3$ either before or after the original process. Thus the factor of $(R+S)$ in (37) will always be the factor modifying the original matrix element (for small $q_3$) and the result (38) is the general result for an arbitrary process of one electron, $d\sigma_0$ replacing $d\sigma_{K.N.}$.

It is most convenient to impose our restriction $k_{\max}\ll m$ in the laboratory system.¹² In this case, the result of integrating (38) (expressed in terms of invariants) is
$$d\sigma_D = -(e^2/\pi)\,d\sigma_{K.N.}\{2(1-2y\operatorname{ctnh}2y)[\ln(2k_{\max}/\lambda)-\tfrac12] + 4y\operatorname{ctnh}2y\,[h(2y)-1]\}, \tag{39}$$
with $y$ and $h(y)$ defined in (30).


¹¹ This expression (with $\lambda=0$) has been obtained previously. See, for example, R. Jost, Phys. Rev. 72, 815 (1947), and F. Bloch and A. Nordsieck, Phys. Rev. 52, 54 (1937).

¹² J. Schwinger, Phys. Rev. 76, 790 (1949) has integrated (38) for the case of the scattering of an electron in a potential.

When (39) is added to (35) the effect is to replace the quantity
$$\{2(1-2y\operatorname{ctnh}2y)\ln\lambda - 4y\operatorname{ctnh}2y\,[2h(y)-h(2y)]\}U$$
in $U^{(1)}$ by
$$\{2(1-2y\operatorname{ctnh}2y)[\ln(2k_{\max})-\tfrac12] + 8y\operatorname{ctnh}2y\,[h(2y)-h(y)-\tfrac12]\}U. \tag{40}$$

We have now arrived at a physically understandable result as the quantity $k_{\max}$ which replaces $\lambda$ in $U^{(1)}$ is the sum of the experimental uncertainties in the measurement of the final energies of the Compton scattered photon and electron. Our result can be compared with an experiment provided the energy resolution¹⁰ is known and $k_{\max}$ is sufficiently small.

Under the limits of validity of our formula, the term in (40) containing $k_{\max}$ is positive, and thus makes a negative contribution to the cross section. As the energy resolution of an experiment improves, it is thus found that the measured Compton cross section gets smaller. This is reasonable since we are eliminating from our observations more double Compton events.

The expression (35a) for $\delta$ with $U^{(1)}$ given by (29) and $U$ given by (10) is valid also for the correction to the two-quantum pair annihilation and the two-quantum pair production processes, provided that for the former problem we replace $\kappa$ by $-\kappa$ and for the latter $\tau$ by $-\tau$. This occurs because in writing down the matrix element we represent the emission of a photon by $-q$ and its absorption by $+q$, and because a matrix $\mathbf{p}$ representing an electron also represents a positron of four-momentum $-p$. However, the infrared divergences in these problems are not compensated by the corresponding three-quantum processes¹³ (which are not divergent) but by the effect of Coulomb interaction. This will not be discussed further in this paper.


V. EXTREME RELATIVISTIC LIMIT


The Compton formula (12) can be written in the laboratory system, with scattering angle $\varphi$, as
$$(\kappa+\tau)/\tau = \tfrac12\kappa(1-\cos\varphi). \tag{41}$$
We assume $|\kappa|\gg 1$ and consider the three cases listed in Table I. This table also lists the approximations made in obtaining the formulas for $U$ and $U^{(1)}$ for the three cases and the corresponding conditions on the laboratory and center-of-mass scattering angles. The energies (in units of $mc^2$) of the incoming and outgoing quanta are represented in the laboratory system by $\omega_1$ and $\omega_2$, respectively, and in the c.m. system by $\nu$. The results are as follows:
Case I, 
U=2 
U®=4(1—2y ctnh2y) Inkn—8y ctnh2y[2h(y)—A(2y)] 
+4yh(y) ctnhy+ In| «| (4y tanhy— 1) 
~~ 2’ —4y tanhy+3— (In| x])?—72?/6. (42) 


¹³ Note that the two quantum pair processes are symmetric with respect to interchange of $p_1$ and $p_2$ so that (37) vanishes.
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Case II. 


U=—Le/)+(/0)] 
U=U{(1—2y)G-+2 Ind)-+2y"— 42/6} 


“dla ITO) 
saad o(2e2) [CoD 


—In(14=) tn +. (43) 


K 
1+- 
? 


T 


K 


Case III. 
U=—k/r 
tH1 1 
aa 
t  2r(r—-1) 


3 

UOs u| 2(1—2y) Ind--Ine 2y—- 
a 1 +r 3.2 

4 Gin (+5) ++}, (44) 
3 7 2 267 


The corrected cross sections, in the relativistic limit, are given in the laboratory system by
$$d\sigma = (r_0^2/2)\,d\Omega\,(\tau^2/\kappa^2)[U-(e^2/\pi)U^{(1)}] \tag{45}$$
and in the c.m. system by
$$d\sigma = (r_0^2/8\nu^2)\,d\Omega\,[U-(e^2/\pi)U^{(1)}] \tag{46}$$
with $r_0 = e^2/mc^2$.

As we have explained in the previous section, for actual comparison with experiment one must add to (45) the cross section for double Compton scattering (39) which is valid, of course, only in the laboratory system with $k_{\max}\ll m$. If we write (39) as
$$d\sigma_D = (-e^2/\pi)(r_0^2/2)\,d\Omega\,(\tau^2/\kappa^2)\,U_D \tag{47}$$
then we must replace $U^{(1)}$ in (45) and (46) by $U^{(1)}+U_D$.
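The laboratory form (45) rests on the identity between the invariant expression (10) for $U$ and the familiar Klein-Nishina bracket of (13), together with the Compton relation in the forms (12) and (41). The following sketch checks these identities numerically for a few arbitrary energies and angles; the function names and the sample values are illustrative assumptions, and energies are measured in units of the electron mass.

```python
import math


def compton_invariants(omega1, phi, m=1.0):
    """Return (kappa, tau, omega2) in the laboratory system."""
    # Compton relation (12): omega1*omega2*(1 - cos phi) = m*(omega1 - omega2)
    omega2 = omega1 / (1.0 + (omega1 / m) * (1.0 - math.cos(phi)))
    kappa = -2.0 * omega1 / m
    tau = 2.0 * omega2 / m
    return kappa, tau, omega2


def U_invariant(kappa, tau):
    """U expressed through the invariants, Eq. (10)."""
    s = 1.0 / kappa + 1.0 / tau
    return 4.0 * s * s - 4.0 * s - (kappa / tau + tau / kappa)


def U_laboratory(omega1, omega2, phi):
    """Angular bracket of the Klein-Nishina formula, Eq. (13)."""
    return omega1 / omega2 + omega2 / omega1 - math.sin(phi) ** 2


if __name__ == "__main__":
    for omega1, phi in [(0.1, 0.5), (1.0, 1.2), (30.0, 2.5)]:
        kappa, tau, omega2 = compton_invariants(omega1, phi)
        lhs_41 = (kappa + tau) / tau
        rhs_41 = 0.5 * kappa * (1.0 - math.cos(phi))
        print(omega1, phi,
              U_invariant(kappa, tau), U_laboratory(omega1, omega2, phi),
              lhs_41, rhs_41)
```

In each printed row the two values of $U$ coincide, as do the two sides of (41).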


For our three cases, we get: 


Taste I. Approximation made in obtaining the extreme 
relativistic limit of (30), expressed in invariants, laboratory 
system quantities, and c.m. system quantities. 


Lab system 
Case Defined by Leads to conditions c.m, conditions 
IT [(xt+r)/t|Kt [x] ert wor, tan?(8/2)<«1 
1—cosy1/a1 
(¢ near 0) 
I [(x+7)/r|~1 m1 \ w: Near wi/2,  cos’(8/2)>>1, 
[xtr[>>1f 1-cosewi/a,  sin?(6/2)>>1 
(g near 
(2/e1)4) 
w, near I, tan?(@/2)>>1 


PT 


IW [(x+7)/r[>>1 [«tr| og 
|x| 1—cose>1/e1 
(g>>(2/e1)}) 
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17,6 Mev 


FIG. 5. Plot of $\delta' = -U^{(1)}/U - 2(1-2y\operatorname{ctnh}2y)\ln\lambda$ for 2.62 Mev and 17.6 Mev as calculated from the exact expressions (29) and (30) (solid curve). The dotted curves are calculated from the extreme relativistic formulas (42), (43), and (44), and are numbered accordingly. (Horizontal axis: laboratory angle $\varphi$ in degrees.)


Case I.
Same as (39).
Cases II and III.
$$U_D = U\{2(1-2y)[\ln(2k_{\max}/\lambda)-\tfrac12] + 4y[y+(\pi^2/24y)-1]\}. \tag{48}$$


In Fig. 5 we have plotted a comparison of the exact expression for $-U^{(1)}/U$ (leaving out the term proportional to $\ln\lambda$) with the limiting cases expressed by (42), (43), and (44). That is, if we write $U^{(1)} = a\ln\lambda + b$, we have plotted $-b/U$. Especially simple formulas result for $b$ in the extreme relativistic limit for the cases $\varphi=0$ and $\varphi=180°$. At zero angle (where incidentally, the double Compton effect and the $\ln\lambda$ term in $U^{(1)}$ vanish):
$$b(0°) = -\ln\tau\,(1+\ln\tau) + 1.355. \tag{49}$$
At 180 degrees,
$$b(180°) = -4.225\,U. \tag{50}$$
Another simple case results from the condition $\kappa=-2\tau$, in the extreme relativistic limit. This corresponds to 90° scattering in the c.m. system. Here we get $U=5/2$ and

5(90°, c.m.)= (2? ~ 3y—2.29)U. (51) 
TABLE II. Percent correction to the Compton cross section for unpolarized light arising from Eq. (30), excluding the term proportional to $\ln\lambda$, at zero degrees and at ninety degrees in the c.m. system; computed as a function of the laboratory energy of the incident photon from the special equations (49) and (51).

Laboratory energy    $-(e^2/\pi)b(0°)/U$    $-(e^2/\pi)b(90°, \mathrm{c.m.})/U$
(Mev)                (percent)              (percent)
  50                  3.80                   0.32
 150                  5.26                  -0.87
 300                  6.41                  -2.13
1000                  8.80                  -4.35

A few results for high energies computed from (49) and (51) are given in Table II. At 180°, the quantity $-(e^2/\pi)b/U$ is +0.98 percent.
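The zero-degree column of Table II and the 180° figure can be recomputed directly from (49) and (50). The sketch below does so under stated assumptions: $e^2$ is taken as 1/137.036, the electron rest energy as 0.511 Mev, and $U=2$ at zero angle (which follows from (10) with $\kappa=-\tau$); the function names are illustrative. With these inputs the recomputed zero-degree values come out within roughly 0.1 percentage point of the tabulated entries rather than reproducing them digit for digit.

```python
import math

ALPHA = 1.0 / 137.036      # assumed value of e^2 (units with hbar = c = 1)
ELECTRON_MEV = 0.511       # assumed electron rest energy in Mev


def b_forward(energy_mev):
    """b(0 deg) from Eq. (49); at zero angle tau = |kappa| = 2*omega_1/m."""
    tau = 2.0 * energy_mev / ELECTRON_MEV
    log_tau = math.log(tau)
    return -log_tau * (1.0 + log_tau) + 1.355


def percent_forward(energy_mev):
    """-(e^2/pi) b(0 deg)/U in percent, with U = 2 at zero angle."""
    return -(ALPHA / math.pi) * b_forward(energy_mev) / 2.0 * 100.0


def percent_backward():
    """-(e^2/pi) b(180 deg)/U in percent, using b(180 deg) = -4.225 U, Eq. (50)."""
    return -(ALPHA / math.pi) * (-4.225) * 100.0


if __name__ == "__main__":
    for energy in (50.0, 150.0, 300.0, 1000.0):
        print(energy, round(percent_forward(energy), 2))
    print("180 degrees:", round(percent_backward(), 2))
```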

VI. NONRELATIVISTIC LIMIT


In the nonrelativistic or Thomson limit our results will be equally valid in the laboratory and c.m. systems, since we will keep only the first nonvanishing terms. In the c.m. system, we let the scattering angle be $\varphi$ and $|\mathbf{q}_1| = |\mathbf{q}_2| = \omega$. Then
$$\kappa = -2p_1\cdot q_1 = -2[(1+\omega^2)^{1/2}\omega+\omega^2],\qquad \tau = 2p_1\cdot q_2 = 2[(1+\omega^2)^{1/2}\omega+\omega^2\cos\varphi], \tag{52}$$
so that for $\omega\ll 1$:
$$\kappa = -2\omega(1+\omega+\omega^2/2+\cdots),\qquad \tau = 2\omega(1+\omega\cos\varphi+\omega^2/2+\cdots). \tag{53}$$
Since $\kappa$ depends only on $\omega$ and $d\tau = \omega^2\,d\Omega/\pi$, Eq. (11) becomes
$$d\sigma = (e^4/2m^2)\,d\Omega\,(1-2\omega+2\omega^2+\cdots)[U-(e^2/\pi)U^{(1)}]. \tag{54}$$

In Table III we give the nonrelativistic limits of functions occurring in $U^{(1)}$ (Eq. 30). A straightforward calculation then yields:
$$U = 1+\cos^2\varphi+O(\omega) \tag{55}$$
UW = —(4/3)0%(1—cosy)U Ind 
+(1+cospe+cos?y—4 cos? y)4w Inw+0(w*). (56) 


The angular dependence of the 4w? Inw term is plotted 
as f(g) in Fig. 6. Expression (56) disagrees with the 
result of Schafroth.’ 

Since (1/x)+(1/7)=4(1—cosg)—(w/2) sin’y in the 
c.m. system and the same invariant is $(1—cos@) in 
the laboratory system, with @ the laboratory scattering 
angle, it is clear that cos@=cosy+0(w) for given «, 7. 
Also, since «+7=—2w,w»(1—cos0) = — 2w*(1—cosy) 
and w,;=w2+0(w), we get that w= o+0(w*) =u. and 
therefore (56) is valid with interpreted as the labora- 
tory scattering angle and w the incident photon energy. 
Equation (55) is valid, with this interpretation, to 
order w’. 

The double Compton effect (39) gives in this limit (with $U_D$ defined as in (47))
$$U_D = -(4/3)\omega^2(1-\cos\varphi)\,U\ln(2k_{\max}/\lambda)+O(\omega^2). \tag{57}$$

It will be observed that all the corrections vanish in the zero energy limit.
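The coefficient of the infrared logarithm in this limit can be checked numerically. In the c.m. system (52) gives $-(\kappa+\tau) = 2\omega^2(1-\cos\varphi)$, so by (30a) $\sinh^2 y = \tfrac12\omega^2(1-\cos\varphi)$, and the factor $2(1-2y\operatorname{ctnh}2y)$ multiplying the logarithm in (39) should approach $-(4/3)\omega^2(1-\cos\varphi)$. The sketch below compares the two and also the expansions (53) with the exact invariants; names and sample values are illustrative assumptions, with energies in units of the electron mass.

```python
import math


def cm_invariants(omega, phi):
    """Exact kappa and tau in the c.m. system, Eq. (52), with m = 1."""
    root = math.sqrt(1.0 + omega ** 2)
    kappa = -2.0 * (root * omega + omega ** 2)
    tau = 2.0 * (root * omega + omega ** 2 * math.cos(phi))
    return kappa, tau


def infrared_coefficient(kappa, tau):
    """2(1 - 2y ctnh 2y), with y fixed by 4 sinh^2 y = -(kappa + tau), Eq. (30a)."""
    y = math.asinh(0.5 * math.sqrt(-(kappa + tau)))
    return 2.0 * (1.0 - 2.0 * y / math.tanh(2.0 * y))


def infrared_coefficient_nr(omega, phi):
    """Nonrelativistic limit appearing in (57): -(4/3) omega^2 (1 - cos phi)."""
    return -(4.0 / 3.0) * omega ** 2 * (1.0 - math.cos(phi))


if __name__ == "__main__":
    phi = 1.0   # scattering angle in radians; phi = 0 is excluded since y = 0 there
    for omega in (0.2, 0.05, 0.01):
        kappa, tau = cm_invariants(omega, phi)
        ratio = infrared_coefficient(kappa, tau) / infrared_coefficient_nr(omega, phi)
        print(omega, kappa, tau, ratio)
```

The printed ratio tends to unity as $\omega$ decreases, in line with the statement that the corrections vanish in the zero energy limit.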


APPENDIX W. ON THE TRANSCENDENTAL
FUNCTIONS $G_0(\kappa)$ AND $h(y)$


The complete expression for the radiative correction (29) is expressed in terms of the relatively unfamiliar transcendental integrals $G_0(\kappa)$ and $h(y)$. These can both be expressed, however, in terms of one of the so-called Spence functions,¹⁴ namely
$$L(x) = \int_0^x \ln(1-u)\,du/u \tag{A1}$$
which we shall consider briefly.

It is well known that $L(-1)=\pi^2/12$, $L(1)=-\pi^2/6$. If $|x|<1$,
$$L(x) = -\int_0^x (u+u^2/2+u^3/3+\cdots)\,du/u = -(x+x^2/4+x^3/9+\cdots). \tag{A2}$$
If $x>1$,
$$L(x) = L(1)+\int_1^x \ln(1-u)\,du/u = L(1)+\int_1^x [\ln|1-u|\pm i\pi]\,du/u = \bar L(x)\pm i\pi\ln x \tag{A3}$$
where
$$\bar L(x) = \int_0^x \ln|1-u|\,du/u.$$
For computational convenience, we can also write for $x>1$:
$$\bar L(x) = -\tfrac13\pi^2 - L(1/x) + \tfrac12(\ln x)^2, \tag{A4}$$
and in a similar manner
$$L(-x) = (\pi^2/6) - L(-1/x) + \tfrac12(\ln x)^2. \tag{A5}$$
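The special values quoted above and the inversion relations (A4) and (A5) can be confirmed by brute-force quadrature of the defining integral (A1). The sketch below is only an illustration: the midpoint rule, the step counts, and the function names are choices made here. The quadrature integrates $\ln|1-u|/u$, so for $x>1$ it returns the real part $\bar L(x)$; near $u=1$ the logarithmic singularity limits its accuracy, which is why a looser tolerance is appropriate there.

```python
import math


def L_series(x, terms=400):
    """Power series (A2): L(x) = -(x + x^2/4 + x^3/9 + ...), for |x| < 1."""
    return -sum(x ** n / n ** 2 for n in range(1, terms + 1))


def L_quadrature(x, steps=100000):
    """Midpoint-rule value of integral_0^x ln|1-u| du/u (x may be negative)."""
    du = x / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += math.log(abs(1.0 - u)) / u
    return total * du


def A4_rhs(x):
    """Right-hand side of (A4), for x > 1."""
    return -math.pi ** 2 / 3.0 - L_series(1.0 / x) + 0.5 * math.log(x) ** 2


def A5_rhs(x):
    """Right-hand side of (A5), giving L(-x) for x > 1."""
    return math.pi ** 2 / 6.0 - L_series(-1.0 / x) + 0.5 * math.log(x) ** 2


if __name__ == "__main__":
    print("L(1)  :", L_quadrature(1.0), -math.pi ** 2 / 6.0)
    print("L(-1) :", L_quadrature(-1.0), math.pi ** 2 / 12.0)
    print("(A4)  :", L_quadrature(2.5), A4_rhs(2.5))
    print("(A5)  :", L_quadrature(-3.0), A5_rhs(3.0))
```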
Since $\kappa = (m^2-p_3^2)/m^2$ is always negative, $G_0(\kappa) = 2\kappa^{-1}[L(1-\kappa)-L(1)]$ has an imaginary part whose sign is not determined by (A3). To fix the sign we must recall that according to the scheme of reference 1(b), all photons and electrons are considered to have a small additional negative imaginary mass. Thus $\kappa$ has a small positive imaginary part, i.e., $\kappa = -|\kappa|+i\delta$ with $\delta$ vanishingly small. Therefore
$$G_0(\kappa) = 2\kappa^{-1}[\bar L(1-\kappa)-L(1)] + i\pi\,2\kappa^{-1}\ln(1-\kappa). \tag{A6}$$
(Similarly, the term $\ln\kappa$ in (30) is equal to $\ln|\kappa|+i\pi$.) We might also note that since $\tau$ is always positive, $G_0(\tau)$ has no imaginary part.

Thus if $-\kappa\gg 1$, $\tau\gg 1$ (from A4, A5):
$$G_0(\kappa) \approx 2\kappa^{-1}[\tfrac12(\ln|\kappa|)^2-(\pi^2/6)+i\pi\ln|\kappa|] \tag{A7}$$
$$G_0(\tau) \approx 2\tau^{-1}[\tfrac12(\ln\tau)^2+(\pi^2/3)]. \tag{A8}$$

¹⁴ For references see Fletcher, Miller, and Rosenhead, An Index of Mathematical Tables (Scientific Computing Service, Ltd., London, 1946), p. 343.

FIG. 6. Angular dependence of nonrelativistic limit of $U^{(1)}$ (without $\ln\lambda$ term). (Axes: $f(\varphi)$ against laboratory angle $\varphi$ in degrees.)


Our other transcendental function

$$ h(y) = y^{-1}\int_0^y u\,du\,\coth u $$

can also be expressed in terms of $L(x)$. Integration by
parts gives

$$ h(y) = \ln(\sinh y) - y^{-1}\int_0^y \ln(\sinh u)\,du. \tag{A9} $$

Letting $t=e^{-2u}$ yields $\ln(\sinh u)=\ln(1-t)-\ln 2t^{1/2}$ and the
integral is obtained directly:

$$ h(y) = \ln(2\sinh y) - y/2 + (2y)^{-1}[\pi^2/6 + L(e^{-2y})]. \tag{A10} $$
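
As a cross-check of (A10), one can compare the defining integral for $h(y)$ with the closed form. This is a minimal sketch under the same assumptions as before (Simpson quadrature, real arguments); the helper names are ours.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of panels."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def spence_L(x):
    """L(x) of (A1) for 0 <= x < 1."""
    return simpson(lambda u: -1.0 if u == 0.0 else math.log1p(-u) / u, 0.0, x)

def h_direct(y):
    """h(y) = (1/y) * integral from 0 to y of u coth(u) du."""
    return simpson(lambda u: 1.0 if u == 0.0 else u / math.tanh(u), 0.0, y) / y

def h_closed(y):
    """Closed form (A10)."""
    return (math.log(2.0 * math.sinh(y)) - y / 2.0
            + (math.pi**2 / 6.0 + spence_L(math.exp(-2.0 * y))) / (2.0 * y))

if __name__ == "__main__":
    for y in (0.5, 1.0, 3.0):
        print(y, h_direct(y), h_closed(y))
```

For large $y$ the Spence term is exponentially small, so `h_closed(10.0)` is very close to $y/2+\pi^2/12y$.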


If $y$ is sufficiently large that we can neglect $e^{-2y}$ compared
to unity, we get

$$ h(y) \simeq (y/2) + (\pi^2/12y). \tag{A11} $$


APPENDIX X. TABLE OF INTEGRALS


In this appendix we will simply list the integrals 
which enter this problem, reserving for Appendix Y a 
discussion of the methods used. To simplify the presen- 
tation of the integrals (which occur also in other 
problems) we introduce the following definitions: 

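For factors of the denominator we write $k^2-2p_1\cdot k
=(1)$, $k^2-2p_2\cdot k=(2)$, $k^2-2p_3\cdot k-\kappa=(\kappa)$, $k^2=(0)$. We
write for frequently occurring vectors

$$ p_3 = p_1+q_1, \qquad p_4 = p_1-q_2, $$
$$ 2p_0 = p_1+p_2, \qquad 2q_0 = q_1+q_2, $$
$$ 2Q = p_1-p_2 = q_2-q_1. $$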
TABLE III. Nonrelativistic limits of functions occurring in
Eq. (30). These expressions are valid in either the c.m. or
laboratory system with $\varphi$ the scattering angle and $\omega$ the incident
photon energy.


oT EEE D/Mlct P09 


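$G_0(\kappa) = 2(1+\kappa/4+\kappa^2/9+\cdots) - 2\ln|\kappa|\,(1+\kappa/2+\kappa^2/3+\cdots)$
$\ln|\kappa| = \ln 2\omega + \omega + O(\omega^3)$
$\ln\tau = \ln 2\omega + \omega\cos\varphi + \tfrac12\omega^2(1-\cos^2\varphi) + O(\omega^3)$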
The integrals are defined by

$$ A = 8i\int d^4k/(1)(2), \qquad B^{(1)} = 8i\int d^4k/(1)(\kappa), $$
$$ C^{(1)} = 8i\int d^4k/(1)(0), \qquad D = 8i\int d^4k/(0)(\kappa), $$
$$ F = 8i\int d^4k/(1)(2)(\kappa), \qquad G^{(1)} = 8i\int d^4k/(1)(\kappa)(0), $$
$$ H = 8i\int d^4k/(1)(2)(0), \qquad J = 8i\int d^4k/(1)(2)(\kappa)(0). $$

Integrals $B^{(2)}$, $C^{(2)}$, $G^{(2)}$ are defined as $B^{(1)}$, $C^{(1)}$, $G^{(1)}$
but with (2) replacing (1). Their values are obtained
from those of $B^{(1)}$, $C^{(1)}$, $G^{(1)}$ by replacing $p_1$ by $p_2$, $q_1$ by
$q_2$, and $Q$ by $-Q$. We use the notation, as in reference
1(b), that

$$ F_{(0;\sigma;\sigma\tau)} = 8i\int (1; k_\sigma; k_\sigma k_\tau)\,d^4k/(1)(2)(\kappa), \ \text{etc.} $$
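Let:

$$ u = Q^2 = -\sinh^2 y = \tfrac14(\kappa+\tau), $$
$$ a = -\ln\Lambda^2, $$
$$ b = 1 - y\coth y, $$
$$ c = (\kappa/(\kappa-1))\ln\kappa, $$
$$ d = 2y\,\mathrm{csch}\,2y = (1-b)/(1-u), $$

($y$ is real as $Q^2 < 0$)

$$ L(x) = \int_0^x \frac{dt}{t}\ln(1-t), $$
$$ h(y) = (1/y)\int_0^y u\coth u\,du, $$
$$ v = \kappa^2 + 4u(1-\kappa). $$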
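
The relation between $b$, $d$ and $u$, and the way $b$ arises from a Feynman-parameter integral, can be checked with a short numerical sketch. It assumes $m=1$ and $u=Q^2=-\sinh^2 y$ as above; the parameter name `s` and the function names are ours. The quantity tested is $\int_0^1 2\,ds\,\ln\{1+u[(2s-1)^2-1]\}$, which should equal $-4b$, the $y$-dependent part of $A_0$.

```python
import math

def b_of(y):
    """b = 1 - y coth y."""
    return 1.0 - y / math.tanh(y)

def d_of(y):
    """d = 2y csch 2y."""
    return 2.0 * y / math.sinh(2.0 * y)

def u_of(y):
    """u = Q^2 = -sinh^2 y (spacelike momentum transfer)."""
    return -math.sinh(y) ** 2

def log_integral(y, n=2000):
    """Simpson estimate of the integral of 2 ln(1 + u[(2s-1)^2 - 1]) over 0 <= s <= 1."""
    u = u_of(y)
    f = lambda s: 2.0 * math.log(1.0 + u * ((2.0 * s - 1.0) ** 2 - 1.0))
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    for y in (0.7, 1.5):
        print(d_of(y), (1.0 - b_of(y)) / (1.0 - u_of(y)))
        print(log_integral(y), -4.0 * b_of(y))
```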
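Integrals

$A_0 = 2a - 4b + 2$

$B_0^{(1)} = 2a + 2 = B_0^{(2)}$

$C_0^{(1)} = 2a - 2 = C_0^{(2)}$
$D_0 = 2a - 2 + 2c$
$F_0 = y^2\,\mathrm{csch}^2 y$

$G_0^{(1)} = (2/\kappa)[L(1-\kappa) - L(1)] = G_0^{(2)}$
$H_0 = 2d[-\ln\lambda + h(2y) - h(y)]$
$J_0 = (2d/\kappa)[2h(y) - h(2y) - \ln(\kappa/\lambda)]$
$A_\sigma = (2a - 4b + 3)\,p_{0\sigma}$

$B_\sigma^{(1)} = (a + \tfrac32)(p_{1\sigma} + p_{3\sigma})$

$C_\sigma^{(1)} = (a + \tfrac12)\,p_{1\sigma}$
$D_\sigma = [(\kappa/(\kappa-1))(c-1) + a + \tfrac12]\,p_{3\sigma}$
$F_\sigma = F_0\,p_{0\sigma} + [F_0 - (2b/u)]\,q_{0\sigma}$
$G_\sigma^{(1)} = [G_0 - (2c/\kappa)]\,p_{1\sigma} + (2/\kappa)[G_0 - ((2-\kappa)/\kappa)c - 2]\,q_{1\sigma}$
$H_\sigma = 2d\,p_{0\sigma}$
$vJ_\sigma = [2uF_0 - (2u-\kappa)Z - \kappa G_0]\,p_{0\sigma}$
$\qquad + [(2u-\kappa)F_0 + 2(1-u)Z - (2-\kappa)G_0]\,q_{0\sigma}$
with
$Z = H_0 + \kappa J_0 = 2d[-\ln\kappa + h(y)]$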
For=F ePor +L (Fo 20/#) poet (Fo-+ Yo 2b/ 1) qoo ]qor 


7 a Y.Q.0r+LuY ot fe+2 ]é0- 
with 


Yo= (Fo—2b—1)/2u 


3x—2¢ 1 
Gui®= Puc] Go | 


k-1 « x-1 


3 
+ (rete dubs)| “Got 
K 


x—-1 x 


2°—9x+6c¢ 6—5Kk1 

One | 
6 44218412 

+a] 36+ BG) x 


2°+9x—1217 1 x—2 
]+8-[260-34 2042] 
K 


K(x—1)k 
H.-= dpoaPor-+ (b/u)QOs-+45er(a—2b-+4) 
Fer= AeQet+Boport Vobsrt+ Sor 
where: 
€= Fo— acQe— Bofoo— Yo 30 
4uac= G, —G,@ 
0Bo= KF —(2—«)(Het «Je) + (1— x) (GO +E.) 
Vy e= (2u—«)F+2(1— py) (Hot «J o) 
—3(2—K)(G+G.) 
Sor=5er— (QeQe/u) + 4(1— x) (Poohor/?) 
—2(2—«)[ boehsr+t Prehor)/] 
+4(1— 1) (Psepsr/?) 


[see=1] 
J orp= XerQptBorPopt VorP3ptEaSrpt ErSap 
where 
€e= Fe— GerO1— BorPor— YorP 34 
A potes= Ger —Gor 
Bor =KF oy — (2—K) (Hor+eF ox) + (1—*) (Gor +Ger™) 
VY r= (2u—k)Fop— 2(1— 4) (Hort kJ er) 
—4(2—x)(Go+Gor™)- 
APPENDIX Y. ON THE INTEGRALS IN THE
CORRECTED CROSS SECTION


In this section we will describe the methods by which
the integrations occurring in the $e^4$-order matrix element
have been performed and will give some examples of
the calculations.

Those integrals which are scalars (those with no k 
in the numerator) have been done by the parametric 
method discussed in the appendix to reference 1(b). 
Those which are tensors (having one or more k’s in the
numerator) can be done either by the parametric 
method, or can be derived by an algebraic procedure 
described below from those of lower tensor order. 

We now do several examples of the integrals to 
indicate the methods employed. 


(a) The Two-Denominator Integrals


These all have the form $\chi(\Lambda^2)$ (where we need only
the case of $\Lambda$ much greater than any of the momenta
involved in the problem), with

$$ \chi = 8i\int (1; k_\sigma)(k^2-2p_1\cdot k-\Delta_1)^{-1}(k^2-2p_2\cdot k-\Delta_2)^{-1}(-\Lambda^2)(k^2-\Lambda^2)^{-1}d^4k. \tag{A12} $$

To reduce this integral and those considered below we
use the methods of reference 1(b).

In the first place we combine the denominators by
making use of the relation

$$ 1/ab = \int_0^1 dy\,[ay+b(1-y)]^{-2} \tag{A13} $$

and similar expressions for $1/ab^2$, $1/ab^3$, etc., obtainable
from (A13) by differentiation. Thus $\chi$ becomes

$$ \chi = 8i\int_0^1\!\!\int_0^1\!\!\int (1; k_\sigma)\,dy\,2z\,dz\,(-\Lambda^2)\,d^4k\,[k^2-2zp_y\cdot k-z\Delta_y-(1-z)\Lambda^2]^{-3} \tag{A14} $$

with $p_y=p_1y+p_2(1-y)$; $\Delta_y=\Delta_1y+\Delta_2(1-y)$. Using
(12a) of reference 1(b), we get

$$ \chi = \int_0^1\!\!\int_0^1 2z\,dz\,dy\,(-\Lambda^2)(1; zp_{y\sigma})[z^2p_y^2+z\Delta_y+(1-z)\Lambda^2]^{-1}. \tag{A15} $$
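
The denominator-combining relation (A13), and the companion formula for $1/ab^2$ obtained from it by differentiating with respect to $b$, are easy to confirm numerically. The sketch below uses Simpson quadrature with illustrative function names and arbitrary positive test values for $a$ and $b$.

```python
def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of panels."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0

def one_over_ab(a, b):
    """Right-hand side of (A13): integral of [a y + b (1 - y)]^-2 over 0 <= y <= 1."""
    return simpson(lambda y: (a * y + b * (1.0 - y)) ** -2, 0.0, 1.0)

def one_over_ab2(a, b):
    """(A13) differentiated with respect to b: integral of 2 (1 - y) [a y + b (1 - y)]^-3."""
    return simpson(lambda y: 2.0 * (1.0 - y) * (a * y + b * (1.0 - y)) ** -3, 0.0, 1.0)

if __name__ == "__main__":
    a, b = 2.0, 5.0
    print(one_over_ab(a, b), 1.0 / (a * b))
    print(one_over_ab2(a, b), 1.0 / (a * b * b))
```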


To facilitate the work we observe that in the limit of 
very large A? (with 61) 


f (1~2z)"dz(— A?) 
0 P(z)+A2(1—2z) 
1 (1—2)"dz(— A’) 
1s P(1)-+ A*(1—2z) 
S-1/n, n>0 


In[P(1)/A2], n=0. (A16) 


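Writing

$$ z(1; zp_{y\sigma}) = \left(1-(1-z);\ [(1-z)^2-2(1-z)+1]\,p_{y\sigma}\right), $$

(A15) becomes

$$ \chi = \int_0^1 2dy\left\{(1; p_{y\sigma})\ln\frac{p_y^2+\Delta_y}{\Lambda^2} + (1; \tfrac32 p_{y\sigma})\right\}. \tag{A17} $$

The second term gives $(2; \tfrac32(p_{1\sigma}+p_{2\sigma}))$. In the first term

$$ \chi_1 = \int_0^1 2dy\,(1; p_{y\sigma})\ln[(p_y^2+\Delta_y)/\Lambda^2], \tag{A18} $$

notice that with $2Q=(p_1-p_2)$,

$$ p_y^2 = p_2^2 + 4Q^2y^2 + 2p_2\cdot(p_1-p_2)\,y, $$
$$ \Delta_1 = m^2-p_1^2, \qquad \Delta_2 = m^2-p_2^2 $$

so that

$$ p_y^2+\Delta_y = m^2+4Q^2(y^2-y) = m^2+Q^2[(2y-1)^2-1]. $$

Now let $m=1$, $Q^2=\sin^2\theta$, and $2y-1=\tan\alpha/\tan\theta$,
so that $dy=(\sec^2\alpha/2\tan\theta)\,d\alpha$. We get

$$ p_y^2+\Delta_y = \cos^2\theta\sec^2\alpha $$

so that

$$ \chi_1 = \int_{-\theta}^{\theta}(\sec^2\alpha/\tan\theta)\,d\alpha\,(1; p_{y\sigma}(\alpha))\ln(\cos^2\theta\sec^2\alpha/\Lambda^2). \tag{A19} $$

This integral can be done easily. For example,

$$ \int_{-\theta}^{\theta}\sec^2\alpha\,d\alpha\,\ln(\sec^2\alpha) = \int_{-\tan\theta}^{\tan\theta}dx\,\ln(1+x^2) = 4\tan\theta[\ln(\sec\theta)-1]+4\theta, $$

the last integral being performed by parts. In this
manner we obtain integrals A, B, C, D.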
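
The example integral quoted above can be verified directly. This is a small sketch with Simpson quadrature and an arbitrary test angle; the function names are ours.

```python
import math

def sec2_log_integral(theta, n=2000):
    """Simpson estimate of the integral of sec^2(a) ln(sec^2(a)) over -theta <= a <= theta."""
    f = lambda a: -math.log(math.cos(a) ** 2) / math.cos(a) ** 2
    h = 2.0 * theta / n
    total = f(-theta) + f(theta)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(-theta + i * h)
    return total * h / 3.0

def closed_form(theta):
    """4 tan(theta) [ln(sec(theta)) - 1] + 4 theta."""
    return 4.0 * math.tan(theta) * (-math.log(math.cos(theta)) - 1.0) + 4.0 * theta

if __name__ == "__main__":
    theta = 0.6
    print(sec2_log_integral(theta), closed_form(theta))
```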


(b) The Integral $G_0^{(1)}$


As an example of the three denominator integrals we
integrate $G_0^{(1)}$. The parameterization method gives

$$ G_0^{(1)} = 8i\int d^4k\,(k^2-2p_1\cdot k)^{-1}(k^2-2p_3\cdot k-\kappa)^{-1}(k^2)^{-1} = 8i\int_0^1\!\!\int_0^1\!\!\int d^4k\,2dy\,x\,dx\,(k^2-2p_x\cdot k-\Delta_x)^{-3} \tag{A20} $$

with

$$ p_y=(1-y)p_1+yp_3=p_1+yq_1, \qquad p_x=xp_y, \qquad \Delta_x=xy\kappa. $$

Using (12a) of reference 1(b) again,

$$ G_0^{(1)} = \int_0^1\!\!\int_0^1 2dx\,dy\,(xp_y^2+y\kappa)^{-1}. \tag{A21} $$

Since

$$ \int_0^1 dx\,(ax+b)^{-1} = a^{-1}\ln[(b+a)/b], $$

and since $p_y^2=1-\kappa y$,

$$ G_0^{(1)} = -\int_0^1 2dy\,(1-\kappa y)^{-1}\ln\kappa y = -\frac{2}{\kappa}\int_{1-\kappa}^{1}\frac{dv}{v}\ln(1-v). \tag{A22} $$

In the last step we have let $v=1-\kappa y$. We obtain
finally:

$$ G_0^{(1)} = G_0^{(2)} = G_0 = (2/\kappa)[L(1-\kappa)-L(1)]. \tag{A23} $$
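
The final form of $G_0$ can be compared numerically with the one-dimensional integral $-\int_0^1 2dy\,(1-\kappa y)^{-1}\ln\kappa y$ that remains after the $x$ integration. The sketch below makes two simplifying assumptions of our own: it takes a test value $0 < \kappa < 1$ so that every quantity is real (the physical $\kappa$ is negative and carries the imaginary part discussed earlier), and it substitutes $y=t^2$ so that a midpoint rule copes with the logarithmic endpoint singularity.

```python
import math

def L_series(x, terms=80):
    """L(x) = -(x + x^2/4 + x^3/9 + ...), adequate for 0 <= x <= 0.9."""
    return -sum(x**k / k**2 for k in range(1, terms + 1))

def G0_closed(kappa):
    """(2/kappa) [L(1 - kappa) - L(1)], using L(1) = -pi^2/6."""
    return (2.0 / kappa) * (L_series(1.0 - kappa) + math.pi**2 / 6.0)

def G0_numeric(kappa, n=20000):
    """Midpoint estimate of -integral of 2 ln(kappa y)/(1 - kappa y) dy, with y = t^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        y = t * t
        total += -2.0 * math.log(kappa * y) / (1.0 - kappa * y) * 2.0 * t
    return total * h

if __name__ == "__main__":
    print(G0_numeric(0.5), G0_closed(0.5))
```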


(c) The Integrals H 


These are the same as those done in the radiationless 
scattering problem. They are given in reference 1(b), 
appendix. Equations (23a), (24a), and (25a) should 
have the signs of their left-hand sides changed. 


(d) The Integral $J_0$

$$ J_0 = 8i\int d^4k\,(k^2-2p_1\cdot k)^{-1}(k^2-2p_3\cdot k-\kappa)^{-1}(k^2-2p_2\cdot k)^{-1}(k^2-\lambda^2)^{-1}. \tag{A24} $$

We have given the photon $k$ a small mass $\lambda$ as an
infrared cutoff. When this integral is parameterized it
becomes

$$ J_0 = 8i\int_0^1\!\!\int_0^1\!\!\int_0^1\!\!\int 6\,dx\,dy\,dz\,(1-x)z^2\,d^4k\,[k^2-2zp_x\cdot k-xz\kappa-\lambda^2(1-z)]^{-4}. \tag{A25} $$

By (13a) of reference 1(b),

$$ 8i\int d^4k\,(k^2-2p\cdot k-\Delta)^{-4} = -\tfrac13(p^2+\Delta)^{-2} $$

so that

$$ J_0 = -\int_0^1\!\!\int_0^1\!\!\int_0^1 2\,dx\,dy\,dz\,(1-x)z^2[z^2p_x^2+zx\kappa+\lambda^2(1-z)]^{-2} \tag{A26} $$

with $p_x=(1-x)p_y+xp_3$, $p_y=yp_1+(1-y)p_2$.
We break the $x$ integration into two regions ($\epsilon \ll 1$):

$$ \int_0^1dx\int_0^1dy\int_0^1dz = \underbrace{\int_0^1dy\int_\epsilon^1dx\int_0^1dz}_{\text{(I)}} + \underbrace{\int_0^1dy\int_0^\epsilon dx\int_0^1dz}_{\text{(II)}}. $$
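In (I) we let $\lambda\to0$, getting

$$ (\mathrm{I}) = -\int_0^1dy\int_\epsilon^1dx\int_0^1dz\,2(1-x)[zp_x^2+x\kappa]^{-2} = -\int\!\!\int\frac{2(1-x)\,dx\,dy}{x\kappa(p_x^2+x\kappa)}. \tag{A27} $$

In region (II), since $x$ is small, we neglect $x^2$ compared
to $x$:

$$ (\mathrm{II}) = -\int_0^1dy\int_0^1dz\int_0^\epsilon 2z^2dx\,[z^2p_y^2+2xz^2a+zx\kappa+\lambda^2(1-z)]^{-2} $$
$$ \qquad = -2\int_0^1dy\int_0^1dz\,\epsilon z^2[z^2p_y^2+\lambda^2(1-z)]^{-1}[z^2p_y^2+2\epsilon z^2a+z\kappa\epsilon+\lambda^2(1-z)]^{-1} \tag{A28} $$

with $a=p_y\cdot(p_3-p_y)$.

In (II) we now break the $z$ integration into two
regions, $0 < z < z_1$ and $z_1 < z < 1$, such that $\lambda^2 \ll z_1^2p_y^2
\ll z_1\kappa\epsilon$. Thus for $z < z_1$ we neglect $z$ relative to unity
and $z^2p_y^2$ relative to $z\kappa\epsilon$. For $z > z_1$ we neglect $\lambda$. There
results: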


f 2edzdy 2 Ke 
te Pyelz(py + 2ea)+xe] xp,? apy 


t 2ez*dzdy 
0 ('py?-+)’) (axe+’) 


ie 2edz (——* Ne ) 
dy (p+ 2e)NetpPr® next)? 


1 22D, 
=— In (neglecting terms of order A). 
Kpy ‘ 
Adding these together we get 
i dy é 
(I)= | —In (A29) 
0 Kpy = p,? 

We can now use the same substitution for y as that 
leading to (A8). Notice that in this case Ay=0. We get 
dy/ Kp, da/sin20, p,?=cos’@/cos’a so that (II) be- 
comes: 


a= f et [n+ n(co8a) 


—9sin26L > cosé 


46 Ke § Ade 
= In———+ f 
sin2@ dA cosé o sin2é 


In(cosa). (A30) 


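We have still to finish evaluating (I), Eq. (A27).
First investigate the denominator $p_x^2+x\kappa$. Since
$p_3^2=1-\kappa$ and $2p_3\cdot p_y=2-\kappa$,

$$ p_x^2=(1-x)^2p_y^2+x^2(1-\kappa)+x(1-x)(2-\kappa), $$
$$ p_x^2+x\kappa=(1-x)^2(p_y^2-1)+1. $$

Letting $1-x=\sin\varphi/\sin\theta$, $2y-1=\tan\alpha/\tan\varphi$ with $Q^2
=\sin^2\theta$, we get

$$ p_y^2-1=\sin^2\theta[(\tan^2\alpha/\tan^2\varphi)-1], $$
$$ p_x^2+x\kappa=\sin^2\varphi\,\mathrm{ctn}^2\varphi(\sec^2\alpha-\sec^2\varphi)+1=\cos^2\varphi\sec^2\alpha, $$
$$ -dx=\cos\varphi\,d\varphi/\sin\theta, \qquad dy=\sec^2\alpha\,d\alpha/2\tan\varphi. $$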
The integrand of (I) becomes 


1 sin@ (=*\(= \(=)(==*) 
aaa) sind 7 \cos’ sind tang 


1 doda 


«x sing siné—sing 
and 


6—« tané dd ¢ da 
w= f oe i aa ee eae 
0 x /_, sin6(sind— sing) 


6—« tan 2 
0 


« sind(sind— sing) 


Integrate (A31) by parts, using No. 436, Dwight’s 
Tables of Integrals, to get 


46 e tan@ 
re 
«x sin26 2 cos@ 
§ Ado sink(6—@ 
+f in( . ") (A32) 
9 sin2é cos3(8+¢) 


which can be put in the form 


()=- 


In(cose) 


§ Ada 
+f In(tana) 
0 sin26 


46 e sind § Adda 
(=~ In -f - 
o sin2@ 


xsin2@ cos?@ 


and can now be added to (A19). 
This gives the result, 


—49 
Jo= [m 
x sin26 


8 
+04 f dv intano) (A33) 
d\ tané a 


Another integration by parts and the substitution
$\theta=iy$ gives the result in the table:

$$ J_0=(2d/\kappa)[2h(y)-h(2y)-\ln(\kappa/\lambda)]. \tag{A34} $$
(e) The Integrals $J_\sigma$, $J_{\sigma\tau}$, $J_{\sigma\tau\rho}$


From the preceding work on $J_0$ it may be supposed
that to attempt these more complicated integrals by
the parametric method would involve great labor.
Fortunately there is a way to reduce these integrals to
a combination of integrals of a lower tensor order, and
those of a smaller number of denominators. We will do
$J_\sigma$ as an example but it should be clear from this how
$J_{\sigma\tau}$ and $J_{\sigma\tau\rho}$ are done. This method can, of course, be
applied also to $G_\sigma$, $G_{\sigma\tau}$, etc. We shall be able to express
$J_\sigma$ in terms of $J_0$ and the integrals F, G having only
three denominators.

Using the notation indicated in Appendix X for the
denominators, $(1)=k^2-2p_1\cdot k$, etc., we write

$$ J_\sigma = 8i\int\frac{k_\sigma\,d^4k}{(1)(2)(\kappa)(0)} = \alpha p_{1\sigma}+\beta p_{2\sigma}+\gamma p_{3\sigma} \tag{A35} $$

$\alpha$, $\beta$, $\gamma$ being scalar functions of $p_1$, $p_2$, and $p_3$. The
vectors $p_1$, $p_2$, $p_3$ will in general define a three-space.
It is clear that the vector $J_\sigma$ cannot have a component
in the direction P which is perpendicular to this three-space,
since for $k_\sigma$ in the P direction, the integrand is
an odd function and therefore $J_\sigma$ must vanish.

If we now take the scalar products of $J_\sigma$ with $p_1$, $p_2$, $p_3$,
since $2p_1\cdot k=(0)-(1)$, etc., we get

$$ 2p_1\cdot J = 8i\int\frac{2p_1\cdot k\,d^4k}{(1)(2)(\kappa)(0)} = 8i\int\frac{d^4k}{(1)(2)(\kappa)} - 8i\int\frac{d^4k}{(2)(\kappa)(0)} = F_0-G_0, $$
$$ 2p_2\cdot J = 8i\int\frac{2p_2\cdot k\,d^4k}{(1)(2)(\kappa)(0)} = 8i\int\frac{d^4k}{(1)(2)(\kappa)} - 8i\int\frac{d^4k}{(1)(\kappa)(0)} = F_0-G_0, \tag{A36} $$
$$ 2p_3\cdot J = 8i\int\frac{2p_3\cdot k\,d^4k}{(1)(2)(\kappa)(0)} = 8i\int\frac{d^4k}{(1)(2)(\kappa)} - 8i\int\frac{d^4k}{(1)(2)(0)} - 8i\int\frac{\kappa\,d^4k}{(1)(2)(\kappa)(0)} = F_0-H_0-\kappa J_0. $$

Taking also the scalar products with the right-hand
side of (A35), we get the set of linear equations:

$$ F_0-G_0 = 2\alpha+2\beta(1-\tfrac12\kappa-\tfrac12\tau)+\gamma(2-\kappa) $$
$$ F_0-G_0 = 2\alpha(1-\tfrac12\kappa-\tfrac12\tau)+2\beta+\gamma(2-\kappa) \tag{A37} $$
$$ F_0-H_0-\kappa J_0 = (\alpha+\beta)(2-\kappa)+2\gamma(1-\kappa). $$

$J_\sigma$ can now be readily obtained by solving these
equations for $\alpha$, $\beta$, and $\gamma$.
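
The algebraic reduction can be tried out numerically. The sketch below solves the three linear equations for $\alpha$, $\beta$, $\gamma$, with the coefficient matrix taken as $2p_i\cdot p_j$ for $m=1$ (so that $2p_1\cdot p_2 = 2-\kappa-\tau$), and compares the result with the tabulated form of $vJ_\sigma$ in Appendix X, using $p_1+p_2=2p_0$ and $p_3=p_0+q_0$. The input numbers are arbitrary test values, not evaluated integrals, and the function names are ours.

```python
def solve3(M, r):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    A = [row[:] + [ri] for row, ri in zip(M, r)]
    n = 3
    for c in range(n):
        piv = max(range(c, n), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(c + 1, n):
            f = A[k][c] / A[c][c]
            for j in range(c, n + 1):
                A[k][j] -= f * A[c][j]
    x = [0.0] * n
    for c in range(n - 1, -1, -1):
        s = A[c][n] - sum(A[c][j] * x[j] for j in range(c + 1, n))
        x[c] = s / A[c][c]
    return x

def j_sigma_from_system(F0, G0, Z, kappa, tau):
    """Coefficients of p0 and q0 in J_sigma from the linear system; Z = H0 + kappa*J0."""
    p12 = 2.0 - kappa - tau                      # 2 p1.p2 for m = 1
    M = [[2.0, p12, 2.0 - kappa],
         [p12, 2.0, 2.0 - kappa],
         [2.0 - kappa, 2.0 - kappa, 2.0 * (1.0 - kappa)]]
    alpha, beta, gamma = solve3(M, [F0 - G0, F0 - G0, F0 - Z])
    return alpha + beta + gamma, gamma

def j_sigma_from_table(F0, G0, Z, kappa, tau):
    """Same coefficients from the tabulated expression for v*J_sigma."""
    u = 0.25 * (kappa + tau)
    v = kappa**2 + 4.0 * u * (1.0 - kappa)
    cp = (2.0 * u * F0 - (2.0 * u - kappa) * Z - kappa * G0) / v
    cq = ((2.0 * u - kappa) * F0 + 2.0 * (1.0 - u) * Z - (2.0 - kappa) * G0) / v
    return cp, cq

if __name__ == "__main__":
    args = (0.8, 1.3, -0.4, -0.7, 0.3)   # arbitrary test values
    print(j_sigma_from_system(*args))
    print(j_sigma_from_table(*args))
```

The two printed pairs agree, which also confirms that $\alpha=\beta$ by the symmetry of the system.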

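To obtain $J_{\sigma\tau}$ we write

$$ J_{\sigma\tau} = 8i\int\frac{k_\sigma k_\tau\,d^4k}{(1)(2)(\kappa)(0)} = \alpha_\sigma p_{1\tau}+\beta_\sigma p_{2\tau}+\gamma_\sigma p_{3\tau}+\epsilon\delta_{\sigma\tau}, $$

$\alpha_\sigma$, $\beta_\sigma$, $\gamma_\sigma$ being vector functions of $p_1$, $p_2$, $p_3$, and $\epsilon$ a
scalar function of the same variables. The tensor $\epsilon\delta_{\sigma\tau}$
now occurs on the right-hand side as it is possible for
$J_{\sigma\tau}$ to have nonzero components depending on P. (If
$k_\sigma=k_\tau=k_P$, $J_{\sigma\tau}$ need not vanish.) If we take inner
products with $p_1$, $p_2$, and $p_3$ now we get

$$ F_\sigma-G_\sigma^{(2)} = 2\alpha_\sigma+2\beta_\sigma(1-\tfrac12\kappa-\tfrac12\tau)+\gamma_\sigma(2-\kappa)+2\epsilon p_{1\sigma} $$

for $2p_{1\tau}J_{\sigma\tau}$, and similar equations for $p_2$ and $p_3$. This
gives us three equations for the four quantities $\alpha_\sigma$, $\beta_\sigma$,
$\gamma_\sigma$, $\epsilon$. However, there is the additional independent
result obtained by summing $J_{\sigma\sigma}$ over $\sigma$:

$$ J_{\sigma\sigma} = 8i\int\frac{k_\sigma k_\sigma\,d^4k}{(1)(2)(\kappa)(0)} = F_0 = \alpha_\sigma p_{1\sigma}+\beta_\sigma p_{2\sigma}+\gamma_\sigma p_{3\sigma}+4\epsilon. $$

Solving these four equations we obtain $J_{\sigma\tau}$ algebraically
in terms of simpler integrals. In a similar manner
$J_{\sigma\tau\rho}$ can be expressed in the form given in Appendix X.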


(A38) 


(A39) 


APPENDIX Z. EXAMPLES OF THE CALCULATIONS 


We shall here illustrate by two examples the method 
of evaluation of the transition amplitude. 

The matrix T (18) can be written, if we rationalize 
the denominator, as (~’=m’, q?=0) 


T= f (k—2p-k~2q-k—x)“1(k—2p- kb) 


X@RR~C(R?)T (A40) 

with 
T= y,(p+q—k+m)e(p—k+m)y,.  (A41) 
$\mathcal{T}$ can now be split into terms involving no $k$, one $k$,
and two $k$’s, and the results of the integrations over
k-space inserted from Appendix Y. Thus,


T=yu(p+9q4+m)e(p+m)y.— yuke(p+m) yu 
—yulb+g+m)eky,+yukeky, 
= 2p(p+ q+m)e—2pke+ 2ke(p+¢) 
—4m/(e-k)—2kek, 
where we have used Eq. (4a) of reference 1(b) and the 


fact that the matrix T operates from the left on a state 
u such that pu=mu. Inserting the integrals and 
grouping terms we now have,


8iT =2p(p+¢q+m)eGo—2(pyse—yoe(b+ q)+2me.)G, 
—2ye@y:Ger. (A42) 


This expression can now be further simplified. For 
example, the term in G, is (using Appendix X) 


—2[Go~ (2c/x) [_pe— pe(p+q)+2m(e-p)] 
— (4/)[Go— (2—x/x)c—2 _pge—ge(p+4q) ] 


with x=—2p-q. Since p?=m’, g?=0, e-qg=0, and quite 
generally ab-+ ba=2a-, this term becomes finally 


—2[Go— (2¢/x) ](2m*e— peg) 
+ (4/x)[Go— (2—«/x)c—2]-[2(e-p)q-+ xe]. 


Combining this with the terms in Go and G,, in (A42) 
(expanded in the same manner) we obtain the expression 
for T given in the text (19). 

To illustrate the simplification that occurs upon 
taking the spur, for unpolarized light, consider the 
term J (21). This may be decomposed, as was T above, 
into a sum of terms involving various numbers of k’s 
in the numerator. For example, the term involving 
three k’s is


= f yukeoke ry d'k/(1)(2)(x)(0) —(A43) 


using the notation of Appendix X. If we replace e;, by 
Ya and @2 by yg, this gives a contribution to the spur 
P(x, 7), Eq. (28), of a numerical factor times 


if e()d4/ (1)(2)(x) (0) (A44) 


with 


g(k)=Sp[(pot+m)yukypkyakyu(prt+m)W | 
= «7 {4(p1-k) (po k) (bs) 
+hL3m*« bit po): k— (br: prt 2m’) (pak) J} 
+1 *{4(pr-k)(b2-k) (pak) 
— mk pit prt pa): k}. 
It will now be seen that the factor $k^2$ will cancel the
factor (0) in the denominator of (A44) leading to the
integral $F_\sigma$ (Appendix X). Also, we can write

$$ 2p_1\cdot k = -(k^2-2p_1\cdot k)+k^2 = -(1)+(0) $$

leading to integrals $G_{\sigma\tau}^{(2)}$ and $F_{\sigma\tau}$. Thus it is not
necessary, for unpolarized light, to use integral $J_{\sigma\tau\rho}$.
(A44) can now be written


2(x7PoePaet Tt Dope) (Per— Gor”) 


+ [4m (pret pre)— "(pi Pot 2m) pre 
—m 1 (Diet prot pre) Fo 


(A45) 


(A46) 
THE PRESENT STATUS
OF QUANTUM ELECTRODYNAMICS 


by Richard P. FEYNMAN 


California Institute of Technology, Pasadena, California. 


Fifty years ago at this Conference one of the problems most 
energetically discussed was the apparent quantum nature of the 
interaction of light and matter. It is a privilege to be able, after 
half a century, to give a report on the progress that has been made 
in its solution. No problem can be solved without it dragging in 
its wake new problems to be solved. But the incompleteness of 
our present view of quantum electrodynamics, although presenting 
us with the most interesting challenges, should not blind us to the 
enormous progress that has been made. With the exception of 
gravitation and radioactivity, all of the phenomena known to physi- 
cists and chemists in 1911 have their ultimate explanation in the 
laws of quantum electrodynamics. 


Strictly speaking, “quantum electrodynamics” might be expected
to deal only with the quantum theory of the electromagnetic field,
and not with the theory of the motion of the matter which generates
it or reacts to it. But conventionally, the motion of that matter
whose motion is understood, namely electrons and possibly muons,
is included, while the motion of baryons and mesons is not. I will
use the term in the conventional sense here. If I wish to refer to
the narrower field I will call it simply the “quantum theory of the
electromagnetic field”.


Lorentz (1) showed in his 1911 report at this conference that
beside an instantaneous coulomb interaction the electromagnetic
field could be represented as a set of harmonic oscillators, each
driven by the transverse component of the current produced by
matter in the corresponding mode. That the quantum theory of
electromagnetic interaction results directly from the simple
assumption that these oscillators are quantum oscillators obeying a
Schrödinger equation was noted by Dirac (2). Since that time a
bewildering variety of mathematically equivalent formulations of
that idea have been made. These are published (3) in many articles
and text books and I will assume that you are familiar with some
of them, and will not discuss them further. Here I shall simply
report first on the comparison of quantum electrodynamic
calculations with experiment, and second on some of the unanswered
theoretical questions in this field.


COMPARISON TO EXPERIMENT 


General remarks. 


Considerable evidence for the general validity of Q.E.D. is, of 
course, provided by the enormous variety of ordinary phenomena 
which, under rough calculation, are seen to be consistent with it. 
The superfluidity of helium and the superconductivity of metals
having recently been explained, there are to my knowledge no
phenomena occurring under known conditions, where quantum
electrodynamics should provide an explanation, and where at least
a qualitative explanation in these terms has not been found. The
search for discrepancies has turned from looking for gross deviations
in complex situations to looking either for large discrepancies at
very high energies, or for tiny deviations from the theory
in very simple, but very accurately measured situations.


High energy experiments. 


The experiments at high energy which are most significant for 
us here are those of the elastic scattering of energetic electrons 
(up to 1 Gev) by protons at appreciable angles (4). The scattering
is very different from what it would be for an unstructured proton. 
The proton should have some structure, however, as a result of the 
unknown strong interactions between mesons and baryons. One 
usually interprets all the deviation as due to this structure. On the 
other hand some of it may be a failure of quantum electrodynamics. 
According to quantum electrodynamics the scattering amplitude
should be

$$ (\bar u_2\gamma_\mu u_1)\,J_\mu\,\frac{1}{q^2} \tag{1} $$

where $q$ is the momentum transferred by the virtual photon, $u_1$ and
$u_2$ are electron spinors in and out and $J_\mu$ is the matrix element of
electric current between the nucleon states of four momentum $P_1$
and $P_2$. From relativistic invariance arguments $J_\mu$ must have the
form

$$ \gamma_\mu F_1(q^2) + \sigma_{\mu\nu}q_\nu F_2(q^2) \tag{2} $$

taken between proton spinors, where $F_1$ and $F_2$ are unknown
functions of $q^2$. For a point particle $F_2 = 0$, $F_1 = 1$.


It is difficult to say what would happen if electrodynamics failed,
as long as the exact manner of a supposed failure is not specified.
A conventional way to assume the failure is to suppose that the
propagator is altered from $1/q^2$ to

$$ 1/[q^2(1-q^2/\Lambda^2)] \tag{3} $$

and to tell how large $\Lambda$ would have to be for such a modification
to remain undetected in a given experiment.


In the proton scattering the effective $F_1(q^2)$ is found to fall, for
smaller $q^2$, as $1+q^2/(560\ \mathrm{Mev})^2$ ($q^2$ is negative). If $F_1$ did not fall
at all, but the altered propagator were responsible, $\Lambda$ would be
560 Mev. If $\Lambda$ is much less than this the proton would look much
softer and more extended than it does. We can probably safely conclude
from this experiment that $\Lambda$ exceeds 500 Mev.
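
To see why a fall-off $1+q^2/(560\ \mathrm{Mev})^2$ translates into $\Lambda = 560$ Mev, note that the modified propagator (3) differs from $1/q^2$ by the factor $1/(1-q^2/\Lambda^2)$, which for small spacelike $q^2$ is $1+q^2/\Lambda^2$ to first order. A minimal numerical sketch follows; the function names and the sample momentum transfer are illustrative choices of ours.

```python
def propagator_factor(q2, cutoff):
    """Ratio of the modified propagator of Eq. (3) to 1/q^2 (q2 in Mev^2, cutoff in Mev)."""
    return 1.0 / (1.0 - q2 / cutoff**2)

def linear_fall_off(q2, scale=560.0):
    """The observed small-q^2 behaviour 1 + q^2/(560 Mev)^2."""
    return 1.0 + q2 / scale**2

if __name__ == "__main__":
    q2 = -(100.0**2)  # spacelike momentum transfer of 100 Mev, so q^2 < 0
    print(propagator_factor(q2, 560.0), linear_fall_off(q2))
```

The two numbers differ only at second order in $q^2/\Lambda^2$, which is why the two interpretations cannot be separated by the small-$q^2$ data alone.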


According to (1) the scattering for different angles and energies,
provided they correspond to the same $q$, should all be related via
just two unknown numbers $F_1$, $F_2$. This will fail to be exactly so
because of corrections, probably not large, due to the exchange
of two or more photons. These corrections may be computed,
although with some uncertainty due to proton structure. If a large
deviation still persists it would mean that Q.E.D. fails in a very
peculiar way — for example, that another virtual object of higher
spin is exchanged, or that there is a new coupling of proton and
electron so that they may combine to form a neutral heavy particle
which disintegrates back again to proton and electron, etc.
So far most physicists believe that the general behavior of the
functions $F_1$ and $F_2$ is understandable as a proton structure effect
and therefore that Q.E.D. can be trusted to perhaps at least as high
as 1 Gev.


Clearly these uncertainties of proton structure would not arise
if electrons or positrons were scattered from electrons. For example,
there are measurements to 5% accuracy of the cross section for
annihilation of positrons in flight (5) up to laboratory energies of
nearly 10 Gev. In the center of gravity system, however, the positron
momentum is only 50 Mev and the virtual state momenta of
importance are much lower still. The experiments agree with theory
but this does not put a very great lower limit on $\Lambda$. The same
comment applies to $\mu^-$, $e^-$ collisions measured (6) up to 8 Gev,
also in agreement with theory.


On the other hand, these experiments must not be treated too 
lightly. They are only uninteresting if they agree with theory. 
Although they do not yet involve nearly as high virtual energies 
as the proton scattering experiment, they test different things, 
such as the electron propagator, or the muon structure. 


The types of experiments at high energy which would more 
effectively test the predictions of Q.E.D. are discussed by J. Bjorken 
and S. Drell, P. R., 114, 1368 (1959). 


Energy levels in hydrogen. 


Turning now to precision low energy experiments, the classic
experiment (7) is the direct measure of the $2s_{1/2}$ and $2p_{1/2}$ energy
separation in hydrogen, deuterium and ionized helium. It was the
analysis of this experiment by Weisskopf and by Bethe which led
them to discover a way to circumvent the divergent self-energy
which, up to then, had bedeviled any attempt to compute higher
order effects from Q.E.D. The need to put their ideas into a
relativistically invariant form led to the formulations of Schwinger and
of the author. The Lamb effect still remains one of the most delicate
tests of Q.E.D. A comparison (8) of theory and experiment is given
in Table I (after Petermann (9)).


Contributions to the Lamb shift arise from several sources : 
(a) Virtual emission and reabsorption of one virtual photon.


Here in the initial state an electron is in a definite state ($2s_{1/2}$
or $2p_{1/2}$) in the nuclear Coulomb potential. In the intermediate
state it is in some other exact state of the Coulomb potential. The
wave function for these states should be found by the Dirac
equation. The labor in doing this has been too great, so far, even for
computing machines, because for each intermediate state $n$, matrix
elements of the current times $\exp(i\mathbf{K}\cdot\mathbf{X})$ must be found for every
$\mathbf{K}$ and a double integral on $\mathbf{K}$ and $n$ is involved. However for low
energy photons the dipole approximation and Schrödinger wave
functions can be used and the sums performed. This determines (10)
the constants $K_0(2,0)$ and $K_0(2,1)$. For higher intermediate energy
it is usual to make an expansion of the intermediate wave functions
as plane waves, perturbed by the Coulomb potential (hence an
expansion in orders of $Z\alpha$). Combining the first term with the low
energy contribution gives the largest part of the Lamb shift, items
1 plus 2. Item 2 is separated here, for it is easy to understand as
the correction to fine structure due to the apparent anomalous
moment of the electron.


The correction to include two potential scatterings is included (11)
in item 4. To include three scatterings is very difficult, but Layzer (12)
has shown that it is a quadratic form in $\ln(Z\alpha)$ and has computed
everything but the constant term. This makes an uncertainty in
the total effect from one virtual photon. It is unlikely that the part
marked “?” exceeds ± 10, so we may take this as a kind of limit
of error. Terms of higher order in the potential are probably too
small to be significant.


(b) Emission and reabsorption of two virtual photons. 


This is of order one higher in $\alpha$, and no large logarithm from
low energy photons arises, so the effect is very small indeed.
Although the magnetic moment part, item 7, has been worked out
exactly (13), the potential spreading effect, item 6, has only been
partially evaluated, in such a way that limits of error can be
given (14). This work should be completed because the uncertainty
here is the largest contributor to the theoretical uncertainty in the
Lamb shift.
TABLE I. (After Petermann (9).)

Lamb shift: $2S_{1/2}-2P_{1/2}$ ($1/\alpha = 137.0389$)

No.  Order                      Term             Formula, units $Z^4L(\mu/m)^3$                  H                D                He+
1.   $\alpha(Z\alpha)^4$                         …                                        1009.85          1010.64          13168.4
2.   $\alpha(Z\alpha)^4$        Mag Mom          …                                          67.71            67.77           1084.7
3.   $\alpha(Z\alpha)^4$        Vac Pol          $-1/5$                                    -27.08           -27.11           -433.9
4.   $\alpha(\alpha Z)^5$       2nd order V      $3\pi Z\alpha(1+11/128-\tfrac12\ln2+5/192)$  7.14             7.13            228.4
5.   $\alpha(\alpha Z)^6$       3rd order V      …                                         -.25 ± .07       -.25 ± .07       -9.5 ± 4.
6.   $\alpha^2(\alpha Z)^4$     4th order Rad    …                                          .24 ± .13        .24 ± .13        3.9 ± 2.
7.   $\alpha^2(\alpha Z)^4$     Mag Mom          …                                         -.11             -.11             -1.7
8.   $\alpha^2(\alpha Z)^4$     Vac Pol          …                                         -.24             -.24             -3.8
9.   $\alpha(\alpha Z)^4\,m/M$  Finite Mass      …                                          .36              .18              2.5
10.                             Finite size      …                                          .12 ± .02        .73 ± .02        7.1 ± 1.
     Total Mc                                                                            1057.74 ± .22    1058.98 ± .22    14046.1 ± 7.
     Exp. Mc                                                                             1057.77 ± .10    1059.00 ± .10    14040.2 ± 4.5

$L = \alpha^3\,\mathrm{Ryd}_\infty\,c/3\pi = 135.6353$ Mc, $\mu/m = (1+m/M)^{-1}$,
$m$ = mass of electron, $M$ = mass of nucleus.
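
The entries of Table I can be checked for internal consistency: each column of contributions should add up to the quoted total, and the quoted limit of error of the total equals the plain sum of the individual limits (not their quadrature sum). The sketch below transcribes the numbers as read from the table; the signs of items 3, 5, 7 and 8 and the He+ value of item 10 are our reading of a damaged original, so the check is of that reading.

```python
# Contributions in Mc as (value, limit of error), items 1 to 10 in order.
TABLE = {
    "H":   [(1009.85, 0), (67.71, 0), (-27.08, 0), (7.14, 0), (-0.25, 0.07),
            (0.24, 0.13), (-0.11, 0), (-0.24, 0), (0.36, 0), (0.12, 0.02)],
    "D":   [(1010.64, 0), (67.77, 0), (-27.11, 0), (7.13, 0), (-0.25, 0.07),
            (0.24, 0.13), (-0.11, 0), (-0.24, 0), (0.18, 0), (0.73, 0.02)],
    "He+": [(13168.4, 0), (1084.7, 0), (-433.9, 0), (228.4, 0), (-9.5, 4.0),
            (3.9, 2.0), (-1.7, 0), (-3.8, 0), (2.5, 0), (7.1, 1.0)],
}
QUOTED_TOTAL = {"H": (1057.74, 0.22), "D": (1058.98, 0.22), "He+": (14046.1, 7.0)}
EXPERIMENT = {"H": (1057.77, 0.10), "D": (1059.00, 0.10), "He+": (14040.2, 4.5)}

def column_total(name):
    """Sum of contributions and linear sum of limits of error for one column."""
    value = sum(v for v, _ in TABLE[name])
    error = sum(e for _, e in TABLE[name])
    return value, error

if __name__ == "__main__":
    for name in TABLE:
        value, error = column_total(name)
        exp_value, exp_error = EXPERIMENT[name]
        print(name, round(value, 2), round(error, 2),
              "theory - experiment:", round(value - exp_value, 2))
```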
(c) Vacuum polarization.


The effective potential from the nucleus is altered by the existence
of virtual pairs of electrons and positrons created by the potential.
This has the main effect given in item 3. A correction of relative
order $\alpha$ has also been calculated (15), item 8. Corrections of order
$Z\alpha$ to the vacuum polarization are included in item 4 (the term
5/192). In fact the vacuum polarization has been calculated (16)
for arbitrarily strong fields (arbitrary $Z\alpha$), but additional
corrections to the term first order in $Z\alpha$ are small even for Pb.


(d) Finite nuclear mass. 


The biggest correction resulting from the finite nuclear mass is
the correction to the probability of finding the electron at the
nucleus due to use of the reduced mass $\mu$ rather than the electron
mass $m$ in the Schrödinger equation. Beside this the mass appears
in the logarithms for item (1) and in the fine structure correction
item (2) in a way that is readily evaluated. There are, however,
additional corrections of order $Zm/M$ from two photon exchanges
between the electron and the recoiling nucleus. They are given
by (17) (18)


2K(21) , 83, 3 Ca, 
Zak 0) + 34 + 3 linZa + + 3 (1 — ind) 


and are included in item 9. (My figures seem to differ slightly from
those of Petermann.)


m 


(e) Nuclear structure. 


If the nucleus is not a point charge the potential near the nucleus
is slightly altered, perturbing the s state but not the p state in first
approximation. This is an effect calculated by elementary
perturbation theory if the mean square radius of the nuclear charge
density $\langle R^2\rangle_{\mathrm{nuc.}}$ is known. This can be got directly from
scattering experiments and the effect evaluated. The error is just a
reflection of experimental error.


(f) Higher order terms. 


Terms in next order in $\alpha$ should probably contribute at most
a few hundredths of a Mc to H and D and perhaps up to 1 Mc
to He+.

The last three columns give the shift calculated in megacycles
from each of these terms for H, D, and He+. They are calculated
using $1/\alpha = 137.0389$. If $1/\alpha$ is larger than this by $\epsilon$ the correction
to the theoretical value for H and D is $-22\epsilon$ Mc which is almost
certainly less than ± .02 Mc (the present uncertainty (8) in $1/\alpha$ is
$\epsilon$ = ± .0006).
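
A rough estimate reproduces the size of this sensitivity. The sketch below assumes, as a simplification of ours, that the whole shift scales as $\alpha^3$ at fixed Rydberg (the unit $L$ of Table I) and ignores the logarithmic dependence on $\alpha$ inside the brackets; that gives about $-23\epsilon$ Mc, close to the quoted $-22\epsilon$ Mc. The difference presumably reflects the neglected logarithms.

```python
INV_ALPHA = 137.0389      # value of 1/alpha used in the text
SHIFT_H_MC = 1057.74      # theoretical Lamb shift for hydrogen, Mc
EPSILON = 0.0006          # quoted uncertainty in 1/alpha

def sensitivity_alpha_cubed(shift=SHIFT_H_MC, inv_alpha=INV_ALPHA):
    """d(shift)/d(1/alpha) if shift is proportional to alpha^3: -3 * shift / (1/alpha)."""
    return -3.0 * shift / inv_alpha

def quoted_correction(epsilon=EPSILON, coefficient=22.0):
    """Size of the correction 22*epsilon Mc quoted in the text."""
    return coefficient * epsilon

if __name__ == "__main__":
    print(sensitivity_alpha_cubed())   # Mc per unit change in 1/alpha
    print(quoted_correction())         # Mc, to be compared with .02 Mc
```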


The agreement between theory and experiment exhibited in the
last line is excellent. The errors quoted should be considered more
as limits of error than probable errors. The error in the theoretical
estimate could probably be reduced by a factor nearly 10 by a more
detailed calculation of item 6, and, what might prove even harder,
an estimate within ± 1 unit of the constant term “?” in the
expression for item 5. [It might also be possible to compute the
total for one virtual photon to all orders in $Z\alpha$ exactly, as described
in part (a) above.] It may be very hard to reduce the experimental
error, however, for the position of a line is being measured to about
one-thousandth part of its width.


What is the significance of this agreement, let us say, to ± 0.1 Mc
in the hydrogen Lamb shift? Again, an evaluation depends on
how you expect Q.E.D. to fail. If it is expected that the failure 
appears only at high momentum transfer, say in the photon pro- 
pagator, very little is checked here that is not already involved in 
the electron-proton scattering experiments. In the hydrogen atom 
the electron is successively scattered by the proton and if this, at 
very short distances, is not the ideal point charge scattering it can 
be corrected by using the directly measured scattering, without 
regard for the reason for the difference from the ideal scattering. 
For example, in the very unlikely event that Q.E.D. and proton 
structure effects are compensating each other in the proton scatter- 
ing experiments, they will compensate here too; the correct net 
effect being still given in item 9. 


On the other hand, one might contemplate a failure involving a
modification of the propagator extending out to very large distances
(compared to $10^{-13}$ cm), but having a very small coefficient. For
example, suppose it is suggested that Coulomb’s law is altered so
that the potential from a charge can be approximated by the form
$1/r^{(1+\epsilon)}$ in the range of $r$ of order $10^{-8}$ cm. We can conclude
from the Lamb experiment that $\epsilon$ is less than $10^{-10}$. This is because
the close coincidence of the $2s_{1/2}$ and $2p_{1/2}$ levels, each of which
has an energy of the order of $10^9$ Mc, is a kind of accident involving
the perfection of the Coulomb law. The Lamb experiment tells us
that any modification that perturbs the 2s energy level (in a
substantially different manner than it disturbs the 2p level) must
disturb it by less than one part in $10^{10}$.


But perhaps the most satisfying aspect of the agreement of theory 
and experiment here is that it checks the general theoretical view- 
point. There cannot be much argument that the effects which we 
ascribe to virtual photons or virtual pairs acting in various orders 
do exist (although the philosophical ideas used to describe them 
may someday be altered drastically, of course). 


The analogous Lamb shifts in other states, such as $3s_{1/2}-3p_{1/2}$,
etc., have also been measured (7) and agree in a satisfying way with
theory, although the test here is not quite as stringent as for the
$2s_{1/2}-2p_{1/2}$ because of the somewhat larger experimental error.


The $2s_{1/2}-2p_{3/2}$ separation is also measured so, as a byproduct,
we have a measurement (19) of the fine structure separation
$2p_{3/2}-2p_{1/2}$ in deuterium. The formula for this separation
is (20) (18) (12)


16 


This formula can be derived by the usual complete Q.E.D. analysis, 
but its terms are easy to understand. The first is the energy of inter- 
action of the electron’s magnetic moment with the nucleus revolving 
about it, considering the electron at rest. It involves the anomalous 
magnetic moment of the electron, calculated “3) to be 

Ser = 2[1 + a/2x — 0.328 (a/x2)2] = 2 (1.00115961) (4) 
and a factor m/yp = 1 + m/M, where M is the deuteron mass, to 
correctly represent the velocity, relative to the electron, of the nucleus 
generating the magnetic field. The factor in front of the [] is 
obtained by averaging the interaction over the Schrodinger wave 
function for the p state. The next term, — 1, is the Thomas preces- 
sion correction. The relativistic correction to these two terms 


1 3 5 3 
A = 7gRvd,¢(T) Ulga — 1) + 32 — 25 n(l/a) + 1 - 


: ‘ 2200 . ; 
combined is to order «2 just 8 a2, in accordance with the fine struc- 


ture formula of the Dirac theory of hydrogen. Because the electron 
can emit and absorb virtual photons its effective location is smeared
over a range (the main source of the Lamb shift of the 2s state),
so the fine structure interaction is corrected in order $\alpha^{3}$. The
biggest part of this is a term in $\ln\alpha$ whose coefficient is easily
understood, but a more complete evaluation of the $\alpha^{3}$ term, up to the
constant “?”, has not been carried out. Since the $\alpha^{3}\ln\alpha$ term
only amounts to one ppm (part per million) and the experimental
result

$$\Delta = 10971.58 \pm .20\ \mathrm{Mc}$$

is available only to 20 ppm (limit of error) there is no great need
to evaluate the $\alpha^{3}$ term more completely than at present. Aside
from the small terms $\alpha^{3}\ln\alpha$, if an experimental value for $g_{el}$ is used,
this formula contains no subtle virtual state effects of Q.E.D. that
cannot be understood from the Dirac equation and semi-classical
arguments. It has been used to obtain a value for $\alpha$, or as one of
the equations in a general evaluation of the fundamental constants.
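The sizes quoted in this discussion are easy to check numerically. The following is a minimal sketch, assuming $1/\alpha = 137.0389$ (the value quoted later in the paper); the function names are illustrative only, and the order-one coefficient of the $\alpha^{3}\ln\alpha$ term is deliberately left out.

```python
import math

# 1/alpha = 137.0389 is the value quoted later in this paper.
ALPHA = 1 / 137.0389


def g_el_half(alpha: float = ALPHA) -> float:
    """g_el / 2 from expression (4), through order alpha^2."""
    x = alpha / math.pi
    return 1 + x / 2 - 0.328 * x**2


def alpha3_log_ppm(alpha: float = ALPHA) -> float:
    """alpha^3 ln(1/alpha) in parts per million (order-one coefficient omitted)."""
    return alpha**3 * math.log(1 / alpha) * 1e6


def experimental_error_ppm(value_mc: float = 10971.58, error_mc: float = 0.20) -> float:
    """Fractional limit of error of the measured separation, in ppm."""
    return error_mc / value_mc * 1e6


print(f"g_el/2              = {g_el_half():.8f}")
print(f"alpha^3 ln(1/alpha) = {alpha3_log_ppm():.1f} ppm")
print(f"experimental error  = {experimental_error_ppm():.0f} ppm")
```

The $\alpha^{3}\ln\alpha$ scale comes out at a ppm or two, an order of magnitude below the experimental limit of error, which is the comparison the text relies on.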


Anomalous magnetic moment of electron and muon. 


From the emission of virtual photons the predicted value 2 for
the gyromagnetic ratio of the electron is altered to the expression (4)
valid to order $\alpha^{2}$. The first term is from one virtual photon. The
second term contains two effects: (a) the effect of two virtual
photons, and (b) the vacuum polarization correction to the
propagator of the first virtual photon. They have been calculated by
Peterman. For the muon, supposing it to satisfy the Dirac equation
but with a different mass, all the terms are, of course, the same
except term (b). In the vacuum polarization for the muon there
are terms for virtual electron pairs as well as muon pairs, and the
predicted 21) g value for the muon is

$$g_{\mu} = 2\left[1 + \alpha/2\pi + 0.75\,(\alpha/\pi)^{2}\right] = 2\,(1.001165) \qquad (5)$$
A value for $g_{el}$ has been obtained by Hardy and Purcell by
combining a measurement of Gardner and Purcell (see reference 4?)
of the cyclotron frequency of the electron with a magnetic moment
measurement of Beringer and Heald 22) to get
$$g_{el} = 2\,(1.0011552 \pm 8).$$
However, an independent measurement of the cyclotron frequency
by Franken and Liebes 23) gave somewhat different results, and
leads to a g value of
$$g_{el} = 2\,(1.001168 \pm 7).$$
A direct measurement of Schupp, Pidd and Crane 4) gives
$$g_{el} = 2\,(1.0011609 \pm 24).$$

These results are in fair agreement with the theoretical result (4).
The result of Schupp et al. implies that the coefficient of the $\alpha^{2}/\pi^{2}$
term is $-0.1 \pm 0.4$.
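The coefficient quoted for the second-order term follows from subtracting the known first-order term from the measured value. A minimal sketch, assuming $1/\alpha = 137.0389$ as quoted in this paper; the function name is illustrative only.

```python
import math

ALPHA = 1 / 137.0389  # 1/alpha as quoted in this paper


def second_order_coefficient(g_half: float, g_half_err: float, alpha: float = ALPHA):
    """Coefficient C, with its error, in g/2 = 1 + alpha/2pi + C (alpha/pi)^2."""
    x = alpha / math.pi
    coeff = (g_half - (1 + x / 2)) / x**2
    return coeff, g_half_err / x**2


# Schupp, Pidd and Crane: g_el / 2 = 1.0011609 +/- 0.0000024
coeff, err = second_order_coefficient(1.0011609, 0.0000024)
print(f"C = {coeff:+.1f} +/- {err:.1f}")
```

The measurement therefore does not yet distinguish the theoretical coefficient $-0.328$ from zero with any force.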

The theoretical result should probably be accurate to one part
in about $10^{8}$, guessing that the next term in the series is roughly
$\pm(\alpha/\pi)^{3}$. If the photon propagator is modified as (3), the
correction to g/2 is $-\frac{\alpha}{3\pi}\left(\frac{m}{\Lambda}\right)^{2}$ so an agreement to 1 ppm means only
that $\Lambda$ exceeds 15 Mev. More information comes from the
measurement of $g_{\mu} - 2$ for the muon. The experimental result 25) for $g_{\mu}$
is $2\,(1.001145 \pm 22)$ agreeing with the theory within its error of
22 ppm. In this case a propagator like (3) would correct g/2 by
$-\frac{\alpha}{3\pi}\left(\frac{m_{\mu}}{\Lambda}\right)^{2}$ where $m_{\mu}$ is the meson mass of 105 Mev. If this is
not to exceed 22 ppm, $\Lambda$ must exceed 630 Mev. This is therefore
at least as good a test as is provided by the proton-scattering
experiments. It is remarkable in that it tests at the same time that the
heretofore unfamiliar particle, muon, satisfies the Dirac equation
with no appreciable structure comparable to its own Compton
wave length.
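The two lower limits on the cut-off can be reproduced by solving $(\alpha/3\pi)(m/\Lambda)^{2} = $ tolerance for $\Lambda$. This is a sketch under stated assumptions: the correction is taken to have magnitude $(\alpha/3\pi)(m/\Lambda)^{2}$, the electron mass of 0.511 Mev is supplied here (it is not quoted in the text), and the function name is illustrative.

```python
import math

ALPHA = 1 / 137.0389          # 1/alpha as quoted in this paper
ELECTRON_MASS_MEV = 0.511     # assumed value, not quoted in the text
MUON_MASS_MEV = 105.0         # "the meson mass of 105 Mev"


def min_cutoff_mev(mass_mev: float, max_fraction: float, alpha: float = ALPHA) -> float:
    """Smallest Lambda for which (alpha / 3 pi) (m / Lambda)^2 <= max_fraction."""
    return mass_mev * math.sqrt(alpha / (3 * math.pi) / max_fraction)


print(f"electron, 1 ppm : Lambda > {min_cutoff_mev(ELECTRON_MASS_MEV, 1e-6):.0f} Mev")
print(f"muon, 22 ppm    : Lambda > {min_cutoff_mev(MUON_MASS_MEV, 22e-6):.0f} Mev")
```

The results, roughly 14 Mev and 620 Mev, agree with the rounded figures of 15 Mev and 630 Mev. The factor of about 40 between them comes mainly from the mass ratio, which is why the muon measurement is so much more sensitive.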


Hyperfine interaction. 


The hyperfine splitting in the ground state of hydrogen resulting 
from interaction of the nuclear moment and the electron has been 
measured 26) very accurately for the three isotopes of H, and for 
the He$^{3}$ ion. The theoretical formula 27) for this splitting is

$$\Delta\nu = \frac{16}{3}\,\alpha^{2}\,c\,\mathrm{Ryd}_{\infty}\,\frac{\mu_{p}}{\mu_{0}}\,\frac{\mu_{el}}{\mu_{0}}\left(\frac{\mu}{m}\right)^{3}\left[1 + \frac{3}{2}(Z\alpha)^{2} - \left(\frac{5}{2} - \ln 2\right)Z\alpha^{2} - X\,\frac{\alpha m}{M}\right] \qquad (6)$$

where $\mu_{p}$, $\mu_{el}$ are the magnetic moments of proton and electron
and $\mu_{0}$ is the Bohr magneton, $\mu/m = (1 + m/M)^{-1}$ where M is the
mass of the nucleus. The terms in front of the bracket give the
value expected from a non-relativistic analysis, given by Fermi. In
the bracket the $\frac{3}{2}(Z\alpha)^{2}$ is a Breit correction resulting from the use
of the Dirac equation instead of the Schrodinger equation. The
next term is a correction from virtual photons in Q.E.D.


The last term is a correction for recoil and finite size of the
nucleus. A direct calculation, assuming a point charge and dipole,
leads to a result which diverges logarithmically 28). It is sensitive
to the electromagnetic structure of the nucleus. It arises from the
exchange of two virtual photons with the proton (in H). If the
amplitude for this exchange at high energy were known, the term 
could be evaluated. Data could come from the forward spin flip 
Compton scattering from a proton. It cannot come directly from 
proton-electron scattering experiments for these only give the form 
factor for a one photon interaction with the proton. Assuming that 
each of the two interactions has the same form factor as does a 
single photon, C. Iddings and P. Platzmann, P. R., 113, 192 (1959) 
evaluated X as 8.7, corresponding to a correction of — 35 ppm. 
However, I think uncertainties in the assumption relating one and 
two photon structure factors, as well as uncertainties of the one 
photon structure factor itself for high energy may make the error 
in X as high as $-2$ or $+10$ ppm. [There is a relative insensitivity
to these assumptions because a major part of X involves the large
ln(M/m).]
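The connection between X = 8.7 and a correction of about 35 ppm can be checked directly. The sketch below assumes, as the quoted numbers suggest, that the recoil term enters the bracket of (6) with magnitude $X\,\alpha m/M$; the electron-to-proton mass ratio is a value supplied here, not taken from the text.

```python
ALPHA = 1 / 137.0389                   # 1/alpha as quoted in this paper
ELECTRON_TO_PROTON_MASS = 1 / 1836.1   # assumed value, not quoted in the text


def recoil_term_ppm(x: float, alpha: float = ALPHA,
                    mass_ratio: float = ELECTRON_TO_PROTON_MASS) -> float:
    """Magnitude of X * alpha * m / M in parts per million."""
    return x * alpha * mass_ratio * 1e6


print(f"X = 8.7 corresponds to {recoil_term_ppm(8.7):.1f} ppm")
```

Since $\alpha m/M$ is about 4 ppm, each unit of uncertainty in X is worth about 4 ppm in the predicted splitting.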


Unfortunately, therefore, we cannot use these high precision
measurements directly to test Q.E.D. independent of our
uncertainties in the electrical properties of the nucleus *.
Nevertheless, there may be a discrepancy here, for the measurement of
Lambe and Dicke 29), corrected by 27.5 ppm for a diamagnetic
correction, gives, with $1/\alpha = 137.0389 \pm .0006$, a result for the X
term of $+0.7 \pm 8.8$ ppm instead of $-35$ ppm, but terms in (6)
of order $\alpha^{3}$ may be unusually large **.


* The hyperfine splitting for deuterium is still more dependent upon the
nuclear structure, this time of the deuteron. The ratio $\nu_{D}/\nu_{H}$, divided by the
ratio of the magnetic moments of D and H and by the reduced mass factors
cubed, is one minus $1.703 \times 10^{-4}$ experimentally. An attempt to estimate this
from nuclear theory by Low and Salpeter, P. R., 83, 478 (1951) gave
$(1.98 \pm .20) \times 10^{-4}$, but this calculation could probably be improved today.


** NOTE ADDED IN PROOF : 


D.E. Zwanziger has just informed me of calculations of corrections of order
$\alpha^{3}(\ln\alpha)^{2}$ and $\alpha^{3}\ln\alpha$ to the hyperfine separation in hydrogen made by him and
A.J. Layzer independently. They find a contribution of $-9$ ppm so the total
predicted term is $-44$ ppm. Thus, a real discrepancy with the “measured”
$+0.7 \pm 8.8$ ppm seems to be developing here. However, the trouble might
lie instead in the measurement of the fine structure separation in deuterium,
for E. Richard Cohen has kindly informed me that if this fine-structure
measurement is omitted from a least-square reduction of the fundamental constants
the value of $1/\alpha$ is $137.0417 \pm .0025$. The “measured” value of the hyperfine
separation term would then be $-40 \pm 36$ ppm (instead of $+0.7$) which would
be consonant with the theoretical value.

Zwanziger has noted that these uncertainties disappear if one
compares, in the same atom, the hyperfine structure in the 2s state
to that in the 1s state. Measurements of the hyperfine shift in the
2s state have been made by Heberle, Reich and Kusch, P. R., 101,
612 (1956), 104, 1585 (1956). According to the Fermi formula the
ratio should, in first approximation, be as the probability of finding
the electron at the nucleus, (1/8), but the actual measurements lead
to a result

$$8\,\nu_{2s}/\nu_{1s} = 1 + d \quad\text{where}\quad d = 34.6 \pm 0.3\ \text{ppm for H}, \qquad d = 34.2 \pm 0.6\ \text{ppm for D} \qquad (7)$$


The formula (6) is not adequate to calculate d to this accuracy, 
for the expression in brackets has not been carried to a high enough 
order. It might be expected that if divergences arise already in this 
order they should be still worse for higher order, but Zwanziger 
shows that this is not true if the ratio $\nu_{2s}/\nu_{1s}$ is calculated. The
term d has several contributions. The electron magnetic moment is
spread by vacuum polarization, and by the form factor in the theory
of this moment. In addition the interaction of the electron with
the nucleus is altered because the wave function of the electron is
altered. This is because, just as in the usual Lamb shift, the electron
sees a modified potential since it emits and reabsorbs photons
[giving a term in ln(m/Ryd)] and further, the potential is actually
modified by the vacuum polarization. This calculation is similar
to the first order Lamb calculation. Terms of order $\alpha^{2}m/M$ have
been calculated by Schwartz (see reference 27). The theoretical
result for d is

$$d = 34.5 \pm 0.2\ \text{ppm}$$

which agrees excellently with the experiments (7). The Breit term
alone gives $\frac{5}{8}\alpha^{2}$ or 33.3 ppm so that we do not here have any sharp
test of Q.E.D. at short distances. But it does confirm our general
ideas and checks again, but less accurately, that there are no small
deviations at larger distances.
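How little of d is left for Q.E.D. proper can be made explicit. The sketch below assumes the Breit contribution is $\frac{5}{8}\alpha^{2}$, which reproduces the quoted 33.3 ppm, and uses $1/\alpha = 137.0389$ from this paper; the function name is illustrative.

```python
ALPHA = 1 / 137.0389  # 1/alpha as quoted in this paper


def breit_term_ppm(alpha: float = ALPHA) -> float:
    """Breit contribution (5/8) alpha^2 to d, in parts per million."""
    return 5 / 8 * alpha**2 * 1e6


breit = breit_term_ppm()
d_theory = 34.5  # ppm, theoretical result quoted in the text
print(f"Breit term       = {breit:.1f} ppm")
print(f"left for Q.E.D.  = {d_theory - breit:.1f} ppm")
```

Only a little over 1 ppm of d comes from the radiative effects, comparable to the experimental errors of 0.3 and 0.6 ppm in (7); this is why the agreement is not a sharp test.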


There is a measurement of the hyperfine structure of the metastable
triplet state of He$^{3}$ by White et al., P.R.L., 3, 428. If this
is compared to the He$^{3}$ ion hyperfine separation, the dependence
on nuclear structure cancels out. For the ratio of frequencies they
get $6.2211384 \pm 12$. To calculate this it is necessary to know the
wave function for the triplet state to find the probability (times 6)
that one electron is at the nucleus. The non-relativistic theory with
the best known wave function gives 6.222030 for this. Relativistic
corrections reduce this to the theoretical prediction $6.221199 \pm 6$
for the frequency ratio. The remaining deviation is likely to be a
slight inadequacy (only 10 ppm) of the variational wave functions.
There is no evidence that Q.E.D. is any less adequate in handling
systems with two electrons than it is for one.
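The 10 ppm figure is simply the fractional difference between the measured and predicted ratios. A minimal sketch using only the three numbers quoted in the text; the function name is illustrative.

```python
def deviation_ppm(value: float, reference: float) -> float:
    """Fractional difference (value - reference) / reference, in ppm."""
    return (value - reference) / reference * 1e6


MEASURED = 6.2211384         # White et al.
NON_RELATIVISTIC = 6.222030  # best known wave function, no relativistic terms
PREDICTED = 6.221199         # after relativistic corrections

print(f"measured vs predicted   : {deviation_ppm(MEASURED, PREDICTED):+.1f} ppm")
print(f"relativistic correction : {deviation_ppm(PREDICTED, NON_RELATIVISTIC):+.1f} ppm")
```

The relativistic corrections move the prediction by more than 100 ppm, so the residual of about 10 ppm is small compared with what the theory has already accounted for.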


Positronium. 


The ‘‘ atom ’’ formed from an electron and a positron presents 
none of the uncertainties of nucleon structure that the hydrogen 
atom does. It is therefore an interesting object for calculation, 
although the experiments are much more difficult because of its 
transient nature. A still better object would be muonium but the 
experiments here have not yet been performed. 


Just after the newer methods of Q.E.D. were developed, since they 
were so readily applied to perturbation theory of free systems, it 
was supposed by many that bound state problems presented some 
special difficulty. That this was not so was noted by W.E. Lamb, 
P. R., 85, 259 (1952) and E.E. Salpeter, 87, 328 (1952). Evidently 
one cannot analyze the bound state by starting a perturbation series 
from non-interacting particles. But one very effective way is to 
use, as a starting point, the system held together by instantaneous
Coulomb potentials. This system can be analyzed by an ordinary 
differential equation in time, like the Schrodinger equation, because 
of the instantaneous nature of the interaction. The perturbation 
then consists of adding the effect of virtual transverse photons in 
various orders. Of course, any other unperturbed system, held 
together by some approximation to the true interaction, will serve 
as well; the perturbation being the difference between the true inter- 
action of Q.E.D. and the approximate interaction assumed. The 
instantaneous Coulomb potential is a good starting point because 
its initial approximation is so good. 


The most complete analysis of the hydrogen-like atom with
arbitrary mass ratio of the two charges has been given by T. Fulton
and P. Martin “8), where references to earlier work will be found.
They have used their equations to compute the energies up to the
first order Lamb effect (i.e., to order $\alpha^{3}$ Ryd) of many states in
positronium. The most delicate test is the separation between the
singlet and triplet 1s states of positronium measured by Weinstein,
Deutsch and Brown, P. R., 98, 223 (1955) to be $(2.0338 \pm 4) \times 10^{5}$ Mc.

The theoretical value 29) for this shift is

$$\alpha^{2}\,\mathrm{Ryd}_{\infty}\,c\left[\frac{2}{3} + \frac{1}{2} - \left(\frac{16}{9} + \ln 2\right)\frac{\alpha}{\pi}\right] = 2.0337 \times 10^{5}\ \mathrm{Mc}$$

The first term represents the first order interaction of the spins and
the second the amplitude for virtual annihilation into a photon
with re-creation of the positron again. It is clear experimentally
that such a term exists, once again confirming our general view of
what virtual processes go on. The last term is the first order Lamb
correction for this system; it amounts to $0.0100 \times 10^{5}$ Mc.
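The three contributions to the singlet-triplet splitting can be evaluated separately. This is a sketch under stated assumptions: the bracket is taken as $\frac{2}{3} + \frac{1}{2} - (\frac{16}{9} + \ln 2)\,\alpha/\pi$, the Rydberg frequency $c\,\mathrm{Ryd}_{\infty}$ is a value supplied here rather than quoted in the text, and the function name is illustrative.

```python
import math

ALPHA = 1 / 137.0389            # 1/alpha as quoted in this paper
RYDBERG_FREQ_MC = 3.28984e9     # c * Ryd_infinity in Mc; assumed value


def positronium_splitting_mc(alpha: float = ALPHA, rydberg_mc: float = RYDBERG_FREQ_MC):
    """Return (spin-spin, annihilation, Lamb) contributions in Mc."""
    scale = alpha**2 * rydberg_mc
    spin_spin = 2 / 3 * scale
    annihilation = 1 / 2 * scale
    lamb = -(16 / 9 + math.log(2)) * (alpha / math.pi) * scale
    return spin_spin, annihilation, lamb


spin_spin, annihilation, lamb = positronium_splitting_mc()
print(f"spin-spin    : {spin_spin:12.0f} Mc")
print(f"annihilation : {annihilation:12.0f} Mc")
print(f"Lamb         : {lamb:12.0f} Mc")
print(f"total        : {spin_spin + annihilation + lamb:12.0f} Mc")
```

The annihilation term alone is more than 40 per cent of the total, so its presence is tested far beyond the experimental error of 4 parts in 20,000.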


General conclusions. 


There are many other calculations and experiments in which 
some aspect of Q.E.D. is involved (such as vacuum polarization 
effects in mu-mesic atoms, or relativistic corrections to the computed 
helium ground state energy, etc.). We shall not go on to describe 
them for, although confirming Q.E.D., they do not provide sharper 
tests than the examples already given. 


All this may be summarized by saying that no error in the predic- 
tions of quantum electrodynamics has yet been found. The 
contributions expected from the various virtual processes envisaged 
have been found again and again, and there is very little doubt 
that in the low energy region, at least, our methods of calculation 
seem adequate today. The region of energy (of virtual states) that 
has not yet been explored even for gross errors exceeds 600 Mev 
(the Compton wave length corresponding to this is $2\pi$ times
$3 \times 10^{-14}$ cm). There are no experimental indications that the
laws of Q.E.D. cannot be exact. Are there any theoretical reasons 
to expect a failure ? I will discuss such questions in the next few 
sections. 


Before we do this I should like to make a remark on the character 
of these calculations. It seems that very little physical intuition 
has yet been developed in this subject. In nearly every case we are 
reduced to computing exactly the coefficient of some specific term. 
We have no way to get a general idea of the result to be expected.
To make my view clearer, consider, for example, the anomalous
electron moment given in (4). We have no physical picture by
which we can easily see that the correction is roughly $\alpha/2\pi$; in fact,
we do not even know why the sign is positive (other than by
computing it). In another field we would not be content with the
calculation of the second order term to three significant figures
without enough understanding to get a rational estimate of the
order of magnitude of the third. We have been computing terms
like a blind man exploring a new room, but soon we must develop
some concept of this room as a whole, and have some general
idea of what is contained in it. As a specific challenge, is there any
method of computing the anomalous moment of the electron which,
on first rough approximation, gives a fair approximation to the $\alpha$
term and a crude one to $\alpha^{2}$; and when improved, increases the
accuracy of the $\alpha^{2}$ term, yielding a rough estimate to $\alpha^{3}$ and
beyond ?


THEORETICAL QUESTIONS 


Self-energy. 


The first difficulty which arises if one assumes that each of the 
transverse electromagnetic modes in a box is a quantized oscillator 
is that each should have a zero point energy w/2. Since there are 
an infinite number of modes this zero point energy is infinite. It is 
easy to get around this difficulty, however, by supposing that 
absolute energy cannot be measured (leaving a question for the 
theory of gravitation) so that all the zero point energy is subtracted. 
But now suppose we put atoms or other objects in the box at a
small density N per unit volume, so that the index of refraction
is changed from 1 to $1 + 2\pi N f_{K}^{\gamma}/\omega^{2}$ where $f_{K}^{\gamma}$ is the real part of
the forward scattering amplitude of the object for light of mode K.
The wave lengths which fit into the box are still the same, but the
frequency of the modes is changed by $-2\pi N f_{K}^{\gamma}/\omega$ so the total zero
point energy is changed, per object, by
$$\Delta E = -\frac{1}{2}\sum_{K}\frac{4\pi f_{K}^{\gamma}}{2\omega_{K}}$$
This shift in energy we would associate with the object and would
call it the self-energy of the object. There are higher terms from
the effect of scattering two photons at once, but we shall just go
to the first order in $\alpha$. In addition, we have the positron-electron
Dirac field and will have to add a contribution for the shift in
energy of the electrons in the negative energy sea. This will involve
$f_{N}^{pos}$, the amplitude for the object to scatter forward a positron
in state N. It is better to deal with electrons and positrons
symmetrically and we find the following formula for the self-energy of
an object :

$$(4\pi)^{-1}\,\Delta E = -\frac{1}{2}\sum_{K}\frac{f_{K}^{\gamma}}{2\omega_{K}} + \frac{1}{2}\sum_{n}\frac{f_{n}^{el}}{2E_{n}} + \frac{1}{2}\sum_{N}\frac{f_{N}^{pos}}{2E_{N}} \qquad (8)$$

where $f_{n}^{el}$ is the real part of the amplitude for the object to scatter
forward an electron in state n, $f_{N}^{pos}$ that for scattering a positron
in state N, and $f_{K}^{\gamma}$ that for scattering a photon in state K (K
specifying momentum and polarization), everything to first order in $\alpha$ *.
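The step from the index of refraction to the zero-point shift of the photon modes can be written out explicitly. This is a sketch of the photon term only, assuming units with $\hbar = c = 1$, a box of unit volume, and the usual sign convention in which the index is $n = 1 + 2\pi N f/\omega^{2}$; the symbols $k_K$ and $\delta\omega_K$ are introduced here for illustration.

```latex
\begin{align*}
% the wave number k_K is fixed by the box; the frequency in the medium is k_K / n
\omega_K' &= \frac{k_K}{n}
   \approx \omega_K\Bigl(1 - \frac{2\pi N f_K^{\gamma}}{\omega_K^{2}}\Bigr)
   \quad\Longrightarrow\quad
   \delta\omega_K = -\frac{2\pi N f_K^{\gamma}}{\omega_K} \\
% each mode carries zero-point energy omega/2; divide by the N objects per unit volume
\Delta E &= \frac{1}{N}\sum_K \tfrac{1}{2}\,\delta\omega_K
   = -\frac{1}{2}\sum_K \frac{4\pi f_K^{\gamma}}{2\omega_K}
\end{align*}
```

The same argument applied to a Dirac mode of energy E gives a shift of each level by $-2\pi N f/E$; the filled negative-energy sea contributes zero-point energy of the opposite sign to that of the photons, which is why the electron and positron terms would enter with the opposite sign to the photon term under these conventions.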


Applied to a free electron, however, (8) still gives a divergent 
result. It might be thought that this could also be subtracted away, 
and only differences taken for the electron in different states i, but 
these differences are also infinite. This is most easily seen if one 
compares the energy of two photons with the energy of the pair 
one expects to create from them. No completely satisfactory way 
has been found out of these difficulties. 


Electron mass. 


In making actual calculations, one way to handle the difficulty 
is this: temporarily stop the divergent integrals at some high 
energy and note that the self-energy effect, at least insofar as it 
depends (logarithmically) on the cut-off, is equivalent to changing
the mass of the electron from $m_{0}$ to $m_{0} + \Delta m$ in every process (of
energy well below the cut-off). If we write $m = m_{0} + \Delta m$, interpret
it as the experimentally observed mass, and write all results in 


* To be more explicit, if the object is an H atom with an electron in state i,
this formula gives the correct level shift to order $\alpha$ if n and N are states in the
nuclear potential Z, but the interaction of the electron in i and the electron n
is calculated only to first order in $\alpha$. That this is true can be easily demonstrated
by writing out each amplitude by diagrams, adding the results, and comparing
with the usual diagram for the virtual photon level shift. Actually, only the
exchange scattering in $f^{el}$ and the annihilation scattering in $f^{pos}$ need be taken.
The rest cancels out. If the object is charged, like a free electron, there is no
true forward scattering $f^{el}$ as it goes, by the Rutherford law, as $\theta^{-2}$, but $f^{pos}$
does likewise and, in (8), they cancel out to a finite limit as $\theta \to 0$.




terms of m, then the limit can be taken as the cut-off energy goes 
to infinity. Expressing everything in terms of m is called renormalizing
the mass, and a theory for which results are then independent
of the cut-off as it goes to infinity is called renormalizable. Q.E.D.
is renormalizable if the electron mass, and the charge, are both 
renormalized (it is, I think, not necessary to renormalize the mass 
ratio of muon and electron). 


What is the meaning of this ability to renormalize the mass ? 
From one point of view it is no problem at all. After all, only m 
is observable, not $m_{0}$, so the $m_{0}$ is a construct which should be got
rid of anyhow. Only the renormalized theory should have been 
written in the first place. Two questions arise, however. 


The first question is whether the renormalized theory is, in fact, 
a logically consistent theory. With any finite cut-off the theory is
not consistent; slight deviations from unitarity (the principle that
the sum of probabilities for all alternatives should be unity) occur.
These get smaller as the cut-off energy $\Lambda$ goes to infinity. But the
mass $m_{0}$ that may be needed to get a finite m for very large $\Lambda$ may
be negative. That is, the theory may contain hidden difficulties if
we computed processes for energies E such that $\alpha \ln(E/m) = 1$.
This is such an extreme energy that such matters are of no apparent
concern to calculation of lower energy phenomena. However,
from a strictly theoretical view it would be nice to know whether
renormalized Q.E.D. is a consistent theory, or whether difficulties
may not arise of relative magnitude $e^{-137}$. The great difficulty in
answering such questions is our limited mathematical ability to
deal with situations where some kind of perturbation theory does
not suffice. I do not know if it has even ever been proved that $\Delta m$
still diverges if all orders of perturbation theory are included.
[The perturbation result for $\Delta m$ is $m_{0}\,(3\alpha/2\pi)\ln(\Lambda/m_{0})$.]
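The two scales in this paragraph are very far apart, which a short calculation makes concrete. The sketch below assumes the perturbation result $\Delta m = m_{0}(3\alpha/2\pi)\ln(\Lambda/m_{0})$; the 1 Gev cut-off and the 0.511 Mev mass are hypothetical inputs chosen only for illustration, and the function names are not from the text.

```python
import math

ALPHA = 1 / 137.0389  # 1/alpha as quoted in this paper


def mass_shift_fraction(cutoff_over_m0: float, alpha: float = ALPHA) -> float:
    """Delta m / m0 = (3 alpha / 2 pi) ln(Lambda / m0), first-order perturbation result."""
    return 3 * alpha / (2 * math.pi) * math.log(cutoff_over_m0)


def log10_critical_energy_ratio(alpha: float = ALPHA) -> float:
    """log10 of E/m at which alpha ln(E/m) = 1, i.e. E/m = exp(1/alpha)."""
    return (1 / alpha) / math.log(10)


# Hypothetical cut-off of 1 Gev with an electron mass of 0.511 Mev (both assumed).
print(f"Delta m / m0 at 1 Gev  = {mass_shift_fraction(1000 / 0.511):.3f}")
print(f"E/m for alpha ln = 1   = 10^{log10_critical_energy_ratio():.1f}")
print(f"relative size e^-137   = {math.exp(-1 / ALPHA):.1e}")
```

Even with a cut-off two thousand times the electron mass the logarithmic mass shift is only a few per cent, while the energy at which $\alpha\ln(E/m)$ reaches unity is about sixty powers of ten above the electron mass.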


The second question concerning the renormalization idea is that 
renormalization of a quantity A gives up any possibility of calcul- 
ating that quantity. Now it may be that the “ electromagnetic part 
of the electron mass ’’ is unobservable, but this is not true of other 
particles. The difference in mass of proton and neutron, or of $\pi^{+}$
and $\pi^{0}$, or of $K^{+}$ and $K^{0}$, etc. are almost certainly electromagnetic
in origin. They cannot be computed with a renormalized theory,
for in such a theory any constant can be added to the masses of
each particle. It should be objected that these baryons and mesons
are complicated systems, in virtue of the strong interactions, and 
we therefore do not know the correct laws of coupling to calculate 
this mass difference. If we knew them, perhaps the mass differences 
would converge without any modification of Q.E.D. itself. This 
is indeed possible, in principle, but so far it has not been demon- 
strated to be true in fact. Any complete field theory yet written 
for strongly interacting particles and electromagnetic interactions, 
has always appeared to be unrenormalizable unless these mass 
differences are renormalized as well. It seems odd to try to put 
the complete burden of the divergences of Q.E.D. on special pro- 
perties of the strong couplings, but on the other hand, that is where 
they may indeed lie. At any rate, close study of these mass differ- 
ences would probably teach something; either of the breakdown of 
Q.E.D. or of the electromagnetic characteristics of the particles. 


If it is assumed that the nucleon and meson structure is not solely 
responsible for the convergence of the mass differences, and that it 
is a failure of Q.E.D. instead, then the numerical values of these 
differences suggest that the failure of Q.E.D. should begin to show 
up strongly at virtual energies around 1 Gev. 


Finally, we should remark on the possibility that all of the mass 
of the electron is electromagnetic in origin. First, of course, the 
correction seems to be too small to do that (yet how can something 
that is infinite appear to be too small ?). But disregarding that, 
there is the argument that, if the electron mass is zero, the change 
of the electron field operator from $\psi$ to $\gamma_{5}\psi$ will not alter things.
Put otherwise, a state of right helicity will never be converted to
one of left helicity, no matter how often it interacts with real or
virtual photons. But an electron with mass does not have this 
property, so it has been believed that mass cannot come from no 
mass. But recently several physicists (Heisenberg, Nambu, 
Schwinger, for example) have argued that this is, in fact, not true. 
In a ferromagnet the original system has the symmetry that all 
space directions are equal; but in fact the interaction can produce 
a polarized background state in which any excited state has an 
energy depending on its alignment to an axis. That is, if we go 
beyond perturbation theory such symmetry arguments may fail. 


On the other hand, pure Q.E.D. with only zero mass electrons
and photons interacting with no other particles and having no
cut-off energy probably cannot produce a finite electron mass. This is
because the system is also invariant to a change of scale; there is no
parameter to determine a length. Yet an electron with a mass
involves such a length. I am not certain, but it appears to me
impossible to generate a specific length from no scale whatsoever.


However, in fact, there are two places in Nature from which 
such a length could come. One is the theory of gravitation; the
gravitational constant involves length dimensions and light
interacts with gravitons. Further, Mach’s principle in quantum mechanics
is equivalent to the statement that the surrounding nebulae deter- 
mine the atomic scale of length in a local vicinity. However, these 
theories are not developed far enough for us to compute the electron 
mass starting only with zero mass electron, photons and gravitons 
in interaction. 


A more practical point to notice is that, in fact, photons do
interact with nucleons and mesons, and we may allow that there is, in
that system, an independent scale of length, say the nucleon mass.
This would serve as the small length-determining perturbation, 
which works its way back to determine the mass of the electron. 
It is always possible that the equation determining the electron mass 
has more than one solution, and that a second solution is the muon, 
but we are engaged in pure speculation here. 


One way to find out about this is to study more seriously Q.E.D. 
with electron mass exactly zero. On the one hand, it may be useful 
for getting a better understanding of Q.E.D. when electron energies 
are high, but most particularly, it would be interesting to see if 
such a theory is consistent at all. For example, a charge entering 
a magnetic field presents problems; it seems to radiate at an infinite 
rate. Perhaps a careful study of this problem, and the effect of a 
small length-determining perturbation on the result, would lead us 
to an understanding of the ratio of electron mass to nucleon mass. 


Electric charge. 


Q.E.D. contains two constants which must be determined by 
experiment. One is the electron mass, which we have just discussed. 
The second is the electric charge, or the dimensionless combination 
$a = \hbar c/e^{2} = 137.039$. For both of these the renormalization process
must be applied, so we have foregone computing a also.
It is interesting that all the “fundamental” particles which are
charged have the same charge, but we have no remarks to make
on that point.

The way that the charge becomes renormalized is via the
polarization of the vacuum. A virtual pair produced by a photon (also
virtual) annihilates again to re-create the photon. In first
approximation the correction to a from virtual electrons is $\Delta a = (2/3\pi)\ln(\Lambda/m)$.
That is to say, it is divergent, so we have this time to cut off the
electron propagator at some energy $\Lambda$ (this is in addition to, and
different from, the photon propagator cut-off discussed above, but
they may ultimately have the same origin). Virtual pairs of other
particles such as muons simply add their contribution to $\Delta a$ in
first order.
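The logarithm makes this correction grow extremely slowly with the cut-off. A minimal sketch of $\Delta a = (2/3\pi)\ln(\Lambda/m)$; the 1 Gev cut-off and the 0.511 Mev electron mass are hypothetical inputs chosen for illustration, and the function names are not from the text.

```python
import math


def delta_a(cutoff_over_m: float) -> float:
    """First-order correction to a = hbar c / e^2 from virtual electron pairs."""
    return 2 / (3 * math.pi) * math.log(cutoff_over_m)


def log_cutoff_needed(target_a: float = 137.039) -> float:
    """ln(Lambda/m) at which electron pairs alone would supply all of a."""
    return target_a * 3 * math.pi / 2


# Hypothetical cut-off of 1 Gev with an electron mass of 0.511 Mev (both assumed).
print(f"Delta a at 1 Gev         = {delta_a(1000 / 0.511):.2f}")
print(f"ln(Lambda/m) for a = 137 = {log_cutoff_needed():.0f}")
```

For any cut-off in the range of laboratory energies the electron-pair correction is of order unity, small beside 137, even though it diverges formally as the cut-off is removed.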


At first it may be argued that here, at least, the philosophy of 
renormalization is unassailable. The “free charge” must be
unobservable (although Gell-Mann has suggested that this may not
be so, but high energy interactions may determine it). Only the 
total corrected a can be measured. Unlike the case of mass where 
we have charged and uncharged particles, like neutron and proton, 
to compare, here we have nothing but the single measured a. 

This is true, but are you willing to give up, forever, the possibility 
of computing this remarkable constant a? If in some ultimate 
future theory a is to be computed, will we find ourselves correcting 
some value $a_{0}$ for virtual pairs ? One can hardly begin to speculate
from our present position, but the question of how a can be computed 
at all has always been intriguing, and I should like to make a few 
remarks about it. 


The quantum theory of electromagnetic interaction can be
formulated roughly as follows. Let $S_{0}$ be the S-matrix operator for the
system of particles not interacting with the electromagnetic field,
let $j_{\mu}$ be the current density operator of this matter omitting
the charge factor, and $A_{\mu}$ be the electromagnetic potential operator
times the charge. Then the S-matrix including the field is
something like (I am merely outlining here, the precise definitions are
assumed to be familiar)

$$S = S_{0}\; e^{\,i\int j_{\mu}A_{\mu}\,d\tau}\; e^{\,iL_{0}(A)} \qquad (9)$$

where

$$L_{0}(A) = \frac{a_{0}}{8\pi}\int F_{\mu\nu}F_{\mu\nu}\,d\tau \qquad (10)$$

$a_{0} = \hbar c/e_{0}^{2}$ is the unrenormalized a and $F_{\mu\nu} = \partial_{\mu}A_{\nu} - \partial_{\nu}A_{\mu}$.
Note that we are using somewhat unfamiliar notation since neither
$j_{\mu}$ nor $A_{\mu}$ require knowledge of the charge in their definition; it
appears only in $L_{0}$.


Now if we take expectation values between states of the vacuum
for matter (to get at the vacuum polarization and other effects of
virtual pairs) we shall have to calculate the expectation

$$\left\langle S_{0}\; e^{\,i\int j_{\mu}A_{\mu}\,d\tau}\right\rangle = e^{\,iL(A)} \qquad (11)$$

it being sufficient to consider $A_{\mu}$ as a c-number function, and L(A)
defined here is simply a functional of $A_{\mu}$. The completion of a
problem requires that we integrate

$$e^{\,i\,[L_{0}(A) + L(A)]} \qquad (12)$$

over all potential functions $A_{\mu}$ satisfying the boundary conditions
of the problems.


In a way of speaking, then, if we were not aware of the pairs
we would say that electrodynamics has the effective Lagrangian
$L_{0}(A) + L(A)$.


The L(A) defined in (11) can be expanded in powers of $A_{\mu}$ and
in powers of its derivatives assuming in some situation that A is
smoothly varying and small. A constant $A_{\mu}$ has no effect, assuming
$\partial_{\mu}j_{\mu} = 0$, the conservation of charge, or gauge invariance, so the
expansion begins :

$$L(A) = c\int F_{\mu\nu}F_{\mu\nu}\,d\tau + c'\int F_{\mu\nu}\,\partial^{2}F_{\mu\nu}\,d\tau + c''\int (F_{\mu\nu})^{4}\,d\tau + \ldots \qquad (13)$$

where, except for the first term, only the type of term is meant to
be indicated, two F’s and two derivatives in the second term, four
F’s in the third, etc. The c, c’, c” ... are constants. The first term
can be combined with $L_{0}$ in (12) to form $\frac{a_{1}}{8\pi}\int F_{\mu\nu}F_{\mu\nu}\,d\tau$ with
$a_{1} = a_{0} + 8\pi c$ and this is the origin of part of the charge
renormalization. For electron pairs $c = (1/12\pi^{2})\ln(\Lambda/m)$ (as we
have said), which is very small (although infinite !) compared to a.
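That this value of c is the same statement as the first-order correction $\Delta a = (2/3\pi)\ln(\Lambda/m)$ given earlier for virtual electrons can be checked in one line; the symbol $a_{1}$ for the corrected constant follows the combination $a_{0} + 8\pi c$ just described.

```latex
a_{1} - a_{0} = 8\pi c
  = 8\pi\cdot\frac{1}{12\pi^{2}}\,\ln\frac{\Lambda}{m}
  = \frac{2}{3\pi}\,\ln\frac{\Lambda}{m}
  = \Delta a
```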


The other terms in the effective Lagrangian $L_0 + L$ generate
modifications of Coulomb’s law, the scattering of light by light,
etc. Had these phenomena been discovered before Q.E.D., they
would have been representable in classical theory as just such
modifications of Maxwell’s Lagrangian, so that the complete
Lagrangian would be considered to be a complicated, non-linear
and partly unknown affair. With the advent of Q.E.D. explaining
the origin of the non-linearities, etc., the natural reaction would be
to try to explain the complete Lagrangian in this way. The fact that
$c$ is infinite, although discouragingly small, would be exciting. It is,
in fact, possible that $c$ is not so small; perhaps the cut-off is
controlled by gravitation, or, perhaps, some other modification of
the laws of matter replaces the logarithm with a less convergent
expression, or the cut-off may not be necessary if perturbation
theory could be avoided. Finally, every charged particle makes a
nearly equal contribution (although not to $c'$ or $c''$, which vary
inversely as the mass squared, or fourth power). So we must add
contributions from muons, nucleons and mesons. We do not know
how many particles there are nor how much each contributes and, although difficult,
it is not impossible that a complete future calculation would give
a value near 137 to $8\pi c$ so that no $\alpha_0^{-1}$ is required at all!
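
To get a feel for the numbers in this speculation, the following minimal sketch evaluates $8\pi c = (2/3\pi)\ln(\Lambda/m)$ for a single charged species and asks how large $\ln(\Lambda/m)$ would have to be for that one contribution to reach 137. The function names and the sample cut-off (about 1836 electron masses, roughly the nucleon mass) are illustrative assumptions and are not values taken from the report.

```python
import math

def eight_pi_c(log_cutoff_ratio):
    """8*pi*c for one species, using c = ln(Lambda/m) / (12 pi^2)."""
    c = log_cutoff_ratio / (12.0 * math.pi ** 2)
    return 8.0 * math.pi * c

def log_ratio_needed(target=137.0):
    """ln(Lambda/m) at which a single species alone gives 8*pi*c = target."""
    return target * 3.0 * math.pi / 2.0

# Assumed sample cut-off: Lambda = 1836 m (hypothetical choice for illustration).
sample = eight_pi_c(math.log(1836.0))
needed = log_ratio_needed()
print(f"8*pi*c at Lambda = 1836 m   : {sample:.3f}")
print(f"ln(Lambda/m) needed for 137 : {needed:.1f}")
```

With any cut-off of ordinary size the single-species contribution is of order one, which is why the text calls $c$ discouragingly small and looks to many species or to a modified cut-off dependence.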


If that is the case, quantum electrodynamics takes a very simple
form. In (9) $L_0(A)$ must be omitted. Then the functional integral on
$A_\mu$ which must be taken gives, since $\int \exp\!\big(i\int j_\mu A_\mu\, d\tau\big)\, \mathcal{D}A_\mu = \delta[j_\mu]$,
a functional delta function of $j_\mu$. It says therefore simply that all
amplitudes and expectations of $S_0$ must be taken subject to the condition
that the total current density is everywhere and always zero.
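
A finite-dimensional analogue may make the delta-function statement more concrete. For a single variable, the average of $\exp(ijA)$ over $-L < A < L$ is $\sin(jL)/(jL)$: it stays equal to 1 at $j = 0$ and shrinks toward zero for any fixed $j \neq 0$ as $L$ grows. The sketch below checks this numerically; the function names and sample values are illustrative assumptions, and the functional integral of the text is only loosely the continuum limit of many such variables.

```python
import cmath
import math

def averaged_phase(j, L, n=20000):
    """Midpoint-rule average of exp(i*j*A) over A in [-L, L] (real part)."""
    step = 2.0 * L / n
    total = 0.0 + 0.0j
    for k in range(n):
        A = -L + (k + 0.5) * step
        total += cmath.exp(1j * j * A)
    return (total * step / (2.0 * L)).real

def exact_average(j, L):
    """Closed form sin(jL)/(jL), equal to 1 at j = 0."""
    return 1.0 if j == 0.0 else math.sin(j * L) / (j * L)

for L in (1.0, 10.0, 100.0):
    print(L, averaged_phase(0.0, L), averaged_phase(0.5, L), exact_average(0.5, L))
```

As the range of $A$ widens, only $j = 0$ survives the averaging, which is the one-variable version of the condition that the total current vanish.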


What is electromagnetic interaction ? If a real charge exists it 
must generate its own four-current (charge density, if at rest). Since 
total current vanishes this must be compensated by an opposite 
current in the vacuum sea of charged particles. Because of the 
dynamics of these particles the compensation cannot occur just 
locally, but another counter current is generated nearby, compen- 
sated by vacuum current again, etc. until this effect is propagated 
out to infinity. The energy associated with these compensating 
currents is the electromagnetic self-energy of the charge. Another 
charge placed in the vicinity adds its system of compensating currents 
so there is an energy of interaction between these charges. At any 
rate something like that is implied by these unsupported speculations. 


Interaction of other particles. 


Although Q.E.D. is very accurate, there is of course one way in 
which it must be wrong; it is incomplete. There are not only
photons, electrons, and muons in the world, but charged baryons
and mesons as well. It will not do to say that Q.E.D. is exactly 
right as it stands in a limited situation where only electrons and 
muons are present, because virtual states of charged baryons must 
have an influence. Two sufficiently energetic photons, colliding, 
will not do just what Q.E.D. in that limited sense supposes; they 
also produce pions. Or, more subtly, a sufficiently accurate analysis 
of the energy levels of positronium would fail, for the vacuum 
polarization from mesons and nucleons would have been omitted. 


On the other hand, we use that theory as best we can to discuss 
the Coulomb potential from the nucleus, the nuclear emission of
γ-rays, the Bremsstrahlung expected from pions, the chance that
γ-rays will be found in the disintegration of the $K^+$, etc. How do
we do it?


We expect that the principles of Q.E.D. will extend to these 
particles too, and replace our incomplete knowledge of them by a 
set of constants (charge, magnetic moment, etc.) which will suffice 
for low energy analysis. 


Yet we use the tool of electrodynamics for much more than that.
We make some hypothesis about how the photons are “ultimately”
coupled to the new particles. We suppose, for example, that the
anomalous magnetic moment of the proton has its ultimate origin
not in an extra Pauli term $\sigma_{\mu\nu}F_{\mu\nu}$ in the “original Lagrangian”
but in the currents of virtual mesons which surround the nucleon.
That is, we assume that the coupling is in some sense ultimately
as simple as possible, and all apparent anomalous effects have their
origin in complexities of the strongly interacting particles themselves.
This hypothesis is universally used, permitting us to use electromagnetic
interaction to learn something about the strange particles.
Yet it has never been formulated in a completely precise manner.
Its importance was first emphasized by Gell-Mann who called it
the principle of minimal electromagnetic interaction, but following
a suggestion of Telegdi, I shall call it Amperes-hypothesis (the
assumption that all magnetism comes from currents).


There is one exception already known to this principle; photons 
interact with the gravitational field without a charged intermediary. 
But this interaction can be viewed the other way about, that gravity
interacts with all energy, photon energy included. We shall therefore
disregard this counter-example for the present.


The simplest suggestion for defining Amperes-hypothesis would
be to say that in the “fundamental Lagrangian” (the exact form
of which is, at present, unknown) all gradient operators $\partial_\mu$ on charged
fields are to be replaced by $\partial_\mu - A_\mu$ and no other coupling to $A_\mu$
is to be assumed. There are two objections to this formulation.
First, we do not know the form of the future theory; no Lagrangian
may exist. Is there not some formulation closer to the observed
properties of the particles? The second objection is possibly
academic; if the Lagrangian were before us, we would probably
know exactly what to do. Yet a term like $\sigma_{\mu\nu}\partial_\mu\partial_\nu$, which is evidently
zero, could be added, but when $\partial_\mu$ is changed to $\partial_\mu - A_\mu$ it is no
longer zero but is a Pauli term $\sigma_{\mu\nu}F_{\mu\nu}$ instead. Such an ambiguity
would arise, say, if the Lagrangian contained second derivatives,
and it was not clear whether to write $\partial_\mu\partial_\nu$ or $\partial_\nu\partial_\mu$ at some point.


One of the effects of terms like a Pauli moment is that certain 
processes in Q.E.D. become uncalculable. For example, if the muon 
carried a true Pauli moment the hyperfine split of muonium could 
not be computed, as the term “X” in (6) is divergent in that case. In
other words, the theory would not be renormalizable. Perhaps 
then Amperes-hypothesis is equivalent to the assumption that the 
theory is renormalizable. On the other hand, the proton-neutron 
mass difference may not be one of those computable quantities. 
We must understand renormalizability, then, as the hypothesis that 
only a finite number of unknown quantities must be attached, before 
everything else can be computed. 


Another effect of a term like a Pauli moment is to drastically alter 
the behavior of cross-sections at high energy. From the dispersion 
point of view, discussed below, constants such as the charge, 
anomalous moments, etc. appear in the form of subtraction constants
required to ensure the convergence of the dispersion integrals in- 
volved. More constants are needed if the high energy cross-sections 
remain large, or increase, with rising energy. Thus Amperes- 
hypothesis in this viewpoint would take the form of a statement 
that a certain minimum number of subtraction constants are required. 


What amounts to the same thing, but is more readily accessible to
experiment, is to try to replace Amperes-hypothesis by a statement
that the size of high energy photon cross-sections is limited in
some specific way.


As a last remark, if the current $j_\mu$ comes from the Lagrangian
so easily, might the structure of this current not tell us something 
about that Lagrangian (or whatever more fundamentally replaces 
it) ? This possibility is discussed at this conference by Gell-Mann. 


Dispersion theory 31).


In the lowest order in which any process occurs, there are no 
integrals over virtual states (closed loops), and no divergences will 
arise. The Q.E.D. difficulties arise on integrating over invisible 
virtual processes. This generates a feeling that such virtual state 
integrations are unreal, or at least are not handled quite correctly, 
and that all of the formulas should be put in terms of directly 
measurable quantities. This can be done because of the analytic 
character of the functions involved. Their real parts can be expressed 
in terms of their imaginary parts. The imaginary parts can be 
expressed as the rate for real processes of lower order. Thus a 
diagram involving one virtual momentum integral can be expressed 
as a dispersion integral over a function determined from processes 
without a virtual integral at all. Put in this way, it sounds trivial, 
one integral is replaced by another, almost identical, and in fact 
from a practical calculational point of view there is often little to 
gain. But it is hoped that this viewpoint gives a clearer insight into 
the virtual state integrals and a closer relation to experiment. (The 
greatest utility of this method results, of course, in the analysis of 
strong coupling where the fundamental equations are unknown, 
for here one experimental result can be related to another.) 


We can illustrate by the simplest example, the second order
vacuum polarization effect of electrons. The amplitude that a virtual
photon of four-momentum squared $q^2$ makes a pair and annihilates again is
written $q^2 f(q^2)$, so that the entire dependence of $L(A)$ on $A$ expanded
to second order is $\int f(q^2)\, F_{\mu\nu}(q)\, F_{\mu\nu}(q)\, d^4q$ written in momentum
space.


Now the imaginary part of $q^2 f$ is the rate that a (virtual) photon
makes real pairs. It only exists, writing $q^2 = 4m^2x$, for $x > 1$.
Choose the time axis in the direction of $q$ and the photon polarization
in the $z$ direction, and we require the probability that a vector
potential of frequency $2m\sqrt{x}$, constant in space, produces a pair
of electrons each of energy $E = m\sqrt{x}$, of momentum $p = m\sqrt{x-1}$.
This is one of the simplest elementary problems in Q.E.D. The
amplitude for this is $(\bar u_2\gamma_z u_1)$; squared and summed on polarizations
it is $-\,\mathrm{sp}[(\not p_2 + m)\gamma_z(\not p_1 + m)\gamma_z] = -(p_2\cdot p_1 - m^2) - 2p_{2z}p_{1z}$
or, averaging over directions, $E^2 + \tfrac{1}{3}p^2 + m^2 = \tfrac{2m^2}{3}(1 + 2x)$.
The phase space factor is $pE/(2E)^2$ so, dividing by $q^2$, we find for
the imaginary part of $f$ (times $(4\pi\alpha)^{-1}$)


$$f_I = \frac{1+2x}{3x}\left(\frac{x-1}{x}\right)^{1/2}.$$
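
The kinematic step above can be checked with a few lines of arithmetic. The sketch below, in units where the electron mass is 1 by default, evaluates the direction average $E^2 + p^2/3 + m^2$ for $E = m\sqrt{x}$, $p = m\sqrt{x-1}$, and compares it with $(2m^2/3)(1+2x)$. It also tabulates the shape $(1+2x)/(3x)\,[(x-1)/x]^{1/2}$; the function names are illustrative, and the overall normalization of $f_I$ is an assumption read from the text rather than derived here.

```python
import math

def averaged_trace(x, m=1.0):
    """Direction average of E^2 + p^2 + m^2 - 2 p_z^2 when q^2 = 4 m^2 x."""
    energy_sq = m * m * x            # E = m sqrt(x)
    momentum_sq = m * m * (x - 1.0)  # p = m sqrt(x - 1)
    # Averaging over directions replaces p_z^2 by p^2 / 3.
    return energy_sq + momentum_sq / 3.0 + m * m

def imaginary_part(x):
    """Assumed shape (1 + 2x)/(3x) * sqrt((x - 1)/x); zero below threshold."""
    if x <= 1.0:
        return 0.0
    return (1.0 + 2.0 * x) / (3.0 * x) * math.sqrt((x - 1.0) / x)

for x in (1.0, 1.5, 2.5, 10.0, 1000.0):
    closed = (2.0 / 3.0) * (1.0 + 2.0 * x)
    print(x, averaged_trace(x), closed, imaginary_part(x))
```

The last column vanishes at the threshold $x = 1$ and tends to a constant at large $x$, which is the reason the unsubtracted dispersion integral diverges logarithmically.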

The real part is now given by a dispersion integral from Cauchy’s
theorem as


$$f_R(q^2) = \frac{1}{\pi}\int \frac{f_I(q_1^2)\, dq_1^2}{q_1^2 - q^2}$$
or


$$f_R(x) = \frac{1}{\pi}\int_1^\infty \frac{1+2y}{3y}\left(\frac{y-1}{y}\right)^{1/2}\frac{dy}{y-x} \qquad (14)$$
The integral, however, is divergent (so the Cauchy theorem is not
strictly true; there is a contribution from the contour at infinity).
This can be handled in the following way. We can assume $f_R$ for
some $x$ is known experimentally or by definition. In this case $f_R(0)$
is the renormalization of the charge, so in the spirit of renormalization
we can take it to be zero. Then we can use the Cauchy relation
for the more convergent expression $[f(x) - f(0)]/x$. What it leads
to is the same as if we subtract from (14) the same equation with
$x = 0$:

$$f_R(x) = \frac{x}{\pi}\int_1^\infty \frac{1+2y}{3y}\left(\frac{y-1}{y}\right)^{1/2}\frac{dy}{y\,(y-x)} \qquad (15)$$
an integral which is now convergent and whose integrated value
$(2/3\pi x)\,[(2x + 1)(1 - \theta\,\mathrm{ctn}\,\theta) - x/3]$, where $\sin^2\theta = x$, is exactly
the finite part of the vacuum polarization effect obtained by the
usual integral over a closed loop 30).
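
The agreement between the subtracted dispersion integral and the closed form can be checked numerically. The sketch below is a hedged illustration, not part of the report: it assumes the imaginary part has the shape $(1+2y)/(3y)\,[(y-1)/y]^{1/2}$ and leaves out overall constant factors such as $1/\pi$ on both sides, so it compares $x\int_1^\infty f_I(y)\,dy/[y(y-x)]$ with $(2/3x)[(2x+1)(1-\theta\,\mathrm{ctn}\,\theta) - x/3]$ for $0 < x < 1$. The substitution $y = 1/(1-t^2)$ and the use of Simpson's rule are choices made here for convenience.

```python
import math

def subtracted_integral(x, n=2000):
    """x * Int_1^inf f_I(y) dy / (y (y - x)) for 0 < x < 1.

    With y = 1/(1 - t^2) the integral becomes
    (x/3) * Int_0^1 2 t^2 (3 - t^2) / (1 - x + x t^2) dt,
    a smooth integrand evaluated by Simpson's rule (n must be even).
    """
    def g(t):
        return 2.0 * t * t * (3.0 - t * t) / (1.0 - x + x * t * t)

    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    return (x / 3.0) * total * h / 3.0

def closed_form(x):
    """(2/(3x)) [(2x + 1)(1 - theta ctn theta) - x/3] with sin^2(theta) = x."""
    theta = math.asin(math.sqrt(x))
    return (2.0 / (3.0 * x)) * ((2.0 * x + 1.0) * (1.0 - theta / math.tan(theta)) - x / 3.0)

for x in (0.1, 0.5, 0.9):
    print(x, subtracted_integral(x), closed_form(x))
```

Both columns should agree to the accuracy of the quadrature, and both go to zero linearly as $x \to 0$, as the subtraction at $x = 0$ requires.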


It is clear that Q.E.D. in its renormalized form may have its
simplest expression in this mathematical scheme. We need merely
say that $f_R(0)$ vanishes, for we wish to work with the constants
already renormalized. Several authors have discussed Q.E.D. from
this point of view 32).


If the integral (15) were still divergent we could perform another
subtraction, but generate another uncalculable constant (analogous,
in a different problem, to a Pauli anomalous moment). Amperes-
hypothesis is that this is not necessary.


It must be admitted, however, that no fundamental change in
position on the renormalization question is really involved. If the
integral (14) did converge we certainly would compute it, and call
its value at $x = 0$ the change $\Delta(\alpha^{-1})$ in $\alpha^{-1}$ induced by pairs. In the
usual loop integral method the integral is completed by subtracting
the effect that the pairs would have if the mass of the electron were
changed from $m$ to $\Lambda$. The same method here makes (14) convergent
and gives the same value for $f_R(0) = \Delta(\alpha^{-1}) = (2/3\pi)\ln(\Lambda/m)$.


It might be hoped, therefore, that such dispersion relations in- 
volving the entire system of strange particles and Q.E.D. may be a 
satisfactory way of representing nature. It has much to recommend 
it; its close relation to experiment, the possibility of interdetermining 
coupling constants, the avoidance of the possibly meaningless 
question of which particles are fundamental and which compound, 
etc. These points have been emphasized by Chew in a remarkable 
speech at the conference in La Jolla, California this year. One 
serious question appears, however. Integrals over all energies are 
still required and at high energies the real processes involve all 
kinds of particles in considerable numbers. Thus the set of inter- 
connected equations becomes enormously elaborate just when it 
becomes interesting. It is not clear how to get started grappling 
with this complexity. 


On the other hand, the experimentally observed extreme energy
phenomena suggest that they may have certain regularities. If this
is so, a central theoretical problem is to formulate these regularities 33).
Only then may it be possible to close in an intelligent
way the wide-open hierarchy of dispersion relations.


It is in the spirit that all quantities should be reexpressed in
terms of others that are, in principle, observable that the formula (8) for
the Lamb shift self-energy was developed. Dispersion theory will
probably also permit its being simplified still further. The real
part of the forward photon scattering cross-section f,Y can, of
course, be immediately expressed as an integral on the imaginary
part, and a similar reduction can likely be made for the other
terms. If this can be done the calculation of the Lamb shift for H
(to first order in $\alpha$, all orders in $Z\alpha$) will only involve true rates of
real photon absorption or of pair annihilation. Thus, for each
state m a definite K value is associated (so $\omega_K = E_n - E_j$), so that
the computation may be possible on machines. In fact, many of
the matrix elements, useful in calculating internal conversion coefficients,
etc., have been already calculated.


The formula (8) is not valid directly for calculating such things 
as the proton-neutron mass difference, because other virtual fields, 
like the meson field, must be included. The necessary generaliz- 
ations of (8) can be written, but we shall have to see how useful 
they are and if the necessary experimental quantities are available. 


Conclusion. 


In writing this report on the present state of quantum electro- 
dynamics, I have been converted from a long-held strong prejudice 
that it must fail significantly (other than by simply being incomplete) 
at around 1 Gev virtual energy. The origin of this feeling was the 
belief that the mass of the electron (relative to the nucleon, say) 
and its charge, must be ultimately computable and that Q.E.D. 
must play some part in this future analysis. I still hold this belief, 
and do not subscribe to the philosophy of renormalization. But I 
now realize that there is much to be said for considering theoretically 
the possibility that Q.E.D. is exact, although incomplete. This 
assumption may be wrong, but it is precise and definite, and suggests 
many things to study theoretically, while the other, negative assumption
(that it fails somehow) is not enough to suggest definite
theoretical research. This is Wheeler’s principle of “radical
conservatism”.


Things are, of course, quite the other way for experimental
research. One should look very hard for an “expected” failure.
I have probably been converted from my prejudice, that it must
fail, just in time to be caught off base by an experiment next month
showing that indeed it does.


Discussion of the reports of Heitler and Feynman


W. Heisenberg. — From Heitler’s lecture we have learned that
Lorentz-invariance and local interaction seem not to be compatible
in the conventional quantum theory. On the other hand Feynman
explained to us that quantum electrodynamics gives very accurate
results on a very wide range of phenomena. These facts suggest,
as has often been discussed, that in Qu. E. Dyn. we might meet a
situation very similar to the Lee model with local interaction. Let
me specify this assumption somewhat further: it would mean that
by the process of infinite renormalization we have unconsciously
introduced “ghost states” of very high energy, i.e. an indefinite
metric in Hilbert space. If this were true it would easily explain why
for all low energy phenomena Qu. E. Dyn. gives excellent results
and is a perfectly “closed theory”. At higher energies in realistic
physics some modifications of Qu. E. Dyn. will occur since there
will be the possibility of creating pairs of nucleons, π-mesons, etc.
It may in fact be that the ghost states could be identified to some
extent with the baryons, since the norm of baryon states may have
opposite sign to the norm of the electron states. (That would not
in itself interfere with the unitarity of the S-matrix, since we have
baryon and lepton conservation.) At the same time the indefinite
metric would explain why it has not been possible to formulate
Qu. E. Dyn. without these divergences and limiting processes.
For if, as in the Lee model, the renormalized operators commute
or anti-commute everywhere for a given time, then these
operators at a given time are not sufficient to define the complete
Hilbert space and we would need the operators in some arbitrarily
small but finite time interval in order to define it. This would be
equivalent to an infinite renormalization. Quite generally, and
independently of Qu. E. Dyn., it seems natural to assume that a
local interaction will have the tendency to eliminate the δ-functions
on the light cone in the commutators and replace them by a minor
singularity, which would be equivalent to an indefinite metric in
Hilbert space. Certainly we have no general proof that we can in
such a formalism always avoid the well known difficulties with the
probability interpretation. On the other hand several cases are
known in which an indefinite metric in Hilbert space is compatible
with a unitary S-matrix. Therefore we should investigate this possibility
for reconciling local interaction and Lorentz-invariance,
unless some better solution can be suggested.


P.A.M. Dirac. — I would like to give briefly my point of view 
with regard to field theory. The foundation of atomic physics is 
the superposition principle, which says that states are of such a 
character that they can be added together to give other quantities 
of the same nature. The states must be pictured as embedded in 
space-time; so that if one is given a state, one can apply to it various 
operations of rotation and translation to get other states. These 
operations form a group, the inhomogeneous Lorentz group. It 
follows that the states provide a representation of the inhomogeneous 
Lorentz group. The problem of setting up a quantum theory thus 
becomes the problem of finding a certain representation of the 
inhomogeneous Lorentz group. 

One could attack the problem by looking for all the represent- 
ations of the inhomogeneous Lorentz group. This method was 
followed by Wigner in 1939. He expressed the representations in 
terms of the irreducible representations. The irreducible represent- 
ations correspond to particles by themselves. Particles in interaction 
also correspond to representations, but reducible ones. All the 
work that has been done on quantum field theory may be looked 
upon as attempts to set up a suitable representation of the inhomo- 
geneous Lorentz group corresponding to physical reality. The 
attempts fail because of the infinities and produce nothing of 
mathematical significance. 

I think it would be worth while to work on the problem from a 
more general mathematical basis, in which one does not necessarily 
build up the representation in terms of field quantities suggested 
by existing physical theories. The task of primary importance is to 
get the mathematical relations right. One can then afterwards look 
for the physical interpretation of the various quantities that enter 
into the mathematical scheme. 


W. Heitler. — I would have no objection to the use of indefinite
metric provided it can be done without inconsistencies. When
Heisenberg suggests that quantum electrodynamics is a “closed
theory”, this surely can apply to electrons only and implies that
the self mass remains infinite. This would not permit a treatment
of the electrodynamics of bosons and would not permit a calculation
of the mass differences.


W. Heisenberg. — For π-mesons and nucleons it is more reasonable
to start not from Qu. E. Dyn., but from an entirely different
scheme.


A.S. Wightman. — I should like to comment on Heitler’s non- 
invariant theory. Such theories are of interest from a point of view 
quite different from that which Heitler considered. They can be 
studied for the light which they may throw on the theory without 
cut-off. For this purpose one must examine the dependence on the 
cut-off of the various quantities occurring in the theory. Normally, one
considers two cut-offs in this connection, an ultraviolet cut-off
and a box. In Heitler’s theory there is no box and the theory appears 
Euclidean invariant. This gives rise to phenomena which, I believe, 
could strongly affect Heitler’s conclusions, whatever the purpose 
for which the theory is studied. What I have in mind is the Haag 
theorem which says that in a Euclidean invariant theory in which 
there are canonical variables, the no particle state is necessarily 
Euclidean invariant. Now the only Euclidean invariant state which 
admits a reasonable physical interpretation is the physical vacuum. 
Since in any theory where there is non-trivial pair creation (as in 
Heitler’s) the no particle state is not stationary, there is no reason- 
able vacuum state. The only way out of this difficulty is to use one 
of the so-called strange representations of the commutation relations, 
but in that case the evaluation of the physical quantities of the 
theory will certainly require some better technique than perturbation 
theory. 


G. Kallén. — Perhaps I may be allowed to formulate what
Wightman has just said in a slightly different way. In any ordinary
field theory you have the interacting field $A(x)$ and the asymptotic
incoming field $A^{\mathrm{in}}(x)$. Further you have a Hamiltonian


$$H(A) = H_0(A) + H_{\mathrm{int}}(A).$$
Especially in the old times one often introduced states, mathematical
states, as eigenstates of $H_0(A)$


$$H_0(A)\,|n, \mathrm{math}\rangle = E_n\,|n, \mathrm{math}\rangle.$$


In contradistinction to this we have the physical states which are
eigenstates of the total Hamiltonian or, which is practically the
same thing, of $H_0(A^{\mathrm{in}}) \neq H_0(A)$


$$H(A)\,|n, \mathrm{phys.}\rangle = E_n\,|n, \mathrm{phys.}\rangle.$$
Many times one tries to write expansions of the form


$$|n, \mathrm{phys.}\rangle = \sum_{n'} C_{n,n'}\,|n', \mathrm{math.}\rangle.$$

The so-called “Haag theorem” says that the transformation $C_{n,n'}$
is, indeed, very singular from the mathematical point of view. I agree
that that is certainly so also for the theory Heitler discussed yesterday.
However, I also believe that this fact is not really very serious.
If one computes more physical quantities like scattering amplitudes
or even self-masses, they can very well exist even if $C_{n,n'}$ is singular.
Therefore, I believe that this particular argument against the Heitler
model is not very relevant.


A.S. Wightman. — The problem stated by Professor Dirac can 
be regarded as half-solved. I believe that we know up to unitary 
equivalence the reducible unitary representation of the inhomo- 
geneous Lorentz group which belongs to a physical theory with 
given stable particles. It may be displayed as the representation of 
the free field theory of particles of the same masses. 

To go further one must specify the physical observables of the
theory. For the theory of a scalar field, for example, one has


$$U(a, \Lambda)\,\Phi(x)\,U(a, \Lambda)^{-1} = \Phi(\Lambda x + a),$$

$$[\Phi(x), \Phi(y)] = 0, \qquad (x - y)\ \text{spacelike}.$$
Here $U(a, \Lambda)$ is the unitary representation of the Lorentz group.
I believe that the problem of finding the $\Phi$ with these properties is
a well posed one mathematically and the solutions would give a
natural expression for the basic ideas of field theory put forward
thirty some years ago by Dirac, Heisenberg and Pauli. Unfortunately,
it is as yet unsolved. Until we understand its solutions or lack of
them I feel there will be no physical paradox in the foundations of
field theory, just a muddle.


L. Van Hove. — I would like to put a question to Heisenberg
concerning his remarks at the opening of this discussion. If, as you 
suggest, the baryons regularize the leptons and vice versa, don’t 
you expect that the vacuum polarization effects of the baryons which 
become important in high energy electrodynamics may come out 
to be different, for example in sign, from what conventional theory 
predicts ? 


W. Heisenberg. — If the baryons have opposite norm to that 
of the leptons, this should certainly have some influence on vacuum 
polarization effects of the baryons. Whether it would change the 
sign of these effects could probably be answered only by a careful 
investigation; at least I don’t know the answer. 


G. Chew. — If none of the strongly interacting particles is 
elementary, there should be no divergences in calculating electro- 
magnetic mass splittings of isotopic multiplets, even with existing 
rules of electrodynamics. (Think, for example, of the Coulomb 
splittings of $\mathrm{He}^3$ and $\mathrm{H}^3$.) The principle that none of the strongly
interacting particles is elementary is a feature of Heisenberg’s theory 
and can be incorporated into the S-matrix theory — even though 
the notion is awkward in conventional (Lagrangian) field theory. 
One may hope therefore, not to need cut-offs when such mass 
calculations are finally carried out. 


W. Heisenberg. — With regard to Wightman’s remark, I would
like to emphasize that in my opinion the well-known difficulties of
divergences, etc. are not primarily mathematical problems. Quantum
field theory is in two respects essentially different in its physical
content from quantum mechanics:

(1) In field theory the interactions are local while in quantum
mechanics they are non local.

(2) In field theory we have three boundary conditions while in
quantum mechanics we have only two (at infinitely small and infinitely
large distances of particles). The third boundary condition
in field theory refers to an infinite number of particles.


It is a problem of physics and not only of mathematics to see 
how such a profound change as the replacement of non local inter- 
actions by local ones will affect the mathematical representation. 
We cannot expect that such a change could be represented without 
radical changes in the formalism of quantum theory. 


With respect to the problem of the third boundary we should 
try to get some information from the experiments on multiple 
production of particles. 


Concerning the views expressed by Chew, I think I can agree in 
principle with most of his points. It perhaps should be possible 
in principle to construct the S-matrix simply by considering the 
group structure of the system of elementary particles and adding 
the postulate of unitarity and analyticity (except at points represent- 
ing physical states) in order to represent causality. All this could 
probably be done without the use of an indefinite metric. On the 
other hand I cannot see how from a practical point of view one 
could deal with the enormous complexity of the analytic behaviour 
of S-matrix elements, without deriving them from some kind of 
“local interaction”. One should also keep in mind that by the
postulate of analyticity one already goes away from the energy shell
into the more “local” regions; and discussing these regions without
indefinite metric may be just as complicated as, for instance, a
discussion of the fundamental laws of algebra without introducing
$\sqrt{-1}$. But aside from these practical points I would approve of
the views expressed by Chew. 


L. Van Hove. — At center of mass energies of the order of 1 Gev
and higher the vacuum polarization effects of strongly interacting
particles become a more important part of radiative corrections
than at low energies. Could Feynman comment on the implications
of this fact if quantum electrodynamics were “exact, although
incomplete”?


R.P. Feynman. — I do not want to give a precise answer to this
question. It might be possible to write electrodynamics with e and μ,
which would be complete but incorrect.


R. Peierls. — Even if there exists a modified form of electro-
dynamics in which there are no infinities it may not be easy to dis- 
cover this from the study of electrodynamics by itself, because the 
modifications are likely to relate to extremely high energies (small 
distances) and their discussion becomes very academic. The similar 
difficulties in the theory of strong interactions are very substantial 
in the region of practical interest. It therefore seems likely that 
we may first discover the remedy (if it exists) in the strong inter- 
actions, and then recognize how to remove the troubles of electro- 
dynamics by similar means. 


A. Salam. — I would like to come back to Heitler’s formula 
connecting energy and momentum for an electron and the violation 
of Lorentz invariance. If so perhaps we could have a discussion of 
the experimental situation. 


W. Heitler. — Concerning the variation of mass with velocity:
the formula given in my report for the self mass $\delta m(p)$ refers to free
electrons only (not to bound electrons). The coefficients depend
very much on the choice of the form factor and it may even be that
a form factor exists for which $\delta m$ is independent of $p$. Personally,
of course, I do not believe that any departure of this sort from invariance
exists; this was merely meant to show what kind of results
arise when one insists on the finiteness of the theory, and to suggest
that such a fundamental relation as Einstein’s mass-velocity relation
should be checked as accurately as possible by experiments.


R.P. Feynman. — If the relativity formula is wrong, there are
two masses that can be defined: the rest mass or self energy, and
the coefficient of $v^2/2$ in the energy for small velocities or the
“kinetic mass”. If gravity acts on energy, or, therefore, rest mass,
and “kinetic mass” represents inertia then we know from the
experiment of Eötvös that they differ by less than one part in $10^{-8}$.
Therefore the electromagnetic part of the proton mass, being of
order $10^{-3}$, must have the right coefficient of $v^2$ to order $10^{-5}$.
I pointed out yesterday that if the electron energy had a velocity
dependence of the form $m_0(1 + \tfrac{1}{2} v^2/c^2 + a v^4/c^4)$ the Lamb experiment
shows that $a = 3/8$ to one part in $10^{-6}$ so that if a fractional
part of the mass of order 0.1% is electromagnetic and varies in a
new way with velocity, then it must have the right coefficient of $v^4$
correct to one part in $10^{-3}$.
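
The figures quoted in this remark can be checked against the standard relativistic energy. The expansion below is a worked sketch, not part of the original discussion; the symbols $f$ (the electromagnetic fraction of the mass) and $a'$ (its hypothetical $v^4$ coefficient) are introduced here for illustration only.

```latex
% Relativistic energy in units of c^2, expanded for small v/c:
\frac{E}{c^2} = \frac{m_0}{\sqrt{1 - v^2/c^2}}
  = m_0 \left( 1 + \frac{1}{2}\frac{v^2}{c^2}
      + \frac{3}{8}\frac{v^4}{c^4} + \cdots \right),
\qquad\text{so relativity predicts } a = \frac{3}{8}.

% If a fraction f of the mass had a different coefficient a', the measured
% coefficient would be
a_{\text{measured}} = (1 - f)\,\frac{3}{8} + f\,a'
\quad\Longrightarrow\quad
a_{\text{measured}} - \frac{3}{8} = f \left( a' - \frac{3}{8} \right).

% The same dilution applies to the v^2 coefficient: an overall agreement of
% 10^{-8} with an electromagnetic fraction f \sim 10^{-3} constrains that
% fraction's own coefficient to 10^{-8} / 10^{-3} = 10^{-5}.
```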


L. Van Hove. — Present mass determinations of the particles at
ultrarelativistic energies are, I think, of an order of accuracy of a
few percent. Is this sufficient for the question you raised?


W. Heitler. — I think that an accuracy of 1% would not be 
good enough. 


H.A. Bethe. — I want to draw your attention to two experiments, 
the first of which may be legitimate, in Heitler’s sense, and the 
second may not be. The first experiment compares directly the rest 
energy of an electron to the kinetic mass of the electron; the kinetic 
mass is known with absolute accuracy; the rest energy can be 
measured by determining the energy necessary to produce an electron 
and a positron from radioactivity; this is known in some cases to 
a few parts in ten thousand and therefore the equality of rest mass 
and kinetic mass is established to a few parts in ten thousand. An 
even more direct and accurate way to measure the rest energy is 
by means of the wave length of annihilation radiation which has 
been measured by DuMond at Cal. Tech., also to an accuracy of
a few parts in 10,000. The second experiment, I want to mention, 
refers to high energy; one can measure the total energy of the 
particle and measure the difference between the velocity of the 
particle and the velocity of light. This difference for synchrotrons 
giving electrons of 1 Gev is something like one part in $10^7$, and
one can measure this difference very accurately by measuring the 
total intensity of the light emitted in synchrotron radiation. One 
can measure this difference to something like 1% by measuring the 
intensity of the light; this has been done at 300 Mev and there is 
a project for 1 Gev but I am sure it would give the right result. This 
would give the velocity of the particle to one part in $10^9$. Thus it
is established with phenomenal accuracy that the velocity of an 
electron actually approaches that of light. The rest mass of a fast 
electron is measured by the difference $1 - \beta$ and is therefore known,
for 300 Mev electrons, to an accuracy of about 1%. It is of course 
equal to the familiar kinetic mass. All the experiments I mentioned 
refer to free electrons. 
III. Path Integrals and Operator Calculus: QED and Other Applications


The Feynman path integral approach to quantum mechanics stands presently on a par, both 
esthetically and practically, with the original formulations of 1925-26.¹ The path integral
formulation of the quantum gauge theories which lie at the heart of the Standard Model 
of elementary particle interactions turned out to be critical in the Veltman-’t Hooft proof 
that these theories are renormalizable. However, it has an even wider range of applications 
than to quantum field theories. Feynman’s book of 1965 with Albert R. Hibbs [64] uses path 
integrals to treat problems other than quantum mechanics and quantum electrodynamics, 
including statistical mechanics, the variational principle, the polaron problem, Brownian 
motion, and noise. Other applications have been made to quantum liquids and solids, to 
macromolecules and polymers, and to problems of propagation in dissipative media. The 
approach is important to various forms of semiclassical approximations in chemical, atomic, 
and nuclear problems and basic to the instanton problem (barrier penetration between differ- 
ent vacuum ground states). It can be extended to optics and even to the motion of particles 
in the strong gravitational fields near a black hole. 

As Feynman related in his Nobel Lecture [73], he developed the path integral formulation 
of quantum mechanics in order to quantize the action-at-a-distance theory of classical elec- 
trodynamics and thus avoid problems arising in field theory from the self-interaction of the 
electron. The methods of quantization associated with the names of Heisenberg, Schrödinger,
and Dirac all begin with the Hamiltonian function as the generator of the evolution of the 
system with time, while the classical action-at-a-distance theory used a classical action prin- 
ciple based on the Lagrangian. This involved the interaction of two currents, each a function 
of an independent space-time variable. In [73] Feynman described how he discovered (with 
the help of Herbert Jehle) an infinitesimal time development operator of Dirac that involved 
the classical Lagrangian. Successive applications of this operator to the initial wave function
generated the wave function at any later time, and were equivalent to finding the solution of
Schrödinger’s equation. To obtain the wave function after a finite elapsed time, however, one
had to integrate over all possible paths connecting two arbitrary space-time points. This is 
the path integral approach of Feynman. 
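
The step from the infinitesimal operator to the finite-time sum over paths can be sketched as follows. This is a minimal outline for a single particle with Lagrangian $L = \tfrac{1}{2} m \dot{x}^2 - V(x)$; the symbols $\epsilon$ (time step), $x_k$ (position at the k-th time) and $A$ (normalization factor) are notation assumed here for illustration, not taken from the commentary above.

```latex
% One short time step: the wave function is propagated by the exponential
% of (i/\hbar) times the Lagrangian evaluated on the step.
\psi(x_{k+1}, t + \epsilon)
  = \frac{1}{A} \int \exp\!\left[ \frac{i \epsilon}{\hbar}
      L\!\left( \frac{x_{k+1} - x_k}{\epsilon},\, x_{k+1} \right) \right]
    \psi(x_k, t)\, dx_k ,
\qquad A = \left( \frac{2 \pi i \hbar \epsilon}{m} \right)^{1/2} .

% Repeating the step N times (N \epsilon = T fixed) gives an integral over
% all intermediate positions, i.e. over all paths from (x_0, 0) to (x_N, T).
\psi(x_N, T)
  = \lim_{\epsilon \to 0} \int \cdots \int
    \exp\!\left[ \frac{i}{\hbar} \sum_{k=0}^{N-1} \epsilon\,
      L\!\left( \frac{x_{k+1} - x_k}{\epsilon},\, x_{k+1} \right) \right]
    \psi(x_0, 0)\, \frac{dx_0}{A} \frac{dx_1}{A} \cdots \frac{dx_{N-1}}{A} .
```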

Although it is possible to begin with the Hamiltonian and arrive at a “manifestly covari- 
ant” relativistic QED, as shown by Julian Schwinger and Sin-itiro Tomonaga, with whom 
Feynman shared the Nobel Prize, the overall space-time point of view of Feynman lends 
itself to a relativistic formulation in a more natural way because the action, i.e. the time 
integral of the classical Lagrangian, is a relativistic invariant, while the Hamiltonian function 
is not. 

Before leaving for Los Alamos in the spring of 1942, Feynman wrote his doctoral disser- 
tation on the path integral method. The essential parts of his thesis were published in 1948 
in paper [7]. In it, the electrons are treated nonrelativistically and the electromagnetic field 
is described by its Fourier transform, i.e. by the so-called field oscillators. Fermi’s version of 
QED represented the electromagnetic field this way, and then quantized the classical oscilla- 
tors. Feynman, instead, eliminated the field oscillators, integrating out their coordinates and 


1“One cannot fail to observe that Feynman’s principle in particular — and this is no hyperbole — expresses 
the laws of quantum mechanics in an exemplary neat and elegant manner, notwithstanding the fact that it 
employs somewhat unconventional mathematics.” W. Yourgrau and S. Mandelstam, Variational Principles
in Dynamics and Quantum Theory (Philadelphia 1968), p. 128. 
leaving only the “source current” and the “test current.” (We use the quotation marks
because the two currents are interchangeable.) All the oscillators were eliminated. In previous
treatments the electromagnetic longitudinal and timelike degrees of freedom were combined 
to yield the Coulomb interaction, while the transverse degrees of freedom were retained as 
field oscillators. 

In the following year, 1949, Feynman published papers [12] and [13], which are included 
in Part II.C, using the rules that he had derived for QED by applying the path integral 
method. In 1950, he wrote paper [14], which establishes the validity of the rules, i.e. the 
Feynman diagram methods. This paper starts with Fermi’s formulation of the field as a 
set of oscillators and completely eliminates them, as Feynman had done already in paper 
[7]. Here, however, the charges are treated in a completely relativistic manner, using either 
the Dirac electron-positron field or, for spinless particles, the relativistic Klein-Gordon (or
Pauli-Weisskopf) field. All virtual photons are eliminated; and Feynman shows how real
photons can be either introduced ab initio or derived from the general formulae for virtual 
processes. 

In 1951, Feynman completed his work on QED in paper [15], describing and applying 
a new operator calculus for operators dependent on a continuous parameter (such as the 
time), which describes the ordering of application of the generally noncommuting operators. 
Using the ordering parameter, the ordinary methods of calculus (integration and power series 
expansions, for example) can be carried through without regard to the ordering until the 
final result, which is then appropriately reordered. In this paper, Feynman applied this 
method to QED and pointed out that while it did not lead to any new results, it made it 
easier to relate the path integral formulation of QED to the more conventional approaches 
of Schwinger and Tomonaga. An advantage of the ordered operator calculus over the path 
integral method is that it can deal more easily with half-integral spin.² Evidently Feynman
hoped that his new method for dealing with operators would have a wider application, but 
that does not seem to have been the case (at least so far!). 

Among the applications of the path integral method treated by Feynman and Hibbs, 
we find statistical mechanics,³ a new variational principle, Brownian motion, and “other
problems in probability.” This is only a small subset of the many applications which have
been found for the method, especially in condensed matter physics.⁴ Papers [28] and [49]
develop the new variational principle and apply it to the polaron problem, which Feynman 
describes in [28] as follows: 


An electron in an ionic crystal polarizes the lattice in its neighborhood. This inter- 
action changes the energy of the electron. Furthermore, when the electron moves 
the polarization state must move with it. An electron moving with its accompanying 
distortion of the lattice has sometimes been called a polaron. It has an effective mass 
higher than that of the electron. We wish to compute the energy and effective mass 
of such an electron. 


Feynman developed the new variational principle and used it to obtain an upper bound
for the polaron self-energy that was lower than that obtained in five other papers published 


2See [64], Feynman and Hibbs, p. 355.

3See also [88]. 

4See, e.g., Martin C. Gutzwiller, “Resource letter ICQM-1: the interplay between classical and quantum 
mechanics,” Am. J. Phys. 66 (1998): 304-324. Among items dealing explicitly with Feynman path integrals 
are 71-73 and 158-168. 
in the early 1950s. Paper [49] employs a similar method to calculate the polaron’s mobility.
It is a collaboration between Feynman, Robert W. Hellwarth (of the Hughes Research Laboratory,
where Feynman was a part-time consultant), and two Caltech graduate students:
Carl K. Iddings and Philip M. Platzman.

Paper [55] is part of the doctoral thesis of Frank L. Vernon, Jr., supervised by Feynman. 
It was not Feynman’s usual practice for his name to appear in publications of his students’ 
dissertations, so we can assume he was closely involved in the subject.⁵ It deals with a
general quantum-mechanical system (e.g. an atom or molecule) interacting with a linear 
dissipative system. The latter is represented by a collection of harmonic oscillators. In 
that sense it resembles QED and, as in that case, the oscillators’ coordinates are integrated 
out so that their total effect is replaced by that of an “influence functional.” According to 
the abstract, “In addition, a fluctuation-dissipation theorem is derived relating temperature
and dissipation of the linear system to a fluctuating classical potential acting on the system 
of interest which reduces to the Nyquist-Johnson relation for noise in the case of electric 
circuits.” 
Space-Time Approach to Non-Relativistic
Quantum Mechanics 


R. P. FEYNMAN 


Cornell University, Ithaca, New York 


Non-relativistic quantum mechanics is formulated here in a different way. It is, however, 
mathematically equivalent to the familiar formulation. In quantum mechanics the probability 
of an event which can happen in several different ways is the absolute square of a sum of 
complex contributions, one from each alternative way. The probability that a particle will be 
found to have a path x(t) lying somewhere within a region of space time is the square of a sum 
of contributions, one from each path in the region. The contribution from a single path is 
postulated to be an exponential whose (imaginary) phase is the classical action (in units of $\hbar$)
for the path in question. The total contribution from all paths reaching $x$, $t$ from the past is the
wave function $\psi(x, t)$. This is shown to satisfy Schroedinger’s equation. The relation to matrix
and operator algebra is discussed. Applications are indicated, in particular to eliminate the 
coordinates of the field oscillators from the equations of quantum electrodynamics. 


1. INTRODUCTION


It is a curious historical fact that modern
quantum mechanics began with two quite
different mathematical formulations: the differential
equation of Schroedinger, and the matrix
algebra of Heisenberg. The two, apparently dis- 
similar approaches, were proved to be mathe- 
matically equivalent. These two points of view 
were destined to complement one another and 
to be ultimately synthesized in Dirac’s trans- 
formation theory. 

This paper will describe what is essentially a 
third formulation of non-relativistic quantum 
theory. This formulation was suggested by some 
of Dirac’s¹,² remarks concerning the relation of


1 P. A. M. Dirac, The Principles of Quantum Mechanics
(The Clarendon Press, Oxford, 1935), second edition,
Section 33; also, Physik. Zeits. Sowjetunion 3, 64 (1933).
2 P. A. M. Dirac, Rev. Mod. Phys. 17, 195 (1945).


classical action³ to quantum mechanics. A probability
amplitude is associated with an entire
motion of a particle as a function of time, rather 
than simply with a position of the particle at a 
particular time. 

The formulation is mathematically equivalent 
to the more usual formulations. There are, 
therefore, no fundamentally new results. How- 
ever, there is a pleasure in recognizing old things 
from a new point of view. Also, there are prob- 
lems for which the new point of view offers a 
distinct advantage. For example, if two systems 
A and B interact, the coordinates of one of the 
systems, say B, may be eliminated from the 
equations describing the motion of A. The inter- 


3 Throughout this paper the term “action” will be used
for the time integral of the Lagrangian along a path.
When this path is the one actually taken by a particle,
moving classically, the integral should more properly be
called Hamilton’s first principal function.
action with B is represented by a change in the
formula for the probability amplitude associated 
with a motion of A. It is analogous to the classical 
situation in which the effect of B can be repre- 
sented by a change in the equations of motion 
of A (by the introduction of terms representing 
forces acting on A). In this way the coordinates 
of the transverse, as well as of the longitudinal 
field oscillators, may be eliminated from the 
equations of quantum electrodynamics. 

In addition, there is always the hope that the 
new point of view will inspire an idea for the 
modification of present theories, a modification 
necessary to encompass present experiments. 

We first discuss the general concept of the 
superposition of probability amplitudes in quan- 
tum mechanics. We then show how this concept 
can be directly extended to define a probability 
amplitude for any motion or path (position vs. 
time) in space-time. The ordinary quantum 
mechanics is shown to result from the postulate 
that this probability amplitude has a phase pro- 
portional to the action, computed classically, for 
this path. This is true when the action is the time 
integral of a quadratic function of velocity. The 
relation to matrix and operator algebra is discussed
in a way that stays as close to the language
of the new formulation as possible. There is no 
practical advantage to this, but the formulae are 
very suggestive if a generalization to a wider 
class of action functionals is contemplated. 
Finally, we discuss applications of the formula- 
tion. As a particular illustration, we show how 
the coordinates of a harmonic oscillator may be 
eliminated from the equations of motion of a 
system with which it interacts. This can be ex- 
tended directly for application to quantum elec- 
trodynamics. A formal extension which includes 
the effects of spin and relativity is described. 


2. THE SUPERPOSITION OF PROBABILITY 
AMPLITUDES 
The formulation to be presented contains as 
its essential idea the concept of a probability 
amplitude associated with a completely specified 
motion as a function of time. It is, therefore, 
worthwhile to review in detail the quantum- 
mechanical concept of the superposition of proba- 
bility amplitudes. We shall examine the essential 
changes in physical outlook required by the
transition from classical to quantum physics.

For this purpose, consider an imaginary experi- 
ment in which we can make three measurements 
successive in time: first of a quantity A, then 
of B, and then of C. There is really no need for 
these to be of different quantities, and it will do 
just as well if the example of three successive 
position measurements is kept in mind. Suppose 
that a is one of a number of possible results which
could come from measurement A, b is a result
that could arise from B, and c is a result possible
from the third measurement C.⁴ We shall assume
that the measurements A, B, and C are the type 
of measurements that completely specify a state 
in the quantum-mechanical case. That is, for 
example, the state for which B has the value b is 
not degenerate. 

It is well known that quantum mechanics deals 
with probabilities, but naturally this is not the 
whole picture. In order to exhibit, even more 
clearly, the relationship between classical and 
quantum theory, we could suppose that classi- 
cally we are also dealing with probabilities but 
that all probabilities either are zero or one. 
A better alternative is to imagine in the classical 
case that the probabilities are in the sense of 
classical statistical mechanics (where, possibly, 
internal coordinates are not completely specified). 

We define $P_{ab}$ as the probability that if measurement
A gave the result a, then measurement B
will give the result b. Similarly, $P_{bc}$ is the probability
that if measurement B gives the result b,
then measurement C gives c. Further, let $P_{ac}$ be
the chance that if A gives a, then C gives c.
Finally, denote by $P_{abc}$ the probability of all
three, i.e., if A gives a, then B gives b, and C
gives c. If the events between a and b are independent
of those between b and c, then


$$P_{abc} = P_{ab} P_{bc}. \tag{1}$$


This is true according to quantum mechanics 
when the statement that B is b is a complete
specification of the state. 


4 For our discussion it is not important that certain
values of a, b, or c might be excluded by quantum mechanics
but not by classical mechanics. For simplicity,
assume the values are the same for both but that the
probability of certain values may be zero.
In any event, we expect the relation


$$P_{ac} = \sum_b P_{abc}. \tag{2}$$


This is because, if initially measurement A gives 
a and the system is later found to give the result 
c to measurement C, the quantity B must have
had some value at the time intermediate to A
and C. The probability that it was b is $P_{abc}$.
We sum, or integrate, over all the mutually
exclusive alternatives for b (symbolized by $\sum_b$).

Now, the essential difference between classical 
and quantum physics lies in Eq. (2). In classical 
mechanics it is always true. In quantum me- 
chanics it is often false. We shall denote the 
quantum-mechanical probability that a measurement
of C results in c when it follows a measurement
of A giving a by $P_{ac}^{q}$. Equation (2) is
replaced in quantum mechanics by this remarkable
law:⁵ There exist complex numbers $\varphi_{ab}$, $\varphi_{bc}$,
$\varphi_{ac}$ such that


$$P_{ab} = |\varphi_{ab}|^2, \quad P_{bc} = |\varphi_{bc}|^2, \quad \text{and} \quad P_{ac}^{q} = |\varphi_{ac}|^2. \tag{3}$$


The classical law, obtained by combining (1) and (2),


$$P_{ac} = \sum_b P_{ab} P_{bc}, \tag{4}$$


is replaced by


$$\varphi_{ac} = \sum_b \varphi_{ab} \varphi_{bc}. \tag{5}$$


If (5) is correct, ordinarily (4) is incorrect. The 
logical error made in deducing (4) consisted, of 
course, in assuming that to get from a to c the 
system had to go through a condition such that 
B had to have some definite value, b.
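
The contrast between the classical rule (4) and the quantum rule (5) can be made concrete with a small numerical sketch. The two-state system and the particular amplitudes below are hypothetical values chosen for illustration (both sets of amplitudes are taken equal to the same 2x2 unitary matrix); they do not come from the paper.

```python
import math

# Hypothetical two-state example (illustration only): the amplitudes
# phi_ab and phi_bc are both the matrix (1/sqrt 2) [[1, 1], [1, -1]].
s = 1 / math.sqrt(2)
phi_ab = [[s, s], [s, -s]]   # indexed as phi_ab[a][b]
phi_bc = [[s, s], [s, -s]]   # indexed as phi_bc[b][c]


def classical_p_ac(a, c):
    """Eq. (4): sum over b of P_ab * P_bc, with each P = |phi|^2."""
    return sum(abs(phi_ab[a][b]) ** 2 * abs(phi_bc[b][c]) ** 2
               for b in range(2))


def quantum_p_ac(a, c):
    """Eq. (5) followed by Eq. (3): |sum over b of phi_ab * phi_bc|^2."""
    phi_ac = sum(phi_ab[a][b] * phi_bc[b][c] for b in range(2))
    return abs(phi_ac) ** 2


for a in range(2):
    for c in range(2):
        print(a, c, round(classical_p_ac(a, c), 6), round(quantum_p_ac(a, c), 6))
```

With these amplitudes every classical probability is one half, while the quantum rule gives certainty for c equal to a and zero otherwise, because the two routes through b interfere constructively in one case and cancel in the other.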

If an attempt is made to verify this, i.e., if B 
is measured between the experiments A and C, 
then formula (4) is, in fact, correct. More pre- 
cisely, if the apparatus to measure B is set up 
and used, but no attempt is made to utilize the 
results of the B measurement in the sense that 
only the A to C correlation is recorded and 
studied, then (4) is correct. This is because the B 
measuring machine has done its job; if we wish,
we could read the meters at any time without 


5 We have assumed b is a non-degenerate state, and that
therefore (1) is true. Presumably, if in some generalization
of quantum mechanics (1) were not true, even for pure
states b, (2) could be expected to be replaced by: There
are complex numbers $\varphi_{abc}$ such that $P_{abc} = |\varphi_{abc}|^2$. The analog
of (5) is then $\varphi_{ac} = \sum_b \varphi_{abc}$.
disturbing the situation any further. The experiments
which gave a and c can, therefore, be
separated into groups depending on the value
of b.

Looking at probability from a frequency point 
of view (4) simply results from the statement 
that in each experiment giving a and c, B had 
some value. The only way (4) could be wrong is 
the statement, “B had some value,” must sometimes
be meaningless. Noting that (5) replaces
(4) only under the circumstance that we make
no attempt to measure B, we are led to say that
the statement, “B had some value,” may be
meaningless whenever we make no attempt to
measure B.⁶

Hence, we have different results for the correlation
of a and c, namely, Eq. (4) or Eq. (5),
depending upon whether we do or do not attempt
to measure B. No matter how subtly one tries,
the attempt to measure B must disturb the
system, at least enough to change the results
from those given by (5) to those of (4).⁷ That
measurements do, in fact, cause the necessary
disturbances, and that, essentially, (4) could be
false was first clearly enunciated by Heisenberg
in his uncertainty principle. The law (5) is a
result of the work of Schroedinger, the statistical
interpretation of Born and Jordan, and the
transformation theory of Dirac.⁸

Equation (5) is a typical representation of the 
wave nature of matter. Here, the chance of 
finding a particle going from a to c through
several different routes (values of b) may, if no
attempt is made to determine the route, be 
represented as the square of a sum of several 
complex quantities—one for each available route. 


6 It does not help to point out that we could have
measured B had we wished. The fact is that we did not.

7 How (4) actually results from (5) when measurements
disturb the system has been studied particularly by J. von
Neumann (Mathematische Grundlagen der Quantenmechanik
(Dover Publications, New York, 1943)). The effect of
perturbation of the measuring equipment is effectively to
change the phase of the interfering components, by $\theta_b$, say,
so that (5) becomes $\varphi_{ac} = \sum_b e^{i\theta_b} \varphi_{ab} \varphi_{bc}$. However, as von
Neumann shows, the phase shifts must remain unknown
if B is measured so that the resulting probability $P_{ac}$ is
the square of $\varphi_{ac}$ averaged over all phases, $\theta_b$. This results
in (4).

8 If A and B are the operators corresponding to measurements
A and B, and if $\psi_a$ and $\chi_b$ are solutions of $A\psi_a = a\psi_a$
and $B\chi_b = b\chi_b$, then $\varphi_{ab} = \int \chi_b^* \psi_a \, dx = (\chi_b, \psi_a)$. Thus, $\varphi_{ab}$ is
an element (a|b) of the transformation matrix for the
transformation from a representation in which A is
diagonal to one in which B is diagonal.
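
The phase-averaging argument of footnote 7 can be written out explicitly. The sketch assumes that the phases $\theta_b$ are independent and uniformly distributed over $[0, 2\pi)$; this assumption, and the angle-bracket notation for the average, are added here for illustration.

```latex
\left\langle \Big| \sum_b e^{i\theta_b} \varphi_{ab} \varphi_{bc} \Big|^2 \right\rangle
  = \sum_b \sum_{b'} \left\langle e^{i(\theta_b - \theta_{b'})} \right\rangle
    \varphi_{ab} \varphi_{bc}\, \varphi_{ab'}^{*} \varphi_{b'c}^{*}
  = \sum_b |\varphi_{ab}|^2 |\varphi_{bc}|^2
  = \sum_b P_{ab} P_{bc},
% because \langle e^{i(\theta_b - \theta_{b'})} \rangle = \delta_{bb'}:
% the cross terms (b \neq b') average to zero, which is Eq. (4).
```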
Probability can show the typical phenomena
of interference, usually associated with waves, 
whose intensity is given by the square of the 
sum of contributions from different sources. The 
electron acts as a wave, (5), so to speak, as long
as no attempt is made to verify that it is a 
particle; yet one can determine, if one wishes, 
by what route it travels just as though it were a 
particle; but when one does that, (4) applies and 
it does act like a particle. 

These things are, of course, well known. They 
have already been explained many times.⁹ However,
it seems worth while to emphasize the fact
that they are all simply direct consequences of 
Eq. (5), for it is essentially Eq. (5) that is funda- 
mental in my formulation of quantum mechanics. 

The generalization of Eqs. (4) and (5) to a 
large number of measurements, say A, B, C, D, 
..., K, is, of course, that the probability of the
sequence a, b, c, d, ..., k is


$$P_{abcd \cdots k} = |\varphi_{abcd \cdots k}|^2.$$

The probability of the result a, c, k, for example,
if b, d, ... are measured, is the classical formula:


$$P_{ack} = \sum_b \sum_d \cdots P_{abcd \cdots k}, \tag{6}$$


while the probability of the same sequence a, c, k
if no measurements are made between A and C
and between C and K is


$$P_{ack}^{q} = \Big| \sum_b \sum_d \cdots \varphi_{abcd \cdots k} \Big|^2. \tag{7}$$

The quantity $\varphi_{abcd \cdots k}$ we can call the probability
amplitude for the condition A=a, B=b, C=c,
D=d, ..., K=k. (It is, of course, expressible as
a product $\varphi_{ab} \varphi_{bc} \varphi_{cd} \cdots \varphi_{jk}$.)

3. THE PROBABILITY AMPLITUDE FOR A 
SPACE-TIME PATH 

The physical ideas of the last section may be 
readily extended to define a probability ampli- 
tude for a particular completely specified space- 
time path. To explain how this may be done, we 
shall limit ourselves to a one-dimensional prob- 
lem, as the generalization to several dimensions 
is obvious. 


9 See, for example, W. Heisenberg, The Physical Principles
of the Quantum Theory (University of Chicago Press,
Chicago, 1930), particularly Chapter IV.
Assume that we have a particle which can
take up various values of a coordinate x. Imagine
that we make an enormous number of successive
position measurements, let us say separated by a
small time interval $\epsilon$. Then a succession of
measurements such as A, B, C, ... might be the
succession of measurements of the coordinate x
at successive times $t_1, t_2, t_3, \ldots$, where $t_{i+1} = t_i + \epsilon$.
Let the value, which might result from measurement
of the coordinate at time $t_i$, be $x_i$. Thus,
if A is a measurement of x at $t_1$ then $x_1$ is what
we previously denoted by a. From a classical
point of view, the successive values, $x_1, x_2, x_3, \ldots$
of the coordinate practically define a path $x(t)$.
Eventually, we expect to go to the limit $\epsilon \to 0$.

The probability of such a path is a function
of $x_1, x_2, \ldots, x_i, \ldots$, say $P(\ldots x_i, x_{i+1}, \ldots)$.
The probability that the path lies in a particular
region R of space-time is obtained classically by
integrating P over that region. Thus, the probability
that $x_i$ lies between $a_i$ and $b_i$, and $x_{i+1}$ lies
between $a_{i+1}$ and $b_{i+1}$, etc., is


$$\cdots \int_{a_i}^{b_i} \int_{a_{i+1}}^{b_{i+1}} \cdots P(\ldots x_i, x_{i+1}, \ldots) \cdots dx_i \, dx_{i+1} \cdots = \int_R P(\ldots x_i, x_{i+1}, \ldots) \cdots dx_i \, dx_{i+1} \cdots, \tag{8}$$


the symbol $\int_R$ meaning that the integration is
to be taken over those ranges of the variables
which lie within the region R. This is simply
Eq. (6) with a, b, ... replaced by $x_1, x_2, \ldots$ and
integration replacing summation.

In quantum mechanics this is the correct 
formula for the case that $x_1, x_2, \ldots, x_i, \ldots$ were
actually all measured, and then only those paths 
lying within R were taken. We would expect the 
result to be different if no such detailed measure- 
ments had been performed. Suppose a measure- 
ment is made which is capable only of deter- 
mining that the path lies somewhere within R. 

The measurement is to be what we might call 
an “ideal measurement.” We suppose that no
further details could be obtained from the same 
measurement without further disturbance to the 
system. I have not been able to find a precise 
definition. We are trying to avoid the extra 
uncertainties that must be averaged over if, for 
example, more information were measured but 
not utilized. We wish to use Eq. (5) or (7) for
all $x_i$ and have no residual part to sum over in
the manner of Eq. (4). 

We expect that the probability that the particle
is found by our “ideal measurement” to be,
indeed, in the region R is the square of a complex
number $|\varphi(R)|^2$. The number $\varphi(R)$, which we
may call the probability amplitude for region R
is given by Eq. (7) with a, b, ... replaced by
$x_i, x_{i+1}, \ldots$ and summation replaced by integration:


$$\varphi(R) = \lim_{\epsilon \to 0} \int_R \Phi(\ldots x_i, x_{i+1} \ldots) \cdots dx_i \, dx_{i+1} \cdots. \tag{9}$$


The complex number $\Phi(\ldots x_i, x_{i+1} \ldots)$ is a function
of the variables $x_i$ defining the path.
Actually, we imagine that the time spacing $\epsilon$ approaches
zero so that $\Phi$ essentially depends on
the entire path $x(t)$ rather than only on just the
values of $x_i$ at the particular times $t_i$, $x_i = x(t_i)$.
We might call $\Phi$ the probability amplitude functional
of paths $x(t)$.

We may summarize these ideas in our first 
postulate: 

I. If an ideal measurement is performed to 
determine whether a particle has a path lying in a 
region of space-time, then the probability that the 
result will be affirmative is the absolute square of a 
sum of complex contributions, one from each path 
in the region. 

The statement of the postulate is incomplete. 
The meaning of a sum of terms one for “each”
path is ambiguous. The precise meaning given 
in Eq. (9) is this: A path is first defined only by 
the positions x; through which it goes at a 
sequence of equally spaced times,¹⁰ $t_i = t_{i-1} + \epsilon$.
Then all values of the coordinates within R have 
an equal weight. The actual magnitude of the 
weight depends upon $\epsilon$ and can be so chosen
that the probability of an event which is certain 


10There are very interesting mathematical problems 
involved in the attempt to avoid the subdivision and 
limiting processes. Some sort of complex measure is being 
associated with the space of functions x(t). Finite results 
can be obtained under unexpected circumstances because 
the measure is not positive everywhere, but the contribu-
tions from most of the paths largely cancel out. These 
curious mathematical problems are sidestepped by the sub- 
division process. However, one feels as Cavalieri must 
have felt calculating the volume of a pyramid before the 
invention of calculus. 
shall be normalized to unity. It may not be best
to do so, but we have left this weight factor in a
proportionality constant in the second postulate.
The limit $\epsilon \to 0$ must be taken at the end of a
calculation. 

When the system has several degrees of freedom
the coordinate space x has several dimensions
so that the symbol x will represent a set of
coordinates $(x^{(1)}, x^{(2)}, \ldots, x^{(k)})$ for a system with
k degrees of freedom. A path is a sequence
of configurations for successive times and is
described by giving the configuration $x_i$ or
$(x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(k)})$, i.e., the value of each of
the k coordinates for each time $t_i$. The symbol $dx_i$
will be understood to mean the volume element
in k dimensional configuration space (at time $t_i$).
The statement of the postulates is independent
of the coordinate system which is used.

The postulate is limited to defining the results 
of position measurements. It does not say what 
must be done to define the result of a momentum
measurement, for example. This is not a real 
limitation, however, because in principle the 
measurement of momentum of one particle can 
be performed in terms of position measurements 
of other particles, e.g., meter indicators. Thus, 
an analysis of such an experiment will determine 
what it is about the first particle which deter- 
mines its momentum. 


4. THE CALCULATION OF THE PROBABILITY 
AMPLITUDE FOR A PATH 


The first postulate prescribes the type of 
mathematical framework required by quantum 
mechanics for the calculation of probabilities. 
The second postulate gives a particular content
to this framework by prescribing how to compute
the important quantity $\Phi$ for each path:

II. The paths contribute equally in magnitude,
but the phase of their contribution is the classical
action (in units of $\hbar$); i.e., the time integral of the
Lagrangian taken along the path.

That is to say, the contribution $\Phi[x(t)]$ from a
given path $x(t)$ is proportional to $\exp\{(i/\hbar)S[x(t)]\}$,
where the action $S[x(t)] = \int L(\dot{x}(t), x(t))\,dt$ is the
time integral of the classical Lagrangian $L(\dot{x}, x)$
taken along the path in question. The Lagrangian,
which may be an explicit function of the time,
is a function of position and velocity. If we
suppose it to be a quadratic function of the
velocities, we can show the mathematical equivalence
of the postulates here and the more usual
formulation of quantum mechanics.

To interpret the first postulate it was necessary
to define a path by giving only the succession of
points $x_i$ through which the path passes at
successive times $t_i$. To compute $S = \int L(\dot{x}, x)\,dt$
we need to know the path at all points, not just
at $x_i$. We shall assume that the function $x(t)$ in
the interval between $t_i$ and $t_{i+1}$ is the path followed
by a classical particle, with the Lagrangian
$L$, which starting from $x_i$ at $t_i$ reaches $x_{i+1}$ at
$t_{i+1}$. This assumption is required to interpret the
second postulate for discontinuous paths. The
quantity $\Phi(\cdots x_i, x_{i+1}, \cdots)$ can be normalized
(for various $\epsilon$) if desired, so that the probability
of an event which is certain is normalized to
unity as $\epsilon \to 0$.

There is no difficulty in carrying out the action 
integral because of the sudden changes of velocity 
encountered at the times $t_i$ as long as $L$ does not
depend upon any higher time derivatives of the 
position than the first. Furthermore, unless L is 
restricted in this way the end points are not 
sufficient to define the classical path. Since the 
classical path is the one which makes the action 
a minimum, we can write 


$$S = \sum_i S(x_{i+1}, x_i), \tag{10}$$

where

$$S(x_{i+1}, x_i) = \operatorname{Min.} \int_{t_i}^{t_{i+1}} L(\dot{x}(t), x(t))\,dt. \tag{11}$$
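As a minimal sketch of how Eqs. (10) and (11) combine with the second postulate, assume a free particle, $L = m\dot{x}^2/2$, for which the classical path between neighbouring points is a straight line and the minimum in (11) is $m(x_{i+1}-x_i)^2/2\epsilon$. The function names, the units $\hbar = m = 1$, and the two sample paths are illustrative assumptions, not part of the paper.

```python
import cmath

HBAR = 1.0  # assumed units, chosen only for illustration
MASS = 1.0


def short_time_action(x_next, x_prev, eps, mass=MASS):
    """Eq. (11) for a free particle: action along the straight-line segment."""
    return 0.5 * mass * (x_next - x_prev) ** 2 / eps


def total_action(points, eps):
    """Eq. (10): the action is the sum of the contributions of successive segments."""
    return sum(short_time_action(b, a, eps) for a, b in zip(points, points[1:]))


def path_phase(points, eps):
    """Second postulate: each path contributes a pure phase exp(iS/hbar)."""
    return cmath.exp(1j * total_action(points, eps) / HBAR)


eps = 0.25
straight = [0.0, 0.5, 1.0, 1.5, 2.0]   # uniform motion between the end points
wiggly = [0.0, 1.0, 0.5, 2.5, 2.0]     # same end points, different route

for name, pts in (("straight", straight), ("wiggly", wiggly)):
    print(name, total_action(pts, eps), abs(path_phase(pts, eps)))
```

Both paths contribute with unit magnitude and differ only in phase; under these assumptions the straight path has the smaller action, and the action of a path equals the sum of the actions of its pieces.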


Written in this way, the only appeal to classical 
mechanics is to supply us with a Lagrangian 
function. Indeed, one could consider postulate 
two as simply saying, "$\Phi$ is the exponential of $i$
times the integral of a real function of $x(t)$ and
its first time derivative." Then the classical
equations of motion might be derived later as
the limit for large dimensions. The function of $x$
and $\dot{x}$ then could be shown to be the classical
Lagrangian within a constant factor.

Actually, the sum in (10), even for finite $\epsilon$, is
infinite and hence meaningless (because of the
infinite extent of time). This reflects a further
incompleteness of the postulates. We shall have
to restrict ourselves to a finite, but arbitrarily 
long, time interval.

Combining the two postulates and using Eq.
(10), we find

$$\varphi(R) = \lim_{\epsilon \to 0} \int_R \exp\left[\frac{i}{\hbar} \sum_i S(x_{i+1}, x_i)\right] \cdots \frac{dx_{i+1}}{A}\,\frac{dx_i}{A} \cdots, \tag{12}$$


where we have let the normalization factor be 
split into a factor 1/A (whose exact value we 
shall presently determine) for each instant of 
time. The integration is just over those values
$x_i, x_{i+1}, \cdots$ which lie in the region $R$. This
equation, the definition (11) of $S(x_{i+1}, x_i)$, and
the physical interpretation of $|\varphi(R)|^2$ as the
probability that the particle will be found in $R$,
complete our formulation of quantum mechanics.


5. DEFINITION OF THE WAVE FUNCTION 


We now proceed to show the equivalence of 
these postulates to the ordinary formulation of 
quantum mechanics. This we do in two steps. 
We show in this section how the wave function 
may be defined from the new point of view. In 
the next section we shall show that this func- 
tion satisfies Schroedinger’s differential wave 
equation. 

We shall see that it is the possibility, (10), of 
expressing $S$ as a sum, and hence $\Phi$ as a product,
of contributions from successive sections of the 
path, which leads to the possibility of defining 
a quantity having the properties of a wave 
function. 

To make this clear, let us imagine that we
choose a particular time $t$ and divide the region $R$
in Eq. (12) into pieces, future and past relative
to $t$. We imagine that $R$ can be split into: (a) a
region $R'$, restricted in any way in space, but
lying entirely earlier in time than some $t'$, such
that $t' < t$; (b) a region $R''$ arbitrarily restricted
in space but lying entirely later in time than $t''$,
such that $t'' > t$; (c) the region between $t'$ and $t''$
in which all the values of $x$ coordinates are
unrestricted, i.e., all of space-time between $t'$ and $t''$.
The region (c) is not absolutely necessary. It can
be taken as narrow in time as desired. However,
it is convenient in letting us consider varying $t$ a
little without having to redefine $R'$ and $R''$.
Then $|\varphi(R', R'')|^2$ is the probability that the
path occupies $R'$ and $R''$. Because $R'$ is entirely
previous to $R''$, considering the time $t$ as the
present, we can express this as the probability
that the path had been in region $R'$ and will be
in region $R''$. If we divide by a factor, the
probability that the path is in $R'$, to renormalize the
probability we find: $|\varphi(R', R'')|^2$ is the (relative)
probability that if the system were in region $R'$
it will be found later in $R''$.

This is, of course, the important quantity in 
predicting the results of many experiments. We 
prepare the system in a certain way (e.g., it was
in region $R'$) and then measure some other
property (e.g., will it be found in region $R''$?).
What does (12) say about computing this
quantity, or rather the quantity $\varphi(R', R'')$ of
which it is the square?

Let us suppose in Eq. (12) that the time $t$
corresponds to one particular point $k$ of the
subdivision of time into steps $\epsilon$, i.e., assume $t = t_k$,
the index $k$, of course, depending upon the
subdivision $\epsilon$. Then, the exponential being the
exponential of a sum may be split into a product
of two factors

$$\exp\left[\frac{i}{\hbar} \sum_{i=k}^{\infty} S(x_{i+1}, x_i)\right] \cdot \exp\left[\frac{i}{\hbar} \sum_{i=-\infty}^{k-1} S(x_{i+1}, x_i)\right]. \tag{13}$$


The first factor contains only coordinates with
index $k$ or higher, while the second contains only
coordinates with index $k$ or lower. This split is
possible because of Eq. (10), which results essentially
from the fact that the Lagrangian is a
function only of positions and velocities. First,
the integration on all variables $x_i$ for $i > k$ can
be performed on the first factor resulting in a
function of $x_k$ (times the second factor). Next,
the integration on all variables $x_i$ for $i < k$ can
be performed on the second factor also, giving a
function of $x_k$. Finally, the integration on $x_k$ can
be performed. That is, $\varphi(R', R'')$ can be written
as the integral over $x_k$ of the product of two
factors. We will call these $\chi^*(x_k, t)$ and $\psi(x_k, t)$:


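$$\varphi(R', R'') = \int \chi^*(x, t)\,\psi(x, t)\,dx, \tag{14}$$

where

$$\psi(x_k, t) = \lim_{\epsilon \to 0} \int_{R'} \exp\left[\frac{i}{\hbar} \sum_{i=-\infty}^{k-1} S(x_{i+1}, x_i)\right] \frac{dx_{k-1}}{A}\,\frac{dx_{k-2}}{A} \cdots, \tag{15}$$

and

$$\chi^*(x_k, t) = \lim_{\epsilon \to 0} \int_{R''} \exp\left[\frac{i}{\hbar} \sum_{i=k}^{\infty} S(x_{i+1}, x_i)\right] \cdot \frac{1}{A}\,\frac{dx_{k+1}}{A}\,\frac{dx_{k+2}}{A} \cdots. \tag{16}$$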

The symbol $R'$ is placed on the integral for $\psi$
to indicate that the coordinates are integrated
over the region $R'$, and, for $t_i$ between $t'$ and $t$,
over all space. In like manner, the integral for $\chi^*$
is over $R''$ and over all space for those coordinates
corresponding to times between $t$ and $t''$. The
asterisk on $\chi^*$ denotes complex conjugate, as it
will be found more convenient to define (16) as
the complex conjugate of some quantity, $\chi$.

The quantity $\psi$ depends only upon the region
$R'$ previous to $t$, and is completely defined if
that region is known. It does not depend, in
any way, upon what will be done to the system
after time $t$. This latter information is contained
in $\chi$. Thus, with $\psi$ and $\chi$ we have separated the
past history from the future experiences of the
system. This permits us to speak of the relation
of past and future in the conventional manner.
Thus, if a particle has been in a region of space-time
$R'$ it may at time $t$ be said to be in a certain
condition, or state, determined only by its past
and described by the so-called wave function
$\psi(x, t)$. This function contains all that is needed
to predict future probabilities. For, suppose, in
another situation, the region $R'$ were different,
say $r'$, and possibly the Lagrangian for times
before $t$ were also altered. But, nevertheless,
suppose the quantity from Eq. (15) turned out
to be the same. Then, according to (14) the
probability of ending in any region $R''$ is the
same for $R'$ as for $r'$. Therefore, future measurements
will not distinguish whether the system
had occupied $R'$ or $r'$. Thus, the wave function
$\psi(x, t)$ is sufficient to define those attributes
which are left from past history which determine
future behavior.

Likewise, the function $\chi^*(x, t)$ characterizes
the experience, or, let us say, experiment to
which the system is to be subjected. If a different
region, $r''$ and different Lagrangian after $t$, were
to give the same $\chi^*(x, t)$ via Eq. (16), as does
region $R''$, then no matter what the preparation,
$\psi$, Eq. (14) says that the chance of finding the
system in $R''$ is always the same as finding it
in $r''$. The two "experiments" $R''$ and $r''$ are
equivalent, as they yield the same results. We
shall say loosely that these experiments are to
determine with what probability the system is
in state $\chi$. Actually, this terminology is poor.
The system is really in state $\psi$. The reason we
can associate a state with an experiment is, of
course, that for an ideal experiment there turns
out to be a unique state (whose wave function is
$\chi(x, t)$) for which the experiment succeeds with
certainty.

Thus, we can say: the probability that a
system in state $\psi$ will be found by an experiment
whose characteristic state is $\chi$ (or, more loosely,
the chance that a system in state $\psi$ will appear
to be in $\chi$) is

$$\left| \int \chi^*(x, t)\,\psi(x, t)\,dx \right|^2. \tag{17}$$
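A small numerical sketch of Eq. (17) may help fix ideas. It assumes two normalized Gaussian wave functions of unit width (an illustrative choice, not taken from the paper) and evaluates the overlap integral by the trapezoidal rule; the helper names are hypothetical.

```python
import math


def overlap_probability(chi, psi, xs):
    """Eq. (17): |integral of chi*(x) psi(x) dx|^2 on a uniform grid xs."""
    dx = xs[1] - xs[0]
    vals = [complex(chi(x)).conjugate() * psi(x) for x in xs]
    integral = dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoidal rule
    return abs(integral) ** 2


def gaussian(center):
    """Normalized Gaussian of unit width centred at `center` (assumed test state)."""
    norm = math.pi ** -0.25
    return lambda x: norm * math.exp(-0.5 * (x - center) ** 2)


xs = [-12.0 + 0.01 * n for n in range(2401)]
p_same = overlap_probability(gaussian(0.0), gaussian(0.0), xs)
p_shifted = overlap_probability(gaussian(1.0), gaussian(0.0), xs)
print(p_same, p_shifted)
```

For these assumed states the overlap integral is $\exp(-d^2/4)$ for a displacement $d$, so the probability is 1 when the experiment's characteristic state coincides with the prepared state and $\exp(-1/2)$ when it is displaced by one unit.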


These results agree, of course, with the prin- 
ciples of ordinary quantum mechanics. They are 
a consequence of the fact that the Lagrangian 
is a function of position, velocity, and time only.


6. THE WAVE EQUATION 


To complete the proof of the equivalence with
the ordinary formulation we shall have to show
that the wave function defined in the previous section
by Eq. (15) actually satisfies the Schroedinger
wave equation. Actually, we shall only succeed
in doing this when the Lagrangian $L$ in (11) is a
quadratic, but perhaps inhomogeneous, form in
the velocities $\dot{x}(t)$. This is not a limitation, however,
as it includes all the cases for which the
Schroedinger equation has been verified by
experiment.

The wave equation describes the development
of the wave function with time. We may expect
to approach it by noting that, for finite $\epsilon$, Eq. (15)
permits a simple recursive relation to be developed.
Consider the appearance of Eq. (15) if
we were to compute $\psi$ at the next instant of time:

$$\psi(x_{k+1}, t+\epsilon) = \int_{R'} \exp\left[\frac{i}{\hbar} \sum_{i=-\infty}^{k} S(x_{i+1}, x_i)\right] \frac{dx_k}{A}\,\frac{dx_{k-1}}{A} \cdots. \tag{15'}$$


This is similar to (15) except for the integration
over the additional variable $x_k$ and the extra
term in the sum in the exponent. This term
means that the integral of (15') is the same
as the integral of (15) except for the factor
$(1/A)\exp\{(i/\hbar)S(x_{k+1}, x_k)\}$. Since this does not
contain any of the variables $x_i$ for $i$ less than $k$,
all of the integrations on $dx_i$ up to $dx_{k-1}$ can be
performed with this factor left out. However,
the result of these integrations is by (15) simply
$\psi(x_k, t)$. Hence, we find from (15') the relation

$$\psi(x_{k+1}, t+\epsilon) = \int \exp\left[\frac{i}{\hbar} S(x_{k+1}, x_k)\right] \psi(x_k, t)\,dx_k / A. \tag{18}$$


This relation giving the development of $\psi$ with
time will be shown, for simple examples, with
suitable choice of $A$, to be equivalent to
Schroedinger's equation. Actually, Eq. (18) is not
exact, but is only true in the limit $\epsilon \to 0$ and we
shall derive the Schroedinger equation by assuming
(18) is valid to first order in $\epsilon$. The Eq. (18)
need only be true for small $\epsilon$ to the first order in $\epsilon$.
For if we consider the factors in (15) which carry
us over a finite interval of time, $T$, the number
of factors is $T/\epsilon$. If an error of order $\epsilon^2$ is made in
each, the resulting error will not accumulate
beyond the order $\epsilon^2(T/\epsilon)$ or $T\epsilon$, which vanishes
in the limit.
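The error-counting argument above can be illustrated with a toy model. As an assumption made only for this sketch, replace the exact single-step factor $\exp(a\epsilon)$ by its first-order form $1 + a\epsilon$, which is wrong at order $\epsilon^2$, and compound it over $T/\epsilon$ steps; the constant $a = -i$ and the function name are arbitrary choices.

```python
import cmath


def accumulated_error(eps, total_time=1.0, rate=-1j):
    """Error after T/eps steps when each step factor is correct only to first order in eps."""
    steps = round(total_time / eps)
    approx = (1 + rate * eps) ** steps        # per-step error of order eps**2
    exact = cmath.exp(rate * total_time)      # exact evolution over the whole interval
    return abs(approx - exact)


errors = {eps: accumulated_error(eps) for eps in (0.1, 0.01, 0.001)}
for eps, err in errors.items():
    print(eps, err)
```

In this toy model the accumulated error shrinks roughly in proportion to $\epsilon$, as the estimate $\epsilon^2(T/\epsilon) = T\epsilon$ suggests.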

We shall illustrate the relation of (18) to 
Schroedinger’s equation by applying it to the 
simple case of a particle moving in one dimension 
in a potential V(x). Before we do this, however, 
we would like to discuss some approximations to 
the value $S(x_{i+1}, x_i)$ given in (11) which will be
sufficient for expression (18). 

The expression defined in (11) for $S(x_{i+1}, x_i)$ is
difficult to calculate exactly for arbitrary $\epsilon$ from
classical mechanics. Actually, it is only necessary
that an approximate expression for $S(x_{i+1}, x_i)$ be
used in (18), provided the error of the approximation
be of an order smaller than the first in $\epsilon$.
We limit ourselves to the case that the Lagrangian
is a quadratic, but perhaps inhomogeneous, form
in the velocities $\dot{x}(t)$. As we shall see later, the
paths which are important are those for which
$x_{k+1} - x_k$ is of order $\epsilon^{1/2}$. Under these circumstances,
it is sufficient to calculate the integral in (11)
over the classical path taken by a free particle.$^{11}$

In Cartesian coordinates$^{12}$ the path of a free
particle is a straight line so the integral of (11)
can be taken along a straight line. Under these
circumstances it is sufficiently accurate to replace
the integral by the trapezoidal rule

$$S(x_{i+1}, x_i) = \frac{\epsilon}{2} L\left(\frac{x_{i+1} - x_i}{\epsilon}, x_{i+1}\right) + \frac{\epsilon}{2} L\left(\frac{x_{i+1} - x_i}{\epsilon}, x_i\right) \tag{19}$$

or, if it proves more convenient,

$$S(x_{i+1}, x_i) = \epsilon L\left(\frac{x_{i+1} - x_i}{\epsilon}, \frac{x_{i+1} + x_i}{2}\right). \tag{20}$$

These are not valid in a general coordinate
system, e.g., spherical. An even simpler approximation
may be used if, in addition, there is no
vector potential or other terms linear in the
velocity (see page 376):

$$S(x_{i+1}, x_i) = \epsilon L\left(\frac{x_{i+1} - x_i}{\epsilon}, x_{i+1}\right). \tag{21}$$
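The three approximations (19), (20), and (21) can be compared directly. The sketch below assumes a one-dimensional Lagrangian $L = m\dot{x}^2/2 - V(x)$ with the illustrative choice $V(x) = x^2/2$ and $m = 1$; the function names and sample numbers are assumptions made here, not part of the paper.

```python
def lagrangian(velocity, x, mass=1.0):
    """Assumed test Lagrangian: kinetic energy minus V(x) = x**2 / 2."""
    return 0.5 * mass * velocity ** 2 - 0.5 * x * x


def action_trapezoid(x_next, x_prev, eps):
    """Eq. (19): average of the Lagrangian evaluated at the two end points."""
    v = (x_next - x_prev) / eps
    return 0.5 * eps * (lagrangian(v, x_next) + lagrangian(v, x_prev))


def action_midpoint(x_next, x_prev, eps):
    """Eq. (20): Lagrangian evaluated at the mean position."""
    v = (x_next - x_prev) / eps
    return eps * lagrangian(v, 0.5 * (x_next + x_prev))


def action_endpoint(x_next, x_prev, eps):
    """Eq. (21): Lagrangian evaluated at the later end point."""
    v = (x_next - x_prev) / eps
    return eps * lagrangian(v, x_next)


eps, x_prev, x_next = 0.01, 1.0, 1.2
s19 = action_trapezoid(x_next, x_prev, eps)
s20 = action_midpoint(x_next, x_prev, eps)
s21 = action_endpoint(x_next, x_prev, eps)
print(s19, s20, s21)
```

For this assumed potential, (19) and (20) differ by $\epsilon(x_{i+1}-x_i)^2/8$, while (21) differs from (20) by a term of order $\epsilon(x_{i+1}-x_i)$; with no velocity-linear term in $L$ all three agree to the accuracy the text requires.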


Thus, for the simple example of a particle of
mass $m$ moving in one dimension under a potential
$V(x)$, we can set

$$S(x_{i+1}, x_i) = \frac{m\epsilon}{2}\left(\frac{x_{i+1} - x_i}{\epsilon}\right)^2 - \epsilon V(x_{i+1}). \tag{22}$$


11 It is assumed that the "forces" enter through a scalar
and vector potential and not in terms involving the square
of the velocity. More generally, what is meant by a free
particle is one for which the Lagrangian is altered by
omission of the terms linear in, and those independent of,
the velocities.

12 More generally, coordinates for which the terms
quadratic in the velocity in $L(\dot{x}, x)$ appear with constant
coefficients.

For this example, then, Eq. (18) becomes

$$\psi(x_{k+1}, t+\epsilon) = \int \exp\left\{\frac{i\epsilon}{\hbar}\left[\frac{m}{2}\left(\frac{x_{k+1} - x_k}{\epsilon}\right)^2 - V(x_{k+1})\right]\right\} \psi(x_k, t)\,dx_k / A. \tag{23}$$


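Let us call $x_{k+1} = x$ and $x_{k+1} - x_k = \xi$ so that
$x_k = x - \xi$. Then (23) becomes

$$\psi(x, t+\epsilon) = \int \exp\left(\frac{im\xi^2}{2\hbar\epsilon}\right) \cdot \exp\left(\frac{-i\epsilon V(x)}{\hbar}\right) \psi(x - \xi, t)\,\frac{d\xi}{A}. \tag{24}$$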


The integral on $\xi$ will converge if $\psi(x, t)$
falls off sufficiently for large $x$ (certainly if
$\int \psi^*(x)\psi(x)\,dx = 1$). In the integration on $\xi$, since
$\epsilon$ is very small, the exponential of $im\xi^2/2\hbar\epsilon$ oscillates
extremely rapidly except in the region
about $\xi = 0$ ($\xi$ of order $(\hbar\epsilon/m)^{1/2}$). Since the function
$\psi(x - \xi, t)$ is a relatively smooth function
of $\xi$ (since $\epsilon$ may be taken as small as desired),
the region where the exponential oscillates rapidly
will contribute very little because of the almost
complete cancelation of positive and negative
contributions. Since only small $\xi$ are effective,
$\psi(x - \xi, t)$ may be expanded as a Taylor series.
Hence,


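$$\psi(x, t+\epsilon) = \exp\left(\frac{-i\epsilon V(x)}{\hbar}\right) \times \int \exp\left(\frac{im\xi^2}{2\hbar\epsilon}\right)\left[\psi(x, t) - \xi\frac{\partial\psi(x, t)}{\partial x} + \frac{\xi^2}{2}\frac{\partial^2\psi(x, t)}{\partial x^2} - \cdots\right]\frac{d\xi}{A}. \tag{25}$$

Now

$$\int_{-\infty}^{\infty} \exp(im\xi^2/2\hbar\epsilon)\,d\xi = (2\pi\hbar\epsilon i/m)^{1/2},$$

$$\int_{-\infty}^{\infty} \exp(im\xi^2/2\hbar\epsilon)\,\xi\,d\xi = 0, \tag{26}$$

$$\int_{-\infty}^{\infty} \exp(im\xi^2/2\hbar\epsilon)\,\xi^2\,d\xi = (\hbar\epsilon i/m)(2\pi\hbar\epsilon i/m)^{1/2},$$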

while the integral containing $\xi^3$ is zero, for like
the one with $\xi$ it possesses an odd integrand,
and the ones with $\xi^4$ are of at least the order $\epsilon$
smaller than the ones kept here.$^{13}$ If we expand
the left-hand side to first order in $\epsilon$, (25) becomes


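$$\psi(x, t) + \epsilon\frac{\partial\psi(x, t)}{\partial t} = \exp\left(\frac{-i\epsilon V(x)}{\hbar}\right)\frac{(2\pi\hbar\epsilon i/m)^{1/2}}{A} \times \left[\psi(x, t) + \frac{\hbar\epsilon i}{2m}\frac{\partial^2\psi(x, t)}{\partial x^2} + \cdots\right]. \tag{27}$$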


In order that both sides may agree to zero order
in $\epsilon$, we must set

$$A = (2\pi\hbar\epsilon i/m)^{1/2}. \tag{28}$$


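Then expanding the exponential containing $V(x)$,
we get

$$\psi(x, t) + \epsilon\frac{\partial\psi}{\partial t} = \left(1 - \frac{i\epsilon}{\hbar}V(x)\right) \times \left(\psi(x, t) + \frac{\hbar\epsilon i}{2m}\frac{\partial^2\psi}{\partial x^2}\right). \tag{29}$$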


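Canceling $\psi(x, t)$ from both sides, and comparing
terms to first order in $\epsilon$ and multiplying
by $-\hbar/i$ one obtains

$$-\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = \frac{1}{2m}\left(\frac{\hbar}{i}\frac{\partial}{\partial x}\right)^2\psi + V(x)\psi, \tag{30}$$

which is Schroedinger's equation for the problem
in question.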
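The Gaussian integrals (26) used in this derivation are oscillatory, and one way to check them numerically is to use the convergence factor obtained by replacing $\hbar$ with $\hbar(1 - i\delta)$. The sketch below is an illustration under assumed values ($a = m/2\hbar\epsilon = 1$, $\delta = 0.1$, a finite integration range, trapezoidal quadrature); the function names are hypothetical.

```python
import cmath


def fresnel_moment(power, a, delta, half_width=20.0, step=0.01):
    """Trapezoidal estimate of the integral of xi**power * exp(i a xi**2 / (1 - i delta))."""
    c = 1j * a / (1 - 1j * delta)             # real part is negative, so the integrand decays
    n = round(2 * half_width / step)
    total = 0j
    for k in range(n + 1):
        xi = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0  # end points carry half weight
        total += weight * xi ** power * cmath.exp(c * xi * xi)
    return total * step


def closed_form(power, a, delta):
    """Damped analogue of Eq. (26); reduces to Eq. (26) as delta -> 0."""
    c = 1j * a / (1 - 1j * delta)
    base = cmath.sqrt(cmath.pi / (-c))
    if power == 0:
        return base
    if power == 1:
        return 0j
    if power == 2:
        return base / (-2 * c)
    raise ValueError("only powers 0, 1, 2 are tabulated")


a, delta = 1.0, 0.1
for power in (0, 1, 2):
    print(power, fresnel_moment(power, a, delta), closed_form(power, a, delta))
```

As $\delta \to 0$ the closed forms tend to $(\pi i/a)^{1/2}$, $0$, and $(i/2a)(\pi i/a)^{1/2}$, which are the three results of (26) written with $a = m/2\hbar\epsilon$.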

The equation for $\chi^*$ can be developed in the
same way, but adding a factor decreases the time
by one step, i.e., $\chi^*$ satisfies an equation like (30)
but with the sign of the time reversed. By taking
complex conjugates we can conclude that $\chi$
satisfies the same equation as $\psi$, i.e., an experiment
can be defined by the particular state $\chi$ to
which it corresponds.$^{14}$


13 Really, these integrals are oscillatory and not defined,
but they may be defined by using a convergence factor.
Such a factor is automatically provided by $\psi(x - \xi, t)$ in
(24). If a more formal procedure is desired replace $\hbar$ by
$\hbar(1 - i\delta)$, for example, where $\delta$ is a small positive number,
and then let $\delta \to 0$.

14 Dr. Hartland Snyder has pointed out to me, in private
conversation, the very interesting possibility that there
may be a generalization of quantum mechanics in which the
states measured by experiment cannot be prepared; that

This example shows that most of the contribution
to $\psi(x_{k+1}, t+\epsilon)$ comes from values of $x_k$ in
$\psi(x_k, t)$ which are quite close to $x_{k+1}$ (distant of
order $\epsilon^{1/2}$) so that the integral equation (23) can, in
the limit, be replaced by a differential equation.
The "velocities," $(x_{k+1} - x_k)/\epsilon$ which are important
are very high, being of order $(\hbar/m\epsilon)^{1/2}$
which diverges as $\epsilon \to 0$. The paths involved are,
therefore, continuous but possess no derivative.
They are of a type familiar from study of
Brownian motion.

It is these large velocities which make it
so necessary to be careful in approximating
$S(x_{k+1}, x_k)$ from Eq. (11).$^{15}$ To replace $V(x_{k+1})$
by $V(x_k)$ would, of course, change the exponent
in (18) by $i\epsilon[V(x_k) - V(x_{k+1})]/\hbar$ which is of order
$\epsilon(x_{k+1} - x_k)$, and thus lead to unimportant terms
of higher order than $\epsilon$ on the right-hand side
of (29). It is for this reason that (20) and (21) are
equally satisfactory approximations to $S(x_{i+1}, x_i)$
when there is no vector potential. A term, linear
in velocity, however, arising from a vector
potential, as $A\dot{x}\,dt$ must be handled more carefully.
Here a term in $S(x_{k+1}, x_k)$ such as $A(x_{k+1})(x_{k+1} - x_k)$
differs from $A(x_k)(x_{k+1} - x_k)$ by a
term of order $(x_{k+1} - x_k)^2$, and, therefore, of
order $\epsilon$. Such a term would lead to a change in
the resulting wave equation. For this reason the
approximation (21) is not a sufficiently accurate
approximation to (11) and one like (20), (or (19)
from which (20) differs by terms of order higher
than $\epsilon$) must be used. If $\mathbf{A}$ represents the vector
potential and $\mathbf{p} = (\hbar/i)\nabla$, the momentum operator,
then (20) gives, in the Hamiltonian operator,
a term $(1/2m)(\mathbf{p} - (e/c)\mathbf{A})\cdot(\mathbf{p} - (e/c)\mathbf{A})$, while
(21) gives $(1/2m)(\mathbf{p}\cdot\mathbf{p} - (2e/c)\mathbf{A}\cdot\mathbf{p} + (e^2/c^2)\mathbf{A}\cdot\mathbf{A})$.
These two expressions differ by $(\hbar e/2imc)\nabla\cdot\mathbf{A}$


is, there would be no state into which a system may be put
for which a particular experiment gives certainty for a
result. The class of functions $\chi$ is not identical to the class
of available states $\psi$. This would result if, for example,
$\chi$ satisfied a different equation than $\psi$.

15 Equation (18) is actually exact when (11) is used for
$S(x_{i+1}, x_i)$ for arbitrary $\epsilon$ for cases in which the potential
does not involve $x$ to higher powers than the second
(e.g., free particle, harmonic oscillator). It is necessary,
however, to use a more accurate value of $A$. One can
define $A$ in this way. Assume classical particles with $k$
degrees of freedom start from the point $x_i, t_i$ with uniform
density in momentum space. Write the number of particles
having a given component of momentum in range $dp$ as
$dp/p_0$ with $p_0$ constant. Then $A = (2\pi\hbar i/p_0)^{k/2}\rho^{-1/2}$, where $\rho$
is the density in $k$ dimensional coordinate space $x_{i+1}$ of
these particles at time $t_{i+1}$.

which may not be zero. The question is still
more important in the coefficient of terms which
are quadratic in the velocities. In these terms
(19) and (20) are not sufficiently accurate representations
of (11) in general. It is when the
coefficients are constant that (19) or (20) can be
substituted for (11). If an expression such as
(19) is used, say for spherical coordinates, when
it is not a valid approximation to (11), one
obtains a Schroedinger equation in which the
Hamiltonian operator has some of the momentum
operators and coordinates in the wrong order.
Equation (11) then resolves the ambiguity in the
usual rule to replace $p$ and $q$ by the non-commuting
quantities $(\hbar/i)(\partial/\partial q)$ and $q$ in the classical
Hamiltonian $H(p, q)$.

It is clear that the statement (11) is independent
of the coordinate system. Therefore, to
find the differential wave equation it gives in
any coordinate system, the easiest procedure is
first to find the equations in Cartesian coordinates
and then to transform the coordinate system to
the one desired. It suffices, therefore, to show the
relation of the postulates and Schroedinger's
equation in rectangular coordinates.

The derivation given here for one dimension
can be extended directly to the case of three-dimensional
Cartesian coordinates for any number,
$K$, of particles interacting through potentials
with one another, and in a magnetic field,
described by a vector potential. The terms in
the vector potential require completing the square
in the exponent in the usual way for Gaussian
integrals. The variable $x$ must be replaced by
the set $x^{(1)}$ to $x^{(3K)}$ where $x^{(1)}, x^{(2)}, x^{(3)}$ are the
coordinates of the first particle of mass $m_1$, $x^{(4)}$,
$x^{(5)}, x^{(6)}$ of the second of mass $m_2$, etc. The
symbol $dx$ is replaced by $dx^{(1)}dx^{(2)}\cdots dx^{(3K)}$, and
the integration over $dx$ is replaced by a $3K$-fold
integral. The constant $A$ has, in this case, the
value $A = (2\pi\hbar\epsilon i/m_1)^{3/2}(2\pi\hbar\epsilon i/m_2)^{3/2}\cdots(2\pi\hbar\epsilon i/m_K)^{3/2}$.
The Lagrangian is the classical Lagrangian for
the same problem, and the Schroedinger equation
resulting will be that which corresponds to
the classical Hamiltonian, derived from this
Lagrangian. The equations in any other coordinate
system may be obtained by transformation.
Since this includes all cases for which Schroedinger's
equation has been checked with experiment,
we may say our postulates are able to
describe what can be described by non-relativistic
quantum mechanics, neglecting spin.


7. DISCUSSION OF THE WAVE EQUATION 
The Classical Limit 


This completes the demonstration of the equivalence
of the new and old formulations. We
should like to include in this section a few remarks
about the important equation (18).

This equation gives the development of the
wave function during a small time interval. It is
easily interpreted physically as the expression of
Huygens' principle for matter waves. In geometrical
optics the rays in an inhomogeneous
medium satisfy Fermat's principle of least time.
We may state Huygens' principle in wave optics
in this way: If the amplitude of the wave is
known on a given surface, the amplitude at a
nearby point can be considered as a sum of contributions
from all points of the surface. Each
contribution is delayed in phase by an amount
proportional to the time it would take the light to
get from the surface to the point along the ray of
least time of geometrical optics. We can consider
(22) in an analogous manner starting with
Hamilton's first principle of least action for
classical or "geometrical" mechanics. If the
amplitude of the wave $\psi$ is known on a given
"surface," in particular the "surface" consisting
of all $x$ at time $t$, its value at a particular nearby
point at time $t+\epsilon$, is a sum of contributions from
all points of the surface at $t$. Each contribution is
delayed in phase by an amount proportional to
the action it would require to get from the surface
to the point along the path of least action of
classical mechanics.$^{16}$

Actually Huygens' principle is not correct in
optics. It is replaced by Kirchhoff's modification
which requires that both the amplitude and its
derivative must be known on the adjacent surface.
This is a consequence of the fact that the
wave equation in optics is second order in the
time. The wave equation of quantum mechanics
is first order in the time; therefore, Huygens'
principle is correct for matter waves, action
replacing time.


16 See in this connection the very interesting remarks of 
Schroedinger, Ann. d. Physik 79, 489 (1926).

The equation can also be compared mathematically
to quantities appearing in the usual
formulations. In Schroedinger's method the development
of the wave function with time is
given by

$$-\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = H\psi, \tag{31}$$

which has the solution (for any $\epsilon$ if $H$ is time
independent)

$$\psi(x, t+\epsilon) = \exp(-i\epsilon H/\hbar)\,\psi(x, t). \tag{32}$$

Therefore, Eq. (18) expresses the operator
$\exp(-i\epsilon H/\hbar)$ by an approximate integral operator
for small $\epsilon$.

From the point of view of Heisenberg one considers
the position at time $t$, for example, as an
operator $x$. The position $x'$ at a later time $t+\epsilon$ can
be expressed in terms of that at time $t$ by the
operator equation

$$x' = \exp(i\epsilon H/\hbar)\,x\,\exp(-i\epsilon H/\hbar). \tag{33}$$


The transformation theory of Dirac allows us to
consider the wave function at time $t+\epsilon$, $\psi(x', t+\epsilon)$,
as representing a state in a representation in
which $x'$ is diagonal, while $\psi(x, t)$ represents the
same state in a representation in which $x$ is
diagonal. They are, therefore, related through the
transformation function $(x'|x)_\epsilon$ which relates
these representations:

$$\psi(x', t+\epsilon) = \int (x'|x)_\epsilon\,\psi(x, t)\,dx.$$

Therefore, the content of Eq. (18) is to show that
for small $\epsilon$ we can set

$$(x'|x)_\epsilon = (1/A)\exp(iS(x', x)/\hbar) \tag{34}$$

with $S(x', x)$ defined as in (11).

The close analogy between $(x'|x)_\epsilon$ and the
quantity $\exp(iS(x', x)/\hbar)$ has been pointed out on
several occasions by Dirac.$^{1}$ In fact, we now see
that to sufficient approximations the two quantities
may be taken to be proportional to each
other. Dirac's remarks were the starting point of
the present development. The points he makes
concerning the passage to the classical limit $\hbar \to 0$
are very beautiful, and I may perhaps be excused
for briefly reviewing them here.

First we note that the wave function at $x''$ at
time $t''$ can be obtained from that at $x'$ at time
$t'$ by

$$\psi(x'', t'') = \lim_{\epsilon \to 0} \int \cdots \int \exp\left[\frac{i}{\hbar}\sum_{i=0}^{j-1} S(x_{i+1}, x_i)\right] \psi(x', t')\,\frac{dx_0}{A}\,\frac{dx_1}{A} \cdots \frac{dx_{j-1}}{A}, \tag{35}$$


where we put $x_0 = x'$ and $x_j = x''$ where $j\epsilon = t'' - t'$
(between the times $t'$ and $t''$ we assume no restriction
is being put on the region of integration).
This can be seen either by repeated applications
of (18) or directly from Eq. (15). Now we ask, as
$\hbar \to 0$ what values of the intermediate coordinates
$x_i$ contribute most strongly to the integral? These
will be the values most likely to be found by experiment
and therefore will determine, in the
limit, the classical path. If $\hbar$ is very small, the
exponent will be a very rapidly varying function
of any of its variables $x_i$. As $x_i$ varies, the positive
and negative contributions of the exponent
nearly cancel. The region at which $x_i$ contributes
most strongly is that at which the phase of the
exponent varies least rapidly with $x_i$ (method of
stationary phase). Call the sum in the exponent $S$;

$$S = \sum_{i=0}^{j-1} S(x_{i+1}, x_i). \tag{36}$$

Then the classical orbit passes, approximately,
through those points $x_i$ at which the rate of
change of $S$ with $x_i$ is small, or in the limit of
small $\hbar$, zero, i.e., the classical orbit passes
through the points at which $\partial S/\partial x_i = 0$ for all $x_i$.
Taking the limit $\epsilon \to 0$, (36) becomes in view
of (11)

$$S = \int_{t'}^{t''} L(\dot{x}(t), x(t))\,dt. \tag{37}$$

We see then that the classical path is that for
which the integral (37) suffers no first-order
change on varying the path. This is Hamilton's
principle and leads directly to the Lagrangian
equations of motion.
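The stationarity condition $\partial S/\partial x_i = 0$ can be checked numerically for a simple case. The sketch below assumes a particle in a uniform field, $V(x) = mgx$, uses the short-time action (22) in the sum (36), and estimates the derivatives by central differences; the numbers, the test path, and the function names are illustrative assumptions.

```python
def discretized_action(path, eps, mass=1.0, g=9.8):
    """Sum (36) built from the short-time action (22) with V(x) = m g x."""
    total = 0.0
    for x_prev, x_next in zip(path, path[1:]):
        kinetic = 0.5 * mass * (x_next - x_prev) ** 2 / eps
        total += kinetic - eps * mass * g * x_next
    return total


def action_gradient(path, eps, h=1e-4):
    """Central-difference estimate of dS/dx_i at the interior points (end points held fixed)."""
    grads = []
    for i in range(1, len(path) - 1):
        up = list(path)
        up[i] += h
        down = list(path)
        down[i] -= h
        grads.append((discretized_action(up, eps) - discretized_action(down, eps)) / (2 * h))
    return grads


eps, steps, g = 0.05, 20, 9.8
# Classical trajectory x(t) = v0 t - g t^2 / 2 sampled at t_i = i * eps, with v0 = 3 assumed.
classical = [3.0 * (i * eps) - 0.5 * g * (i * eps) ** 2 for i in range(steps + 1)]
bumped = list(classical)
bumped[10] += 0.1  # displace one intermediate point away from the classical path

print(max(abs(d) for d in action_gradient(classical, eps)))
print(max(abs(d) for d in action_gradient(bumped, eps)))
```

Under these assumptions the derivatives vanish (to rounding error) on the sampled classical trajectory and are of order unity once a single intermediate point is displaced, which is the discrete form of the statement that the classical path makes $S$ stationary.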
8. OPERATOR ALGEBRA
Matrix Elements


Given the wave function and Schroedinger's 
equation, of course all of the machinery of 
operator or matrix algebra can be developed. It 
is, however, rather interesting to express these 
concepts in a somewhat different language more
closely related to that used in stating the postulates.
Little will be gained by this in elucidating
operator algebra. In fact, the results are simply a
translation of simple operator equations into a
somewhat more cumbersome notation. On the
other hand, the new notation and point of view
are very useful in certain applications described
in the introduction. Furthermore, the form of the
equations permits natural extension to a wider 
class of operators than is usually considered 
(e.g., ones involving quantities referring to two or 
more different times). If any generalization to a 
wider class of action functionals is possible, the 
formulae to be developed will play an important 
role. 

We discuss these points in the next three 
sections. This section is concerned mainly with 
definitions. We shall define a quantity which we 
call a transition element between two states. It 
is essentially a matrix element. But instead of 
being the matrix element between a state y and 
another x corresponding to the same time, these 
two states will refer to different times. In the
following section a fundamental relation between 
transition elements will be developed from which 
the usual commutation rules between coordinate 
and momentum may be deduced. The same 
relation also yields Newton's equation of motion 
in matrix form. Finally, in Section 10 we discuss 
the relation of the Hamiltonian to the operation 
of displacement in time. 

We begin by defining a transition element in
terms of the probability of transition from one
state to another. More precisely, suppose we have
a situation similar to that described in deriving
(17). The region $R$ consists of a region $R'$ previous
to $t'$, all space between $t'$ and $t''$ and the region $R''$
after $t''$. We shall study the probability that a
system in region $R'$ is later found in region $R''$.
This is given by (17). We shall discuss in this
section how it changes with changes in the form
of the Lagrangian between $t'$ and $t''$. In Section 10
we discuss how it changes with changes in the
preparation $R'$ or the experiment $R''$.

The state at time $t'$ is defined completely by the
preparation $R'$. It can be specified by a wave
function $\psi(x', t')$ obtained as in (15), but containing
only integrals up to the time $t'$. Likewise,
the state characteristic of the experiment (region
$R''$) can be defined by a function $\chi(x'', t'')$ obtained
from (16) with integrals only beyond $t''$.
The wave function $\psi(x'', t'')$ at time $t''$ can, of
course, also be gotten by appropriate use of (15).
It can also be gotten from $\psi(x', t')$ by (35). According
to (17) with $t''$ used instead of $t$, the
probability of being found in $\chi$ if prepared in $\psi$ is
the square of what we shall call the transition
amplitude $\int \chi^*(x'', t'')\psi(x'', t'')\,dx''$. We wish to
express this in terms of $\chi$ at $t''$ and $\psi$ at $t'$. This we
can do with the aid of (35). Thus, the chance that
a system prepared in state $\psi_{t'}$ at time $t'$ will be
found after $t''$ to be in a state $\chi_{t''}$ is the square of
the transition amplitude

$$\langle \chi_{t''} | 1 | \psi_{t'} \rangle_S = \lim_{\epsilon \to 0} \int \cdots \int \chi^*(x'', t'')\,\exp(iS/\hbar)\,\psi(x', t')\,\frac{dx_0}{A} \cdots \frac{dx_{j-1}}{A}\,dx_j, \tag{38}$$


where we have used the abbreviation (36). 

In the language of ordinary quantum mechanics if the Hamiltonian, $H$, is constant, $\psi(x, t'') = \exp[-i(t''-t')H/\hbar]\,\psi(x, t')$ so that (38) is the matrix element of $\exp[-i(t''-t')H/\hbar]$ between states $\chi_{t''}$ and $\psi_{t'}$.

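If $F$ is any function of the coordinates $x_i$ for $t' < t_i < t''$, we shall define the transition element of $F$ between the states $\psi$ at $t'$ and $\chi$ at $t''$ for the action $S$ as ($x'' \equiv x_j$, $x' \equiv x_0$):

$$\langle \chi_{t''} | F | \psi_{t'} \rangle_S = \lim_{\varepsilon \to 0} \int \cdots \int \chi^*(x'', t'')\, F(x_0, x_1, \cdots, x_j)\, \exp\left[\frac{i}{\hbar} \sum_{i=0}^{j-1} S(x_{i+1}, x_i)\right] \psi(x', t')\, \frac{dx_0}{A} \cdots \frac{dx_{j-1}}{A}\, dx_j. \tag{39}$$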


In the limit $\varepsilon \to 0$, $F$ is a functional of the path $x(t)$. We shall see presently why such quantities are important. It will be easier to understand if we stop for a moment to find out what the quantities correspond to in conventional notation. Suppose $F$ is simply $x_k$, where $k$ corresponds to some time $t = t_k$. Then on the right-hand side of (39) the integrals from $x_0$ to $x_{k-1}$ may be performed to produce $\psi(x_k, t)$ or $\exp[-i(t-t')H/\hbar]\,\psi_{t'}$. In like manner the integrals on $x_i$ for $j \ge i > k$ give $\chi^*(x_k, t)$ or $\{\exp[-i(t-t'')H/\hbar]\,\chi_{t''}\}^*$. Thus, the transition element of $x_k$,

$$\langle \chi_{t''} | x_k | \psi_{t'} \rangle_S = \int \chi_{t''}^*\, e^{-(i/\hbar)H(t''-t)}\, x\, e^{-(i/\hbar)H(t-t')}\, \psi_{t'}\, dx = \int \chi^*(x, t)\, x\, \psi(x, t)\, dx \tag{40}$$

is the matrix element of $x$ at time $t = t_k$ between the state which would develop at time $t$ from $\psi_{t'}$ at $t'$ and the state which will develop from time $t$ to $\chi_{t''}$ at $t''$. It is, therefore, the matrix element of $x(t)$ between these states.

Likewise, according to (39) with $F = x_{k+1}$, the transition element of $x_{k+1}$ is the matrix element of $x(t+\varepsilon)$. The transition element of $F = (x_{k+1} - x_k)/\varepsilon$ is the matrix element of $(x(t+\varepsilon) - x(t))/\varepsilon$ or of $i(Hx - xH)/\hbar$, as is easily shown from (40). We can call this the matrix element of velocity $\dot{x}(t)$.
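The identification of $i(Hx - xH)/\hbar$ with the velocity can be checked numerically in a concrete case. The following is a minimal sketch, not part of the original argument: it assumes a harmonic oscillator written in its own energy basis (a choice made here only because the matrices are simple), and the function names and parameter values are illustrative. It confirms that $i(Hx - xH)/\hbar$ agrees with $p/m$ entry by entry.

```python
import math


def oscillator_matrices(n_levels, m=2.0, omega=1.5, hbar=1.0):
    """Return (H, x, p) in the oscillator energy basis, as nested lists."""
    x0 = math.sqrt(hbar / (2.0 * m * omega))  # x = x0 (a + a^dagger)
    p0 = math.sqrt(m * hbar * omega / 2.0)    # p = i p0 (a^dagger - a)
    H = [[0j] * n_levels for _ in range(n_levels)]
    x = [[0j] * n_levels for _ in range(n_levels)]
    p = [[0j] * n_levels for _ in range(n_levels)]
    for n in range(n_levels):
        H[n][n] = hbar * omega * (n + 0.5)
        if n + 1 < n_levels:
            amp = math.sqrt(n + 1.0)
            x[n + 1][n] = x[n][n + 1] = x0 * amp
            p[n + 1][n] = 1j * p0 * amp
            p[n][n + 1] = -1j * p0 * amp
    return H, x, p


def matmul(a, b):
    """Plain matrix product of two square nested lists."""
    size = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def velocity_gap(n_levels=6, m=2.0, omega=1.5, hbar=1.0):
    """Largest entry of |i(Hx - xH)/hbar - p/m|."""
    H, x, p = oscillator_matrices(n_levels, m, omega, hbar)
    Hx, xH = matmul(H, x), matmul(x, H)
    return max(abs(1j * (Hx[r][c] - xH[r][c]) / hbar - p[r][c] / m)
               for r in range(n_levels) for c in range(n_levels))
```

Because $H$ is diagonal in this basis, the truncation to a finite number of levels does not spoil the comparison; for a general Hamiltonian a truncated basis would only give approximate agreement.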

Suppose we consider a second problem which differs from the first because, for example, the potential is augmented by a small amount $U(x, t)$. Then in the new problem the quantity replacing $S$ is $S' = S + \sum_i \varepsilon U(x_i, t_i)$. Substitution into (38) leads directly to

$$\langle \chi_{t''} | 1 | \psi_{t'} \rangle_{S'} = \left\langle \chi_{t''} \left| \exp\left[\frac{i\varepsilon}{\hbar} \sum_{i=1}^{j} U(x_i, t_i)\right] \right| \psi_{t'} \right\rangle_S. \tag{41}$$

Thus, transition elements such as (39) are important insofar as $F$ may arise in some way from a change $\delta S$ in an action expression. We denote, by observable functionals, those functionals $F$ which can be defined (possibly indirectly) in terms of the changes which are produced by possible changes in the action $S$. The condition that a functional be observable is somewhat similar to the condition that an operator be Hermitian. The observable functionals are a restricted class because the action must remain a quadratic function of velocities. From one observable functional others may be derived, for example, by


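$$\langle \chi_{t''} | F | \psi_{t'} \rangle_{S'} = \left\langle \chi_{t''} \left| F \exp\left[\frac{i\varepsilon}{\hbar} \sum_{i=1}^{j} U(x_i, t_i)\right] \right| \psi_{t'} \right\rangle_S \tag{42}$$

which is obtained from (39).

Incidentally, (41) leads directly to an important perturbation formula. If the effect of $U$ is small the exponential can be expanded to first order in $U$ and we find

$$\langle \chi_{t''} | 1 | \psi_{t'} \rangle_{S'} = \langle \chi_{t''} | 1 | \psi_{t'} \rangle_S + \frac{i}{\hbar} \left\langle \chi_{t''} \left| \sum_i \varepsilon U(x_i, t_i) \right| \psi_{t'} \right\rangle_S. \tag{43}$$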


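Of particular importance is the case that $\chi_{t''}$ is a state in which $\psi_{t'}$ would not be found at all were it not for the disturbance, $U$ (i.e., $\langle \chi_{t''} | 1 | \psi_{t'} \rangle_S = 0$). Then

$$\frac{1}{\hbar^2} \left| \left\langle \chi_{t''} \left| \sum_i \varepsilon U(x_i, t_i) \right| \psi_{t'} \right\rangle_S \right|^2 \tag{44}$$

is the probability of transition as induced to first order by the perturbation. In ordinary notation,

$$\left\langle \chi_{t''} \left| \sum_i \varepsilon U(x_i, t_i) \right| \psi_{t'} \right\rangle_S = \int \left\{ \int \chi_{t''}^*\, e^{-(i/\hbar)(t''-t)H}\, U\, e^{-(i/\hbar)(t-t')H}\, \psi_{t'}\, dx \right\} dt$$

so that (44) reduces to the usual expression¹⁷ for time dependent perturbations.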


9. NEWTON’S EQUATIONS 
The Commutation Relation 


In this section we find that different functionals may give identical results when taken between any two states. This equivalence between functionals is the statement of operator equations in the new language.

¹⁷ P. A. M. Dirac, The Principles of Quantum Mechanics (The Clarendon Press, Oxford, 1935), second edition, Section 47, Eq. (20).

If $F$ depends on the various coordinates, we can, of course, define a new functional $\partial F/\partial x_k$ by differentiating it with respect to one of its variables, say $x_k$ ($0 < k < j$). If we calculate $\langle \chi_{t''} | \partial F/\partial x_k | \psi_{t'} \rangle_S$ by (39) the integral on the right-hand side will contain $\partial F/\partial x_k$. The only other place that the variable $x_k$ appears is in $S$. Thus, the integration on $x_k$ can be performed by parts. The integrated part vanishes (assuming wave functions vanish at infinity) and we are left with the quantity $-F\,(\partial/\partial x_k) \exp(iS/\hbar)$ in the integral. However, $(\partial/\partial x_k) \exp(iS/\hbar) = (i/\hbar)(\partial S/\partial x_k) \exp(iS/\hbar)$, so the right side represents the transition element of $-(i/\hbar) F (\partial S/\partial x_k)$, i.e.,

$$\left\langle \chi_{t''} \left| \frac{\partial F}{\partial x_k} \right| \psi_{t'} \right\rangle_S = -\frac{i}{\hbar} \left\langle \chi_{t''} \left| F \frac{\partial S}{\partial x_k} \right| \psi_{t'} \right\rangle_S. \tag{45}$$


This very important relation shows that two different functionals may give the same result for the transition element between any two states. We say they are equivalent and symbolize the relation by

$$-\frac{\hbar}{i} \frac{\partial F}{\partial x_k} \underset{S}{\leftrightarrow} F \frac{\partial S}{\partial x_k}, \tag{46}$$

the symbol $\underset{S}{\leftrightarrow}$ emphasizing the fact that functionals equivalent under one action may not be equivalent under another. The quantities in (46) need not be observable. The equivalence is, nevertheless, true. Making use of (36) one can write

$$-\frac{\hbar}{i} \frac{\partial F}{\partial x_k} \underset{S}{\leftrightarrow} F \left[ \frac{\partial S(x_{k+1}, x_k)}{\partial x_k} + \frac{\partial S(x_k, x_{k-1})}{\partial x_k} \right]. \tag{47}$$
This equation is true to zero and first order in $\varepsilon$ and has as consequences the commutation relations of momentum and coordinate, as well as the Newtonian equations of motion in matrix form.

In the case of our simple one-dimensional problem, $S(x_{i+1}, x_i)$ is given by the expression (15), so that

$$\partial S(x_{k+1}, x_k)/\partial x_k = -m(x_{k+1} - x_k)/\varepsilon,$$

and

$$\partial S(x_k, x_{k-1})/\partial x_k = +m(x_k - x_{k-1})/\varepsilon - \varepsilon V'(x_k);$$

where we write $V'(x)$ for the derivative of the potential (the force being $-V'(x)$). Then (47) becomes

$$-\frac{\hbar}{i} \frac{\partial F}{\partial x_k} \underset{S}{\leftrightarrow} F \left[ -m\left(\frac{x_{k+1} - x_k}{\varepsilon}\right) + m\left(\frac{x_k - x_{k-1}}{\varepsilon}\right) - \varepsilon V'(x_k) \right]. \tag{48}$$


If $F$ does not depend on the variable $x_k$, this gives Newton's equations of motion. For example, if $F$ is constant, say unity, (48) just gives (dividing by $\varepsilon$)

$$0 \underset{S}{\leftrightarrow} -\frac{m}{\varepsilon} \left( \frac{x_{k+1} - x_k}{\varepsilon} - \frac{x_k - x_{k-1}}{\varepsilon} \right) - V'(x_k).$$

Thus, the transition element of mass times acceleration $[(x_{k+1} - x_k)/\varepsilon - (x_k - x_{k-1})/\varepsilon]/\varepsilon$ between any two states is equal to the transition element of force $-V'(x_k)$ between the same states. This is the matrix expression of Newton's law which holds in quantum mechanics.
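The bracket on the right of (48) is simply the derivative of the discretized action with respect to the intermediate coordinate $x_k$, and it vanishes on a path that obeys the discrete form of Newton's law. The sketch below illustrates both statements with ordinary numbers rather than transition elements. The quartic potential, the sample path, and all function names are hypothetical choices made for illustration; the one-step action follows the form of expression (15), with the potential evaluated at the later point.

```python
def potential(x):
    """Hypothetical potential V(x), used only for illustration."""
    return 0.5 * x * x + 0.1 * x ** 4


def potential_prime(x):
    """Derivative V'(x) of the illustrative potential."""
    return x + 0.4 * x ** 3


def step_action(x_next, x_prev, eps, m=1.0):
    """One-step action S(x_{i+1}, x_i) in the form of expression (15)."""
    return 0.5 * m * (x_next - x_prev) ** 2 / eps - eps * potential(x_next)


def total_action(path, eps, m=1.0):
    """Sum of the one-step actions along a discretized path."""
    return sum(step_action(path[i + 1], path[i], eps, m)
               for i in range(len(path) - 1))


def bracket_48(path, k, eps, m=1.0):
    """Bracket of (48): the analytic derivative of the action in x_k."""
    return (-m * (path[k + 1] - path[k]) / eps
            + m * (path[k] - path[k - 1]) / eps
            - eps * potential_prime(path[k]))


def numeric_derivative(path, k, eps, m=1.0, h=1e-6):
    """Central-difference derivative of the total action in x_k."""
    up, down = list(path), list(path)
    up[k] += h
    down[k] -= h
    return (total_action(up, eps, m) - total_action(down, eps, m)) / (2.0 * h)


def classical_path(x0, x1, n_steps, eps, m=1.0):
    """Path generated by the discrete Newton law that makes the bracket zero."""
    path = [x0, x1]
    for _ in range(n_steps - 1):
        path.append(2.0 * path[-1] - path[-2]
                    - eps ** 2 * potential_prime(path[-1]) / m)
    return path
```

On an arbitrary path the analytic bracket and the numerical derivative agree; on the path produced by `classical_path` the bracket is zero at every interior point, which is the classical counterpart of the equivalence stated above.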

What happens if $F$ does depend upon $x_k$? For example, let $F = x_k$. Then (48) gives, since $\partial F/\partial x_k = 1$,

$$-\frac{\hbar}{i} \underset{S}{\leftrightarrow} x_k \left[ -m\left(\frac{x_{k+1} - x_k}{\varepsilon}\right) + m\left(\frac{x_k - x_{k-1}}{\varepsilon}\right) - \varepsilon V'(x_k) \right]$$

or, neglecting terms of order $\varepsilon$,

$$m\left(\frac{x_{k+1} - x_k}{\varepsilon}\right) x_k - m\left(\frac{x_k - x_{k-1}}{\varepsilon}\right) x_k \underset{S}{\leftrightarrow} \frac{\hbar}{i}. \tag{49}$$

In order to transfer an equation such as (49) into conventional notation, we shall have to discover what matrix corresponds to a quantity such as $x_k x_{k+1}$. It is clear from a study of (39) that if $F$ is set equal to, say, $f(x_k) g(x_{k+1})$, the corresponding operator in (40) is

$$e^{-(i/\hbar)(t''-t-\varepsilon)H}\, g(x)\, e^{-(i/\hbar)\varepsilon H}\, f(x)\, e^{-(i/\hbar)(t-t')H},$$

the matrix element being taken between the states $\chi_{t''}$ and $\psi_{t'}$. The operators corresponding to functions of $x_{k+1}$ will appear to the left of the operators corresponding to functions of $x_k$, i.e., the order of terms in a matrix operator product corresponds to an order in time of the corresponding factors in a functional. Thus, if the functional can be and is written in such a way that in each term factors corresponding to later times appear to the left of factors corresponding to earlier times, the corresponding operator can immediately be written down if the order of the operators is kept the same as in the functional.¹⁸ Obviously, the order of factors in a functional is of no consequence. The ordering just facilitates translation into conventional operator notation. To write Eq. (49) in the way desired for easy translation would require the factors in the second term on the left to be reversed in order. We see, therefore, that it corresponds to

$$px - xp = \hbar/i$$

where we have written $p$ for the operator $m\dot{x}$.

¹⁸ Dirac has also studied operators containing quantities referring to different times.

The relation between functionals and the corresponding operators is defined above in terms of the order of the factors in time. It should be remarked that this rule must be especially carefully adhered to when quantities involving velocities or higher derivatives are involved. The correct functional to represent the operator $(\dot{x})^2$ is actually $(x_{k+1} - x_k)/\varepsilon \cdot (x_k - x_{k-1})/\varepsilon$ rather than $[(x_{k+1} - x_k)/\varepsilon]^2$. The latter quantity diverges as $1/\varepsilon$ as $\varepsilon \to 0$. This may be seen by replacing the second term in (49) by its value $x_{k+1} \cdot m(x_{k+1} - x_k)/\varepsilon$ calculated an instant $\varepsilon$ later in time. This does not change the equation to zero order in $\varepsilon$. We then obtain (dividing by $\varepsilon$)

$$\left(\frac{x_{k+1} - x_k}{\varepsilon}\right)^2 \underset{S}{\leftrightarrow} -\frac{\hbar}{i m \varepsilon}. \tag{50}$$

This gives the result expressed earlier that the root mean square of the "velocity" $(x_{k+1} - x_k)/\varepsilon$ between two successive positions of the path is of order $\varepsilon^{-1/2}$.

It will not do then to write the functional for kinetic energy, say, simply as

$$\tfrac{1}{2} m [(x_{k+1} - x_k)/\varepsilon]^2 \tag{51}$$

for this quantity is infinite as $\varepsilon \to 0$. In fact, it is not an observable functional.

One can obtain the kinetic energy as an observable functional by considering the first-order change in transition amplitude occasioned by a change in the mass of the particle. Let $m$ be changed to $m(1+\delta)$ for a short time, say $\varepsilon$, around $t_k$. The change in the action is $\tfrac{1}{2} \delta \varepsilon m [(x_{k+1} - x_k)/\varepsilon]^2$, the derivative of which gives an expression like (51). But the change in $m$ changes the normalization constant $1/A$ corresponding to $dx_k$ as well as the action. The constant is changed from $(2\pi\hbar\varepsilon i/m)^{-1/2}$ to $(2\pi\hbar\varepsilon i/m(1+\delta))^{-1/2}$ or by $\tfrac{1}{2}\delta (2\pi\hbar\varepsilon i/m)^{-1/2}$ to first order in $\delta$. The total effect of the change in mass in Eq. (38) to the first order in $\delta$ is

$$\langle \chi_{t''} |\, \tfrac{1}{2} \delta \varepsilon i m [(x_{k+1} - x_k)/\varepsilon]^2/\hbar + \tfrac{1}{2}\delta \,| \psi_{t'} \rangle.$$

We expect the change of order $\delta$ lasting for a time $\varepsilon$ to be of order $\delta\varepsilon$. Hence, dividing by $\delta \varepsilon i/\hbar$, we can define the kinetic energy functional as

$$\mathrm{K.E.} = \tfrac{1}{2} m [(x_{k+1} - x_k)/\varepsilon]^2 + \hbar/2\varepsilon i. \tag{52}$$

This is finite as $\varepsilon \to 0$ in view of (50). By making use of an equation which results from substituting $m(x_{k+1} - x_k)/\varepsilon$ for $F$ in (48) we can also show that the expression (52) is equal (to order $\varepsilon$) to

$$\mathrm{K.E.} = \tfrac{1}{2} m \left(\frac{x_{k+1} - x_k}{\varepsilon}\right) \left(\frac{x_k - x_{k-1}}{\varepsilon}\right). \tag{53}$$

That is, the easiest way to produce observable functionals involving powers of the velocities is to replace these powers by a product of velocities, each factor of which is taken at a slightly different time.
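The divergence in (50) and its cancellation in (52) can be checked with a single Gaussian integral. The worked step below is a sketch under a stated assumption: for small $\varepsilon$ the weight attached to one step is taken to be dominated by the free-particle factor $\exp[im(x_{k+1}-x_k)^2/2\hbar\varepsilon]/A$, with the integral understood with the usual convergence factor. The shorthand $\Delta = x_{k+1} - x_k$ and the overbar for the single-step average are introduced here for illustration and do not appear in the original.

```latex
\begin{aligned}
\overline{\Delta^2}
  &\equiv \frac{1}{A}\int_{-\infty}^{\infty} \Delta^2
     \exp\!\left(\frac{i m \Delta^2}{2\hbar\varepsilon}\right) d\Delta
   = \frac{i\hbar\varepsilon}{m},
  \qquad A = \left(\frac{2\pi i\hbar\varepsilon}{m}\right)^{1/2}, \\
\overline{\left(\frac{\Delta}{\varepsilon}\right)^{2}}
  &= \frac{i\hbar}{m\varepsilon} = -\frac{\hbar}{i m \varepsilon}, \\
\frac{m}{2}\,\overline{\left(\frac{\Delta}{\varepsilon}\right)^{2}}
  + \frac{\hbar}{2\varepsilon i}
  &= \frac{i\hbar}{2\varepsilon} - \frac{i\hbar}{2\varepsilon} = 0 .
\end{aligned}
```

The second line reproduces the right side of (50), and the third shows that the added term $\hbar/2\varepsilon i$ in (52) removes exactly the part of $\tfrac{1}{2}m(\Delta/\varepsilon)^2$ that grows as $1/\varepsilon$.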


10. THE HAMILTONIAN 
Momentum 


The Hamiltonian operator is of central im- 
portance in the usual formulation of quantum 
mechanics. We shall study in this section the 
functional corresponding to this operator. We 
could immediately define the Hamiltonian func- 
tional by adding the kinetic energy functional 
(52) or (53) to the potential energy. This method 
is artificial and does not exhibit the important 
relationship of the Hamiltonian to time. We shall 
define the Hamiltonian functional by the changes 
made in a state when it is displaced in time. 

To do this we shall have to digress a moment to point out that the subdivision of time into equal intervals is not necessary. Clearly, any subdivision into instants $t_i$ will be satisfactory; the limits are to be taken as the largest spacing, $t_{i+1} - t_i$, approaches zero. The total action $S$ must now be represented as a sum

$$S = \sum_i S(x_{i+1}, t_{i+1}; x_i, t_i), \tag{54}$$

where

$$S(x_{i+1}, t_{i+1}; x_i, t_i) = \int_{t_i}^{t_{i+1}} L(\dot{x}(t), x(t))\, dt, \tag{55}$$

the integral being taken along the classical path between $x_i$ at $t_i$ and $x_{i+1}$ at $t_{i+1}$. For the simple one-dimensional example this becomes, with sufficient accuracy,

$$S(x_{i+1}, t_{i+1}; x_i, t_i) = \left\{ \frac{m}{2} \left( \frac{x_{i+1} - x_i}{t_{i+1} - t_i} \right)^2 - V(x_{i+1}) \right\} (t_{i+1} - t_i); \tag{56}$$

the corresponding normalization constant for integration on $dx_i$ is $A = (2\pi\hbar i (t_{i+1} - t_i)/m)^{1/2}$.

The relation of $H$ to the change in a state with displacement in time can now be studied. Consider a state $\psi(t)$ defined by a space-time region $R'$. Now imagine that we consider another state at time $t$, $\psi_\delta(t)$, defined by another region $R_\delta'$. Suppose the region $R_\delta'$ is exactly the same as $R'$ except that it is earlier by a time $\delta$, i.e., displaced bodily toward the past by a time $\delta$. All the apparatus to prepare the system for $R_\delta'$ is identical to that for $R'$ but is operated a time $\delta$ sooner. If $L$ depends explicitly on time, it, too, is to be displaced, i.e., the state $\psi_\delta$ is obtained from the $L$ used for state $\psi$ except that the time $t$ in $L_\delta$ is replaced by $t + \delta$. We ask how does the state $\psi_\delta$ differ from $\psi$? In any measurement the chance of finding the system in a fixed region $R''$ is different for $R'$ and $R_\delta'$. Consider the change in the transition element $\langle \chi | 1 | \psi_\delta \rangle_{S_\delta}$ produced by the shift $\delta$. We can consider this shift as effected by decreasing all values of $t_i$ by $\delta$ for $i \le k$ and leaving all $t_i$ fixed for $i > k$, where the time $t$ lies in the interval between $t_{k+1}$ and $t_k$.¹⁹ This change will have no effect on $S(x_{i+1}, t_{i+1}; x_i, t_i)$ as defined by (55) as long as both $t_{i+1}$ and $t_i$ are changed by the same amount. On the other hand, $S(x_{k+1}, t_{k+1}; x_k, t_k)$ is changed to $S(x_{k+1}, t_{k+1}; x_k, t_k - \delta)$. The constant $1/A$ for the integration on $dx_k$ is also altered to $(2\pi\hbar i (t_{k+1} - t_k + \delta)/m)^{-1/2}$. The effect of these changes on the transition element is given to the first order in $\delta$ by

$$\langle \chi | 1 | \psi \rangle_S - \langle \chi | 1 | \psi_\delta \rangle_{S_\delta} = \frac{i\delta}{\hbar} \langle \chi | H_k | \psi \rangle_S, \tag{57}$$

where the Hamiltonian functional $H_k$ is defined by

$$H_k = \frac{\partial S(x_{k+1}, t_{k+1}; x_k, t_k)}{\partial t_k} + \frac{\hbar}{2i(t_{k+1} - t_k)}. \tag{58}$$

The last term is due to the change in $1/A$ and serves to keep $H_k$ finite as $\varepsilon \to 0$. For example, for the expression (56) this becomes

$$H_k = \frac{m}{2} \left( \frac{x_{k+1} - x_k}{t_{k+1} - t_k} \right)^2 + \frac{\hbar}{2i(t_{k+1} - t_k)} + V(x_{k+1}),$$

which is just the sum of the kinetic energy functional (52) and that of the potential energy $V(x_{k+1})$.

¹⁹ From the point of view of mathematical rigor, if $\delta$ is finite, as $\varepsilon \to 0$ one gets into difficulty in that, for example, the interval $t_{k+1} - t_k$ is kept finite. This can be straightened out by assuming $\delta$ to vary with time and to be turned on smoothly before $t = t_k$ and turned off smoothly after $t = t_k$. Then keeping the time variation of $\delta$ fixed, let $\varepsilon \to 0$. Then seek the first-order change as $\delta \to 0$. The result is essentially the same as that of the crude procedure used above.
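The classical part of (58) can be checked by differentiating the approximate action (56) numerically with respect to the earlier time $t_k$. The sketch below does this by a central difference and compares the result with kinetic plus potential energy. The quadratic potential, the sample numbers, and the function names are assumptions made for illustration; the term $\hbar/2i(t_{k+1}-t_k)$, which comes from the normalization constant rather than from the action, is not included.

```python
def potential(x):
    """Hypothetical potential V(x), chosen only for illustration."""
    return 0.5 * x * x


def step_action(x_next, t_next, x_prev, t_prev, m=1.0):
    """Approximate action (56) for a single interval."""
    dt = t_next - t_prev
    return (0.5 * m * ((x_next - x_prev) / dt) ** 2 - potential(x_next)) * dt


def dS_dt_prev(x_next, t_next, x_prev, t_prev, m=1.0, h=1e-6):
    """Central-difference estimate of dS/dt_k, the first term of (58)."""
    return (step_action(x_next, t_next, x_prev, t_prev + h, m)
            - step_action(x_next, t_next, x_prev, t_prev - h, m)) / (2.0 * h)


def kinetic_plus_potential(x_next, t_next, x_prev, t_prev, m=1.0):
    """(m/2) (dx/dt)^2 + V(x_{k+1}), the classical energy for the interval."""
    dt = t_next - t_prev
    return 0.5 * m * ((x_next - x_prev) / dt) ** 2 + potential(x_next)
```

With, say, `x_next=0.7, t_next=0.25, x_prev=0.4, t_prev=0.2`, the two functions agree to within the accuracy of the finite difference, which is the content of the example following (58).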

The wave function $\psi_\delta(x, t)$ represents, of course, the same state as $\psi(x, t)$ will be after time $\delta$, i.e., $\psi(x, t+\delta)$. Hence, (57) is intimately related to the operator equation (31).

One could also consider changes occasioned by a time shift in the final state $\chi$. Of course, nothing new results in this way for it is only the relative shift of $\chi$ and $\psi$ which counts. One obtains an alternative expression

$$H_k = -\frac{\partial S(x_{k+1}, t_{k+1}; x_k, t_k)}{\partial t_{k+1}} + \frac{\hbar}{2i(t_{k+1} - t_k)}. \tag{59}$$

This differs from (58) only by terms of order $\varepsilon$.

The time rate of change of a functional can be computed by considering the effect of shifting both initial and final state together. This has the same effect as calculating the transition element of the functional referring to a later time. What results is the analog of the operator equation

$$\frac{\hbar}{i} \dot{f} = Hf - fH.$$

The momentum functional $p_k$ can be defined in an analogous way by considering the changes made by displacements of position:

$$\langle \chi | 1 | \psi \rangle_S - \langle \chi | 1 | \psi_\Delta \rangle_{S_\Delta} = \frac{i\Delta}{\hbar} \langle \chi | p_k | \psi \rangle_S.$$

The state $\psi_\Delta$ is prepared from a region $R_\Delta'$ which is identical to region $R'$ except that it is moved a distance $\Delta$ in space. (The Lagrangian, if it depends explicitly on $x$, must be altered to $L_\Delta = L(\dot{x}, x - \Delta)$ for times previous to $t$.) One finds²⁰

$$p_k = \frac{\partial S(x_{k+1}, x_k)}{\partial x_{k+1}} = -\frac{\partial S(x_{k+1}, x_k)}{\partial x_k}. \tag{60}$$

Since $\psi_\Delta(x, t)$ is equal to $\psi(x - \Delta, t)$, the close connection between $p_k$ and the $x$-derivative of the wave function is established.
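The two expressions for the momentum functional in (60) can be compared directly for the one-step action of the simple one-dimensional example. The following sketch, with a hypothetical potential and illustrative names, differentiates that action numerically at each end of the interval. It shows that the two derivatives agree to zero order in $\varepsilon$ and differ by the impulse $\varepsilon V'(x_{k+1})$ delivered during the step.

```python
def potential(x):
    """Hypothetical potential V(x), used only for illustration."""
    return 0.5 * x * x + 0.25 * x ** 4


def potential_prime(x):
    """Derivative V'(x) of the illustrative potential."""
    return x + x ** 3


def step_action(x_next, x_prev, eps, m=1.0):
    """One-step action with the potential evaluated at the later point."""
    return 0.5 * m * (x_next - x_prev) ** 2 / eps - eps * potential(x_next)


def momentum_at_end(x_next, x_prev, eps, m=1.0, h=1e-6):
    """dS/dx_{k+1} by central difference."""
    return (step_action(x_next + h, x_prev, eps, m)
            - step_action(x_next - h, x_prev, eps, m)) / (2.0 * h)


def momentum_at_start(x_next, x_prev, eps, m=1.0, h=1e-6):
    """-dS/dx_k by central difference."""
    return -(step_action(x_next, x_prev + h, eps, m)
             - step_action(x_next, x_prev - h, eps, m)) / (2.0 * h)
```

For a step from 0.5 to 0.52 with $\varepsilon = 0.01$ and unit mass, `momentum_at_start` returns $m(x_{k+1}-x_k)/\varepsilon = 2$ and `momentum_at_end` is smaller by $\varepsilon V'(x_{k+1})$.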

Angular momentum operators are related in an analogous way to rotations.

The derivative with respect to $t_{i+1}$ of $S(x_{i+1}, t_{i+1}; x_i, t_i)$ appears in the definition of $H_i$. The derivative with respect to $x_{i+1}$ defines $p_i$. But the derivative with respect to $t_{i+1}$ of $S(x_{i+1}, t_{i+1}; x_i, t_i)$ is related to the derivative with respect to $x_{i+1}$, for the function $S(x_{i+1}, t_{i+1}; x_i, t_i)$ defined by (55) satisfies the Hamilton-Jacobi equation. Thus, the Hamilton-Jacobi equation is an equation expressing $H_i$ in terms of the $p_i$. In other words, it expresses the fact that time displacements of states are related to space displacements of the same states. This idea leads directly to a derivation of the Schroedinger equation which is far more elegant than the one exhibited in deriving Eq. (30).


11. INADEQUACIES OF THE FORMULATION 


The formulation given here suffers from a serious drawback. The mathematical concepts needed are new. At present, it requires an unnatural and cumbersome subdivision of the time interval to make the meaning of the equations clear. Considerable improvement can be made through the use of the notation and concepts of the mathematics of functionals. However, it was thought best to avoid this in a first presentation. One needs, in addition, an appropriate measure for the space of the argument functions $x(t)$ of the functionals.¹⁰

²⁰ We did not immediately substitute $p_i$ from (60) into (47) because (47) would then no longer have been valid to both zero order and the first order in $\varepsilon$. We could derive the commutation relations, but not the equations of motion. The two expressions in (60) represent the momenta at each end of the interval $t_i$ to $t_{i+1}$. They differ by $\varepsilon V'(x_{k+1})$ because of the force acting during the time $\varepsilon$.

It is also incomplete from the physical stand- 
point. One of the most important characteristics 
of quantum mechanics is its invariance under 
unitary transformations. These correspond to the 
canonical transformations of classical mechanics. 
Of course, the present formulation, being equiva- 
lent to ordinary formulations, can be mathe- 
matically demonstrated to be invariant under 
these transformations. However, it has not been 
formulated in such a way that it is physically
obvious that it is invariant. This incompleteness 
shows itself in a definite way. No direct procedure 
has been outlined to describe measurements of 
quantities other than position. Measurements of 
momentum, for example, of one particle, can be 
defined in terms of measurements of positions of 
other particles. The result of the analysis of such 
a situation does show the connection of mo- 
mentum measurements to the Fourier transform 
of the wave function. But this is a rather rounda- 
bout method to obtain such an important 
physical result. It is to be expected that the postulates can be generalized by the replacement of the idea of "paths in a region of space-time $R$" by "paths of class $R$," or "paths having property $R$." But which properties correspond to which physical measurements has not been formulated in a general way.


12. A POSSIBLE GENERALIZATION 


The formulation suggests an obvious generali- 
zation. There are interesting classical problems 
which satisfy a principle of least action but for 
which the action cannot be written as an integral 
of a function of positions and velocities. The 
action may involve accelerations, for example. 
Or, again, if interactions are not instantaneous, it may involve the product of coordinates at two different times, such as $\int x(t)\,x(t+T)\,dt$. The action, then, cannot be broken up into a sum of small contributions as in (10). As a consequence, no wave function is available to describe a state. Nevertheless, a transition probability can be defined for getting from a region $R'$ into another $R''$. Most of the theory of the transition elements $\langle \chi_{t''} | F | \psi_{t'} \rangle_S$ can be carried over. One simply invents a symbol, such as $\langle R'' | F | R' \rangle_S$, defined by an equation such as (39) but with the expressions (19) and (20) for $\psi$ and $\chi$ substituted, and the more general action substituted for $S$. Hamiltonian and momentum functionals can be defined as in Section 10. Further details may be found in a thesis by the author.²¹


13. APPLICATION TO ELIMINATE 
FIELD OSCILLATORS 


One characteristic of the present formulation is 
that it can give one a sort of bird’s-eye view of 
the space-time relationships in a given situation. 
Before the integrations on the x; are performed in 
an expression such as (39) one has a sort of 
format into which various F functionals may be 
inserted. One can study how what goes on in the quantum-mechanical system at different times is interrelated. To make these vague remarks somewhat more definite, we discuss an example.

In classical electrodynamics the fields describing, for instance, the interaction of two particles can be represented as a set of oscillators. The equations of motion of these oscillators may be solved and the oscillators essentially eliminated (Lienard and Wiechert potentials). The interactions which result involve relationships of the motion of one particle at one time, and of the other particle at another time. In quantum electrodynamics the field is again represented as a set of oscillators. But the motion of the oscillators cannot be worked out and the oscillators eliminated. It is true that the oscillators representing longitudinal waves may be eliminated. The result is instantaneous electrostatic interaction. The electrostatic elimination is very instructive as it shows up the difficulty of self-interaction very distinctly. In fact, it shows it up so clearly that there is no ambiguity in deciding what term is incorrect and should be omitted. This entire process is not relativistically invariant, nor is the omitted term. It would seem to be very desirable if the oscillators, representing transverse waves, could also be eliminated. This presents an almost insurmountable problem in the conventional quantum mechanics. We expect that the motion of a particle $a$ at one time depends upon the motion of $b$ at a previous time, and vice versa. A wave function $\psi(x_a, x_b; t)$, however, can only describe the behavior of both particles at one time. There is no way to keep track of what $b$ did in the past in order to determine the behavior of $a$. The only way is to specify the state of the set of oscillators at $t$, which serve to "remember" what $b$ (and $a$) had been doing.

²¹ The theory of electromagnetism described by J. A. Wheeler and R. P. Feynman, Rev. Mod. Phys. 17, 157 (1945) can be expressed in a principle of least action involving the coordinates of particles alone. It was an attempt to quantize this theory, without reference to the fields, which led the author to study the formulation of quantum mechanics given here. The extension of the ideas to cover the case of more general action functions was developed in his Ph.D. thesis, "The principle of least action in quantum mechanics" submitted to Princeton University, 1942.

The present formulation permits the solution 
of the motion of all the oscillators and their com- 
plete elimination from the equations describing 
the particles. This is easily done. One must 
simply solve for the motion of the oscillators before one integrates over the various variables $x_i$ for the particles. It is the integration over $x_i$ which tries to condense the past history into a single state function. This we wish to avoid. Of course, the result depends upon the initial and final states of the oscillator. If they are specified, the result is an equation for $\langle \chi_{t''} | 1 | \psi_{t'} \rangle$ like (38), but containing as a factor, besides $\exp(iS/\hbar)$, another functional $G$ depending only on the coordinates describing the paths of the particles.

We illustrate briefly how this is done in a very simple case. Suppose a particle, coordinate $x(t)$, Lagrangian $L(\dot{x}, x)$, interacts with an oscillator, coordinate $q(t)$, Lagrangian $\tfrac{1}{2}(\dot{q}^2 - \omega^2 q^2)$, through a term $\gamma(x, t)\,q(t)$ in the Lagrangian for the system. Here $\gamma(x, t)$ is any function of the coordinate $x(t)$ of the particle and the time.²² Suppose we desire the probability of a transition from a state at time $t'$, in which the particle's wave function is $\psi_{t'}$ and the oscillator is in energy level $n$, to a state at $t''$ with the particle in $\chi_{t''}$ and oscillator in level $m$. This is the square of

$$\langle \chi_{t''} \varphi_m | 1 | \psi_{t'} \varphi_n \rangle_{S_p + S_0 + S_I} = \int \cdots \int \varphi_m^*(q_j)\, \chi_{t''}^*(x_j)\, \exp\left[\frac{i}{\hbar}(S_p + S_0 + S_I)\right] \psi_{t'}(x_0)\, \varphi_n(q_0)\, \frac{dx_0}{A} \frac{dq_0}{a} \cdots \frac{dx_{j-1}}{A} \frac{dq_{j-1}}{a}\, dx_j\, dq_j. \tag{61}$$

²² The generalization to the case that $\gamma$ depends on the velocity, $\dot{x}$, of the particle presents no problem.

Here $\varphi_n(q)$ is the wave function for the oscillator in state $n$, $S_p$ is the action

$$\sum_{i=0}^{j-1} S_p(x_{i+1}, x_i)$$

calculated for the particle as though the oscillator were absent,

$$S_0 = \sum_{i=0}^{j-1} \left[ \frac{\varepsilon}{2} \left( \frac{q_{i+1} - q_i}{\varepsilon} \right)^2 - \frac{\varepsilon \omega^2}{2} q_{i+1}^2 \right]$$

that of the oscillator alone, and

$$S_I = \sum_{i=0}^{j-1} \gamma_i q_i$$

(where $\gamma_i = \gamma(x_i, t_i)$) is the action of interaction between the particle and the oscillator. The normalizing constant, $a$, for the oscillator is $(2\pi\varepsilon i/\hbar)^{-1/2}$. Now the exponential depends quadratically upon all the $q_i$. Hence, the integrations over all the variables $q_i$ for $0 < i < j$ can easily be performed. One is integrating a sequence of Gaussian integrals.

The result of these integrations is, writing $T = t'' - t'$, $(2\pi i\hbar \sin\omega T/\omega)^{-1/2} \exp i(S_p + Q(q_j, q_0))/\hbar$, where $Q(q_j, q_0)$ turns out to be just the classical action for the forced harmonic oscillator (see reference 15). Explicitly it is

$$Q(q_j, q_0) = \frac{\omega}{2\sin\omega T} \left[ (\cos\omega T)(q_j^2 + q_0^2) - 2 q_j q_0 + \frac{2 q_0}{\omega} \int_{t'}^{t''} \gamma(t) \sin\omega(t'' - t)\, dt + \frac{2 q_j}{\omega} \int_{t'}^{t''} \gamma(t) \sin\omega(t - t')\, dt - \frac{2}{\omega^2} \int_{t'}^{t''} \int_{t'}^{t} \gamma(t)\gamma(s) \sin\omega(t'' - t) \sin\omega(s - t')\, ds\, dt \right].$$

It has been written as though $\gamma(t)$ were a continuous function of time. The integrals really should be split into Riemann sums and the quantity $\gamma(x_i, t_i)$ substituted for $\gamma(t_i)$. Thus, $Q$ depends on the coordinates of the particle at all times through the $\gamma(x_i, t_i)$ and on that of the oscillator at times $t'$ and $t''$ only. Thus, the quantity (61) becomes

$$\langle \chi_{t''} \varphi_m | 1 | \psi_{t'} \varphi_n \rangle_{S_p + S_0 + S_I} = \int \cdots \int \chi_{t''}^*(x_j)\, G_{mn}\, \exp\left(\frac{i S_p}{\hbar}\right) \psi_{t'}(x_0)\, \frac{dx_0}{A} \cdots \frac{dx_{j-1}}{A}\, dx_j = \langle \chi_{t''} | G_{mn} | \psi_{t'} \rangle_{S_p}$$

which now contains the coordinates of the particle only, the quantity $G_{mn}$ being given by

$$G_{mn} = (2\pi i\hbar \sin\omega T/\omega)^{-1/2} \int\!\!\int \varphi_m^*(q_j)\, \exp(iQ(q_j, q_0)/\hbar)\, \varphi_n(q_0)\, dq_j\, dq_0.$$

Proceeding in an analogous manner one finds that all of the oscillators of the electromagnetic field can be eliminated from a description of the motion of the charges.
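The statement that $Q(q_j, q_0)$ is the classical action of the forced oscillator can be tested numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: the driving term is taken to be a prescribed function of time (the hypothetical `gamma` below), the oscillator has unit mass, the interaction term in the discretized action is written with an explicit factor of the time step, and $\omega T < \pi$ so that the stationary path is unique. It evaluates the closed form by the trapezoid rule, with the end-point pairing $q_0 \leftrightarrow \sin\omega(t''-t)$ and $q_j \leftrightarrow \sin\omega(t-t')$, and compares it with the discretized action evaluated on its own stationary path, found by solving the tridiagonal stationarity equations.

```python
import math


def gamma(t):
    """Hypothetical driving term; any smooth function of time would do."""
    return math.cos(2.0 * t) + 0.5 * t


def q_closed_form(qj, q0, omega, t1, t2, n=2000):
    """Closed-form Q(q_j, q_0), with the integrals done by the trapezoid rule."""
    big_t = t2 - t1
    h = big_t / n
    ts = [t1 + i * h for i in range(n + 1)]
    w = [0.5 * h if i in (0, n) else h for i in range(n + 1)]
    from_start = [gamma(t) * math.sin(omega * (t - t1)) for t in ts]
    to_end = [gamma(t) * math.sin(omega * (t2 - t)) for t in ts]
    int_qj = sum(wi * f for wi, f in zip(w, from_start))
    int_q0 = sum(wi * f for wi, f in zip(w, to_end))
    # inner[i] = integral from t1 to ts[i] of gamma(s) sin(omega (s - t1)) ds
    inner = [0.0]
    for i in range(1, n + 1):
        inner.append(inner[-1] + 0.5 * h * (from_start[i - 1] + from_start[i]))
    double = sum(w[i] * to_end[i] * inner[i] for i in range(n + 1))
    bracket = (math.cos(omega * big_t) * (qj * qj + q0 * q0) - 2.0 * qj * q0
               + (2.0 * q0 / omega) * int_q0
               + (2.0 * qj / omega) * int_qj
               - (2.0 / omega ** 2) * double)
    return omega / (2.0 * math.sin(omega * big_t)) * bracket


def q_discrete(qj, q0, omega, t1, t2, n=2000):
    """Discretized oscillator action evaluated on its stationary path."""
    eps = (t2 - t1) / n
    g = [gamma(t1 + i * eps) for i in range(n + 1)]
    # Stationarity in q_k for 1 <= k <= n-1:
    #   -q_{k-1} + (2 - eps^2 omega^2) q_k - q_{k+1} = -eps^2 gamma_k
    b = 2.0 - (eps * omega) ** 2
    rhs = [-(eps ** 2) * g[k] for k in range(1, n)]
    rhs[0] += q0
    rhs[-1] += qj
    c_prime, d_prime = [], []
    for i, d in enumerate(rhs):  # forward sweep of the Thomas algorithm
        denom = b if i == 0 else b + c_prime[-1]
        carried = 0.0 if i == 0 else d_prime[-1]
        c_prime.append(-1.0 / denom)
        d_prime.append((d + carried) / denom)
    interior = [0.0] * len(rhs)
    interior[-1] = d_prime[-1]
    for i in range(len(rhs) - 2, -1, -1):  # back substitution
        interior[i] = d_prime[i] - c_prime[i] * interior[i + 1]
    q = [q0] + interior + [qj]
    return sum((q[i + 1] - q[i]) ** 2 / (2.0 * eps)
               - 0.5 * eps * omega ** 2 * q[i + 1] ** 2
               + eps * g[i] * q[i] for i in range(n))
```

The two numbers agree up to the discretization error, which is of first order in the time step; interchanging the two single integrals in the closed form spoils the agreement for a driving term that is not symmetric about the midpoint of the interval.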


14. STATISTICAL MECHANICS
Spin and Relativity 


Problems in the theory of measurement and 
statistical quantum mechanics are often simpli- 
fied when set up from the point of view described 
here. For example, the influence of a perturbing 
measuring instrument can be integrated out in 
principle as we did in detail for the oscillator. The 
statistical density matrix has a fairly obvious and 
useful generalization. It results from considering 
the square of (38). It is an expression similar to (38) but containing integrations over two sets of variables $dx_i$ and $dx_i'$. The exponential is replaced by $\exp i(S - S')/\hbar$, where $S'$ is the same function of the $x_i'$ as $S$ is of $x_i$. It is required, for example, to describe the result of the elimination of the field oscillators where, say, the final state of the oscillators is unspecified and one desires only the sum over all final states $m$.

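Spin may be included in a formal way. The Pauli spin equation can be obtained in this way: One replaces the vector potential interaction term in $S(\mathbf{x}_{i+1}, \mathbf{x}_i)$,

$$\frac{e}{2c} (\mathbf{x}_{i+1} - \mathbf{x}_i) \cdot \mathbf{A}(\mathbf{x}_i) + \frac{e}{2c} (\mathbf{x}_{i+1} - \mathbf{x}_i) \cdot \mathbf{A}(\mathbf{x}_{i+1})$$

arising from expression (13) by the expression

$$\frac{e}{2c} (\boldsymbol{\sigma} \cdot (\mathbf{x}_{i+1} - \mathbf{x}_i)) (\boldsymbol{\sigma} \cdot \mathbf{A}(\mathbf{x}_i)) + \frac{e}{2c} (\boldsymbol{\sigma} \cdot \mathbf{A}(\mathbf{x}_{i+1})) (\boldsymbol{\sigma} \cdot (\mathbf{x}_{i+1} - \mathbf{x}_i)).$$

Here $\mathbf{A}$ is the vector potential, $\mathbf{x}_{i+1}$ and $\mathbf{x}_i$ the vector positions of a particle at times $t_{i+1}$ and $t_i$, and $\boldsymbol{\sigma}$ is Pauli's spin vector matrix. The quantity $\Phi$ must now be expressed as $\prod_i \exp iS(\mathbf{x}_{i+1}, \mathbf{x}_i)/\hbar$ for this differs from the exponential of the sum of $S(\mathbf{x}_{i+1}, \mathbf{x}_i)$. Thus, $\Phi$ is now a spin matrix.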
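What the replacement adds can be seen from the standard Pauli-matrix identity $(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b})$. The sketch below checks the identity numerically and then applies it to the symmetrized expression with the common factor $e/2c$ dropped. The function names and sample vectors are illustrative assumptions, not part of the original.

```python
SIGMA = (
    ((0, 1), (1, 0)),      # sigma_x
    ((0, -1j), (1j, 0)),   # sigma_y
    ((1, 0), (0, -1)),     # sigma_z
)


def sigma_dot(v):
    """The 2x2 matrix sigma . v for a real 3-vector v."""
    return [[sum(v[a] * SIGMA[a][r][c] for a in range(3)) for c in range(2)]
            for r in range(2)]


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))


def gap(matrix, scalar, spin_vector):
    """Largest entry of |matrix - scalar * 1 - i sigma . spin_vector|."""
    spin = sigma_dot(spin_vector)
    return max(abs(matrix[r][c] - (scalar if r == c else 0.0) - 1j * spin[r][c])
               for r in range(2) for c in range(2))


def identity_gap(a, b):
    """Check (sigma.a)(sigma.b) = a.b + i sigma.(a x b)."""
    return gap(matmul2(sigma_dot(a), sigma_dot(b)), dot(a, b), cross(a, b))


def replacement_gap(step, a_start, a_end):
    """Check the symmetrized term against its scalar and spin parts."""
    first = matmul2(sigma_dot(step), sigma_dot(a_start))
    second = matmul2(sigma_dot(a_end), sigma_dot(step))
    total = [[first[r][c] + second[r][c] for c in range(2)] for r in range(2)]
    change = tuple(e - s for e, s in zip(a_end, a_start))
    return gap(total, dot(step, a_start) + dot(step, a_end), cross(change, step))
```

The scalar part reproduces the original interaction term, and the spin-dependent remainder is $i\boldsymbol{\sigma}\cdot[(\mathbf{A}(\mathbf{x}_{i+1}) - \mathbf{A}(\mathbf{x}_i)) \times (\mathbf{x}_{i+1} - \mathbf{x}_i)]$, which involves only the change of the vector potential across the step.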

The Klein Gordon relativistic equation can also be obtained formally by adding a fourth coordinate to specify a path. One considers a "path" as being specified by four functions $x^{(\mu)}(\tau)$ of a parameter $\tau$. The parameter $\tau$ now goes in steps $\varepsilon$ as the variable $t$ went previously. The quantities $x^{(1)}(\tau)$, $x^{(2)}(\tau)$, $x^{(3)}(\tau)$ are the space coordinates of a particle and $x^{(4)}(\tau)$ is a corresponding time. The Lagrangian used is

$$\sum_{\mu=1}^{4} \left[ \tfrac{1}{2} (dx^{\mu}/d\tau)^2 + (e/c)(dx^{\mu}/d\tau) A_{\mu} \right],$$

where $A_{\mu}$ is the 4-vector potential and the terms in the sum for $\mu = 1, 2, 3$ are taken with reversed sign. If one seeks a wave function which depends upon $\tau$ periodically, one can show this must satisfy the Klein Gordon equation. The Dirac equation results from a modification of the Lagrangian used for the Klein Gordon equation, which is analogous to the modification of the non-relativistic Lagrangian required for the Pauli equation. What results directly is the square of the usual Dirac operator.

These results for spin and relativity are purely 
formal and add nothing to the understanding of 
these equations. There are other ways of ob- 
taining the Dirac equation which offer some 
promise of giving a clearer physical interpretation 
to that important and beautiful equation. 

The author sincerely appreciates the helpful advice of Professor and Mrs. H. C. Corben and of Professor H. A. Bethe. He wishes to thank Professor J. A. Wheeler for very many discussions during the early stages of the work.

Mathematical Formulation of the Quantum Theory of Electromagnetic Interaction


R. P. Feynman*

The validity of the rules given in previous papers for the solution of problems in quantum electrodynamics is established. Starting with Fermi's formulation of the field as a set of harmonic oscillators, the effect of the oscillators is integrated out in the Lagrangian form of quantum mechanics. There results an expression for the effect of all virtual photons valid to all orders in $e^2/\hbar c$. It is shown that evaluation of this expression as a power series in $e^2/\hbar c$ gives just the terms expected by the aforementioned rules.

In addition, a relation is established between the amplitude for a given process in an arbitrary unquantized potential and in a quantum electrodynamical field. This relation permits a simple general statement of the laws of quantum electrodynamics.

A description, in Lagrangian quantum-mechanical form, of particles satisfying the Klein-Gordon equation is given in an Appendix. It involves the use of an extra parameter analogous to proper time to describe the trajectory of the particle in four dimensions.

A second Appendix discusses, in the special case of photons, the problem of finding what real processes are implied by the formula for virtual processes.

Problems of the divergences of electrodynamics are not discussed.


1. INTRODUCTION


In two previous papers¹ rules were given for the
calculation of the matrix element for any process in
electrodynamics, to each order in $e^2/\hbar c$. No complete
proof of the equivalence of these rules to the conventional
electrodynamics was given in these papers.
Secondly, no closed expression was given valid to all
orders in $e^2/\hbar c$. In this paper these formal omissions
will be remedied.²

In paper II it was pointed out that for many prob- 
lems in electrodynamics the Hamiltonian method is not 
advantageous, and might be replaced by the over-all 
space-time point of view of a direct particle interaction. 
It was also mentioned that the Lagrangian form of 
quantum mechanics³ was useful in this connection. The
rules given in paper II were, in fact, first deduced in 
this form of quantum mechanics. We shall give this 
derivation here. 

The advantage of a Lagrangian form of quantum 
mechanics is that in a system with interacting parts it 
permits a separation of the problem such that the 
motion of any part can be analyzed or solved first, and 
the results of this solution may then be used in the 
solution of the motion of the other parts. This separa- 
tion is especially useful in quantum electrodynamics 
which represents the interaction of matter with the 
electromagnetic field. The electromagnetic field is an 
especially simple system and its behavior can be 
analyzed completely. What we shall show is that the 


*Now at the California Institute of Technology, Pasadena, 
California. 

¹ R. P. Feynman, Phys. Rev. 76, 749 (1949), hereafter called I,
and Phys. Rev. 76, 769 (1949), hereafter called II.

² See in this connection also the papers of S. Tomonaga, Phys.
Rev. 74, 224 (1948); S. Kanesawa and S. Tomonaga, Prog.
Theoret. Phys. 3, 101 (1948); J. Schwinger, Phys. Rev. 76, 790
(1949); F. Dyson, Phys. Rev. 75, 1736 (1949); W. Pauli and
F. Villars, Rev. Mod. Phys. 21, 434 (1949). The papers cited
give references to previous work.

³ R. P. Feynman, Rev. Mod. Phys. 20, 367 (1948), hereafter
called C.


net effect of the field is a delayed interaction of the 
particles. It is possible to do this easily only if it is not 
necessary at the same time to analyze completely the 
motion of the particles. The only advantage in our
problems of the form of quantum mechanics in C is to 
permit one to separate these aspects of the problem. 
There are a number of disadvantages, however, such as 
a lack of familiarity, the apparent (but not real) 
necessity for dealing with matter in non-relativistic 
approximation, and at times a cumbersome mathematical
notation and method, as well as the fact that
a great deal of useful information that is known about 
operators cannot be directly applied. 

It is also possible to separate the field and particle 
aspects of a problem in a manner which uses operators
and Hamiltonians in a way that is much more familiar. 
One abandons the notation that the order of action of 
operators depends on their written position on the paper 
and substitutes some other convention (such that the 
order of operators is that of the time to which they 
refer). The increase in manipulative facility which
accompanies this change in notation makes it easier to 
represent and to analyze the formal problems in electro- 
dynamics. The method requires some discussion, how- 
ever, and will be described in a succeeding paper. In 
this paper we shall give the derivations of the formulas 
of II by means of the form of quantum mechanics 
given in C. 

The problem of interaction of matter and field will be 
analyzed by first solving for the behavior of the field in 
terms of the coordinates of the matter, and finally 
discussing the behavior of the matter (by matter is 
actually meant the electrons and positrons). That is to 
say, we shall first eliminate the field variables from the 
equations of motion of the electrons and then discuss 
the behavior of the electrons. In this way all of the 
rules given in the paper II will be derived. 

Actually, the straightforward elimination of the field
variables will lead at first to an expression for the
behavior of an arbitrary number of Dirac electrons. 
Since the number of electrons might be infinite, this 
can be used directly to find the behavior of the electrons 
according to hole theory by imagining that nearly all 
the negative energy states are occupied by electrons. 
But, at least in the case of motion in a fixed potential, 
it has been shown that this hole theory picture is 
equivalent to one in which a positron is represented as 
an electron whose space-time trajectory has had its 
time direction reversed. To show that this same picture 
may be used in quantum electrodynamics when the 
potentials are not fixed, a special argument is made 
based on a study of the relationship of quantum electrodynamics
to motion in a fixed potential. Finally, it is
pointed out that this relationship is quite general and
might be used for a general statement of the laws of
quantum electrodynamics.

Charges obeying the Klein-Gordon equation can be
analyzed by a special formalism given in Appendix A. 
A fifth parameter is used to specify the four-dimensional 
trajectory so that the Lagrangian form of quantum 
mechanics can be used. Appendix B discusses in more 
detail the relation of real and virtual photon emission. 
An equation for the propagation of a self-interacting 
electron is given in Appendix C. 

In the demonstration which follows we shall restrict 
ourselves temporarily to cases in which the particle’s
motion is non-relativistic, but the transition of the final 
formulas to the relativistic case is direct, and the proof 
could have been kept relativistic throughout. 

The transverse part of the electromagnetic field will
be represented as an assemblage of independent harmonic
oscillators each interacting with the particles,
as suggested by Fermi.⁴ We use the notation of Heitler.⁵


2. QUANTUM ELECTRODYNAMICS IN
LAGRANGIAN FORM 


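The Hamiltonian for a set of non-relativistic particles
interacting with radiation is, classically, $H = H_p + H_I + H_c + H_{tr}$,
where $H_p + H_I = \sum_n \tfrac{1}{2} m_n^{-1}(\mathbf{p}_n - e_n \mathbf{A}^{tr}(\mathbf{x}_n))^2$
is the Hamiltonian of the particles of mass $m_n$, charge
$e_n$, coordinate $\mathbf{x}_n$ and momentum $\mathbf{p}_n$ and their interaction
with the transverse part of the electromagnetic
field. This field can be expanded into plane waves

$$\mathbf{A}^{tr}(\mathbf{x}) = (8\pi)^{1/2} \sum_{\mathbf{K}} [\mathbf{e}_1(q_{\mathbf{K}}^{(1)} \cos(\mathbf{K}\cdot\mathbf{x}) + q_{\mathbf{K}}^{(2)} \sin(\mathbf{K}\cdot\mathbf{x})) + \mathbf{e}_2(q_{\mathbf{K}}^{(3)} \cos(\mathbf{K}\cdot\mathbf{x}) + q_{\mathbf{K}}^{(4)} \sin(\mathbf{K}\cdot\mathbf{x}))] \tag{1}$$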


where $\mathbf{e}_1$ and $\mathbf{e}_2$ are two orthogonal polarization vectors
at right angles to the propagation vector $\mathbf{K}$, magnitude
$k$. The sum over $\mathbf{K}$ means, if normalized to unit volume,
$\tfrac{1}{2}\int d^3\mathbf{K}/8\pi^3$, and each $q_{\mathbf{K}}^{(r)}$ can be considered as the
coordinate of a harmonic oscillator. (The factor $\tfrac{1}{2}$ arises
for the mode corresponding to $\mathbf{K}$ and to $-\mathbf{K}$ is the
same.)

⁴ E. Fermi, Rev. Mod. Phys. 4, 87 (1932).
⁵ W. Heitler, The Quantum Theory of Radiation, second edition
(Oxford University Press, London, 1944).

The Hamiltonian of the transverse field represented
as oscillators is

$$H_{tr} = \tfrac{1}{2} \sum_{\mathbf{K}} \sum_{r=1}^{4} ((p_{\mathbf{K}}^{(r)})^2 + k^2 (q_{\mathbf{K}}^{(r)})^2)$$

where $p_{\mathbf{K}}^{(r)}$ is the momentum conjugate to $q_{\mathbf{K}}^{(r)}$. The
longitudinal part of the field has been replaced by the
Coulomb interaction,⁶


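$$H_c = \tfrac{1}{2} \sum_n \sum_m e_n e_m / r_{nm}$$

where $r_{nm}^2 = (\mathbf{x}_n - \mathbf{x}_m)^2$. As is well known,⁴ when this
Hamiltonian is quantized one arrives at the usual
theory of quantum electrodynamics. To express these
laws of quantum electrodynamics one can equally well
use the Lagrangian form of quantum mechanics to
describe this set of oscillators and particles. The
classical Lagrangian equivalent to this Hamiltonian is
$L = L_p + L_I + L_{tr} + L_c$ where

$$L_p = \tfrac{1}{2} \sum_n m_n \dot{\mathbf{x}}_n^2 \tag{2a}$$
$$L_I = \sum_n e_n \dot{\mathbf{x}}_n \cdot \mathbf{A}^{tr}(\mathbf{x}_n) \tag{2b}$$
$$L_{tr} = \tfrac{1}{2} \sum_{\mathbf{K}} \sum_r ((\dot{q}_{\mathbf{K}}^{(r)})^2 - k^2 (q_{\mathbf{K}}^{(r)})^2) \tag{2c}$$
$$L_c = -\tfrac{1}{2} \sum_n \sum_m e_n e_m / r_{mn}. \tag{2d}$$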


When this Lagrangian is used in the Lagrangian 
forms of quantum mechanics of C, what it leads to is, 
of course, mathematically equivalent to the result of 
using the Hamiltonian $H$ in the ordinary way, and is
therefore equivalent to the more usual forms of quantum 
electrodynamics (at least for non-relativistic particles). 
We may, therefore, proceed by using this Lagrangian 
form of quantum electrodynamics, with the assurance 
that the results obtained must agree with those obtained
from the more usual Hamiltonian form. 

The Lagrangian enters through the statement that
the functional which carries the system from one state
to another is $\exp(iS)$ where

$$S = \int L\,dt = S_p + S_I + S_{tr} + S_c. \tag{3}$$

The time integrals must be written as Riemann sums
with some care; for example,

$$S_I = \sum_n \int e_n \dot{\mathbf{x}}_n(t) \cdot \mathbf{A}^{tr}(\mathbf{x}_n(t))\,dt \tag{4}$$

becomes according to C, Eq. (19)

$$S_I = \sum_n \sum_i \tfrac{1}{2} e_n (\mathbf{x}_{n,i+1} - \mathbf{x}_{n,i}) \cdot (\mathbf{A}^{tr}(\mathbf{x}_{n,i+1}) + \mathbf{A}^{tr}(\mathbf{x}_{n,i})) \tag{5}$$

so that the velocity $\dot{\mathbf{x}}_{n,i}$ which multiplies $\mathbf{A}^{tr}(\mathbf{x}_{n,i})$ is

$$\dot{\mathbf{x}}_{n,i} = \tfrac{1}{2}\epsilon^{-1}(\mathbf{x}_{n,i+1} - \mathbf{x}_{n,i}) + \tfrac{1}{2}\epsilon^{-1}(\mathbf{x}_{n,i} - \mathbf{x}_{n,i-1}). \tag{6}$$

⁶ The term in the sum for $n=m$ is obviously infinite but must
be included for relativistic invariance. Our problem here is to
re-express the usual (and divergent) form of electrodynamics in
the form given in II. Modifications for dealing with the divergences
are discussed in II and we shall not discuss them further
here.
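The regrouping that leads from the Riemann sum (5) to the symmetric velocity (6) can be checked numerically. The following is a minimal sketch for a single particle, assuming a made-up smooth vector potential (`vector_potential`) and an arbitrary random path; none of these names come from the paper. It only illustrates that summing (5) interval by interval agrees with summing $\epsilon\,\dot{\mathbf{x}}_i\cdot\mathbf{A}(\mathbf{x}_i)$ point by point, apart from the two end points, which each have a single neighbouring difference.

```python
import numpy as np

def vector_potential(x, t):
    """Hypothetical smooth vector potential A(x, t), chosen only for illustration."""
    return np.array([np.sin(x[1] + t), np.cos(x[0]), x[0] * x[2]])

def action_eq5(path, times, charge=1.0):
    """Eq. (5): sum_i (1/2) e (x_{i+1} - x_i) . (A(x_{i+1}) + A(x_i))."""
    total = 0.0
    for i in range(len(path) - 1):
        dx = path[i + 1] - path[i]
        a_sum = (vector_potential(path[i + 1], times[i + 1])
                 + vector_potential(path[i], times[i]))
        total += 0.5 * charge * (dx @ a_sum)
    return total

def action_eq6(path, times, charge=1.0):
    """Same sum regrouped by A(x_i); interior points carry the velocity of Eq. (6)."""
    eps = times[1] - times[0]
    n = len(path) - 1
    total = 0.0
    for i in range(1, n):
        velocity = (0.5 * (path[i + 1] - path[i]) / eps
                    + 0.5 * (path[i] - path[i - 1]) / eps)
        total += charge * eps * (velocity @ vector_potential(path[i], times[i]))
    # The two end points each appear with only one neighbouring difference.
    total += 0.5 * charge * ((path[1] - path[0]) @ vector_potential(path[0], times[0]))
    total += 0.5 * charge * ((path[n] - path[n - 1]) @ vector_potential(path[n], times[n]))
    return total

rng = np.random.default_rng(0)
times = np.linspace(0.0, 1.0, 21)
path = rng.normal(size=(21, 3))   # an arbitrary rough path
s5 = action_eq5(path, times)
s6 = action_eq6(path, times)
print(abs(s5 - s6))
```

The agreement is an algebraic identity, so it holds for any path and any smooth potential, not only for the values chosen here.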
In the Lagrangian form it is possible to eliminate the
transverse oscillators as is discussed in C, Section 13.
One must specify, however, the initial and final state of
all oscillators. We shall first choose the special, simple
case that all oscillators are in their ground states initially
and finally, so that all photons are virtual. Later we
do the more general case in which real quanta are
present initially or finally. We ask, then, for the amplitude
for finding no quanta present and the particles in
state $\chi_{t''}$ at time $t''$, if at time $t'$ the particles were in
state $\psi_{t'}$ and no quanta were present.

The method of eliminating field oscillators is de- 
scribed in Section 13 of C. We shall simply carry out the 
elimination here using the notation and equations of C.
To do this, for simplicity, we first consider in the next
section the case of a particle or a system of particles
interacting with a single oscillator, rather than the 
entire assemblage of the electromagnetic field. 


3. FORCED HARMONIC OSCILLATOR


We consider a harmonic oscillator, coordinate $q$,
Lagrangian $L = \tfrac{1}{2}(\dot{q}^2 - \omega^2 q^2)$ interacting with a particle
or system of particles, action $S_p$, through a term in the
Lagrangian $q(t)\gamma(t)$ where $\gamma(t)$ is a function of the
coordinates (symbolized as $x$) of the particle. The
precise form of $\gamma(t)$ for each oscillator of the electromagnetic
field is given in the next section. We ask for
the amplitude that at some time $t''$ the particles are in
state $\chi_{t''}$ and the oscillator is in, say, an eigenstate $m$
of energy $\omega(m+\tfrac{1}{2})$ (units are chosen such that $\hbar = c = 1$)
when it is given that at a previous time $t'$ the particles
were in state $\psi_{t'}$ and the oscillator in $n$. The amplitude
for this is the transition amplitude [see C, Eq. (61)]

$$\langle \chi_{t''} \varphi_m | 1 | \psi_{t'} \varphi_n \rangle_{S_p+S_0+S_I} = \int\!\!\int \chi_{t''}^*(x_{t''})\, \varphi_m^*(q_{t''}) \exp i(S_p+S_0+S_I)\, \varphi_n(q_{t'})\, \psi_{t'}(x_{t'})\, dx_{t''}\, dx_{t'}\, dq_{t''}\, dq_{t'}\, \mathcal{D}x(t)\, \mathcal{D}q(t) \tag{7}$$


where $x$ represents the variables describing the particle,
$S_p$ is the action calculated classically for the particles
for a given path going from coordinate $x_{t'}$ at $t'$ to $x_{t''}$
at $t''$, $S_0$ is the action $\int \tfrac{1}{2}(\dot{q}^2 - \omega^2 q^2)\,dt$ for any path of
the oscillator going from $q_{t'}$ at $t'$ to $q_{t''}$ at $t''$, while

$$S_I = \int q(t)\gamma(t)\,dt, \tag{8}$$

the action of interaction, is a functional of both $q(t)$
and $x(t)$, the paths of oscillator and particles. The
symbols $\mathcal{D}x(t)$ and $\mathcal{D}q(t)$ represent a summation over
all possible paths of particles and oscillator which go
between the given end points in the sense defined in C,
Eq. (9). (That is, assuming time to proceed in infinitesimal
steps, $\epsilon$, an integral over all values of the
coordinates $x$ and $q$ corresponding to each instant in
time, suitably normalized.)

The problem may be broken in two. The result can
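be written as an integral over all paths of the particles
only, of $(\exp iS_p)\cdot G_{mn}$:

$$\langle \chi_{t''} \varphi_m | 1 | \psi_{t'} \varphi_n \rangle_{S_p+S_0+S_I} = \langle \chi_{t''} | G_{mn} | \psi_{t'} \rangle_{S_p} \tag{9}$$

where $G_{mn}$ is a functional of the path of the particles
alone (since it depends on $\gamma(t)$) given by

$$G_{mn} = \langle \varphi_m | \exp i\!\int q(t)\gamma(t)\,dt\, | \varphi_n \rangle_{S_0}
= \int \varphi_m^*(q_{t''}) \exp i(S_0+S_I)\, \varphi_n(q_{t'})\, dq_{t''}\, dq_{t'}\, \mathcal{D}q(t)
= \int \varphi_m^*(q_j) \exp i\epsilon \sum_{i=0}^{j-1} [\tfrac{1}{2}\epsilon^{-2}(q_{i+1}-q_i)^2 - \tfrac{1}{2}\omega^2 q_i^2 + q_i\gamma_i] \cdot \varphi_n(q_0)\, dq_0\, a^{-1} dq_1\, a^{-1} dq_2 \cdots a^{-1} dq_j \tag{10}$$

where we have written the $\mathcal{D}q(t)$ out explicitly (and
have set $a = (2\pi i\epsilon)^{1/2}$, $t''-t' = j\epsilon$, $q_{t'} = q_0$, $q_{t''} = q_j$). The
last form can be written as

$$G_{mn} = \int \varphi_m^*(q_j)\, k(q_j, t''; q_0, t')\, \varphi_n(q_0)\, dq_0\, dq_j \tag{11}$$

where $k(q_j, t''; q_0, t')$ is the kernel [as in I, Eq. (2)] for
a forced harmonic oscillator giving the amplitude for
arrival at $q_j$ at time $t''$ if at time $t'$ it was known to be
at $q_0$. According to C it is given by

$$k(q_j, t''; q_0, t') = (2\pi i \omega^{-1} \sin\omega(t''-t'))^{-1/2} \exp iQ(q_j, t''; q_0, t') \tag{12}$$

where $Q(q_j, t''; q_0, t')$ is the action calculated along the
classical path between the end points $q_j$, $t''$; $q_0$, $t'$, and
is given explicitly in C.⁷ It is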


⁷ That (12) is correct, at least insofar as it depends on $q_0$, can
be seen directly as follows. Let $\bar{q}(t)$ be the classical path which
satisfies the boundary condition $\bar{q}(t') = q_0$, $\bar{q}(t'') = q_j$. Then in the
integral defining $k$ replace each of the variables $q_i$ by $q_i = \bar{q}_i + y_i$,
($\bar{q}_i = \bar{q}(t_i)$), that is, use the displacement $y_i$ from the classical
path $\bar{q}_i$ as the coordinate rather than the absolute position.
With the substitution $q_i = \bar{q}_i + y_i$ in the action

$$S_0 + S_I = \int (\tfrac{1}{2}\dot{q}^2 - \tfrac{1}{2}\omega^2 q^2 + \gamma q)\,dt
= \int (\tfrac{1}{2}\dot{\bar{q}}^2 - \tfrac{1}{2}\omega^2 \bar{q}^2 + \gamma \bar{q})\,dt + \int (\tfrac{1}{2}\dot{y}^2 - \tfrac{1}{2}\omega^2 y^2)\,dt$$

the terms linear in $y$ drop out by integrations by parts using the
equation of motion $\ddot{\bar{q}} = -\omega^2 \bar{q} + \gamma(t)$ for the classical path, and the
boundary conditions $y(t') = y(t'') = 0$. That this should occur
should occasion no surprise, for the action functional is an extremum
at $q(t) = \bar{q}(t)$ so that it will only depend to second order
in the displacements $y$ from this extremal orbit $\bar{q}(t)$. Further,
since the action functional is quadratic to begin with, it cannot
depend on $y$ more than quadratically. Hence

$$S_0 + S_I = Q + \int (\tfrac{1}{2}\dot{y}^2 - \tfrac{1}{2}\omega^2 y^2)\,dt$$

so that since $dq_i = dy_i$,

$$k(q_j, t''; q_0, t') = \exp(iQ) \int \exp i\!\int \tfrac{1}{2}(\dot{y}^2 - \omega^2 y^2)\,dt\; \mathcal{D}y(t).$$

The factor following the $\exp iQ$ is the amplitude for a free oscillator
to proceed from $y=0$ at $t=t'$ to $y=0$ at $t=t''$ and does not there-

$$Q = \frac{\omega}{2\sin\omega(t''-t')} \Big[ (q_j^2 + q_0^2)\cos\omega(t''-t') - 2 q_j q_0
+ \frac{2q_j}{\omega}\int_{t'}^{t''} \gamma(t)\sin\omega(t-t')\,dt
+ \frac{2q_0}{\omega}\int_{t'}^{t''} \gamma(t)\sin\omega(t''-t)\,dt
- \frac{2}{\omega^2}\int_{t'}^{t''}\!\!\int_{t'}^{t} \gamma(t)\gamma(s)\sin\omega(t''-t)\sin\omega(s-t')\,ds\,dt \Big] \tag{13}$$
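As a plausibility check on expression (13), one can compare it with the action obtained by solving the classical boundary-value problem numerically and integrating the Lagrangian $\tfrac{1}{2}\dot{q}^2 - \tfrac{1}{2}\omega^2 q^2 + \gamma q$ along the result. The sketch below assumes an arbitrary driving function `gamma`, arbitrary end points and a simple finite-difference solver; these choices are illustrative and not part of the paper.

```python
import numpy as np

omega, t1, t2 = 1.3, 0.0, 2.0     # arbitrary test values; t1 = t', t2 = t''
q0, qj = 0.4, -0.7                # end points of the oscillator path

def gamma(t):
    """Hypothetical driving function gamma(t), chosen only for illustration."""
    return np.cos(2.0 * t) + 0.5 * t

def trapezoid(y, x):
    """Plain trapezoidal rule."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def q_formula(n=801):
    """Evaluate Q from Eq. (13) by quadrature."""
    t = np.linspace(t1, t2, n)
    g = gamma(t)
    big_t = t2 - t1
    term_j = trapezoid(g * np.sin(omega * (t - t1)), t)
    term_0 = trapezoid(g * np.sin(omega * (t2 - t)), t)
    # inner(t) = integral from t' to t of gamma(s) sin(omega (s - t')) ds
    f = g * np.sin(omega * (t - t1))
    inner = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(t))))
    double = trapezoid(g * np.sin(omega * (t2 - t)) * inner, t)
    bracket = ((qj**2 + q0**2) * np.cos(omega * big_t) - 2.0 * qj * q0
               + 2.0 * qj / omega * term_j + 2.0 * q0 / omega * term_0
               - 2.0 / omega**2 * double)
    return omega / (2.0 * np.sin(omega * big_t)) * bracket

def q_direct(n=801):
    """Solve q'' = -omega^2 q + gamma with fixed ends, then integrate the Lagrangian."""
    t = np.linspace(t1, t2, n)
    h = t[1] - t[0]
    m = n - 2
    # Interior equations: (q[i-1] - 2 q[i] + q[i+1]) / h^2 + omega^2 q[i] = gamma[i]
    main = np.full(m, -2.0 / h**2 + omega**2)
    off = np.full(m - 1, 1.0 / h**2)
    matrix = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    rhs = gamma(t[1:-1]).copy()
    rhs[0] -= q0 / h**2
    rhs[-1] -= qj / h**2
    q = np.concatenate(([q0], np.linalg.solve(matrix, rhs), [qj]))
    kinetic = np.sum(0.5 * (np.diff(q) / h) ** 2) * h
    rest = trapezoid(-0.5 * omega**2 * q**2 + gamma(t) * q, t)
    return kinetic + rest

print(q_formula(), q_direct())
```

The two numbers should agree up to the discretization error of the finite-difference scheme, which shrinks as the grid is refined.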


The solution of the motion of the oscillator can now
be completed by substituting (12) and (13) into (11)
and performing the integrals. The simplest case is for
$m, n = 0$ for which case⁸

$$\varphi_0(q_0) = (\omega/\pi)^{1/4} \exp(-\tfrac{1}{2}\omega q_0^2) \exp(-\tfrac{1}{2} i\omega t')$$

so that the integrals on $q_0$, $q_j$ are just Gaussian integrals.
There results

$$G_{00} = \exp\Big(-\frac{1}{2\omega} \int_{t'}^{t''}\!\!\int_{t'}^{t} \exp(-i\omega(t-s))\,\gamma(t)\gamma(s)\,dt\,ds\Big)$$

a result of fundamental importance in the succeeding
developments. By replacing $t-s$ by its absolute value
$|t-s|$ we may integrate both variables over the entire
range and divide by 2. We will henceforth make the
results more general by extending the limits on the
integrals from $-\infty$ to $+\infty$. Thus if one wishes to
study the effect on a particle of interaction with an
oscillator for just the period $t'$ to $t''$ one may use

$$G_{00} = \exp\Big(-\frac{1}{4\omega} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \exp(-i\omega|t-s|)\,\gamma(t)\gamma(s)\,dt\,ds\Big) \tag{14}$$

imagining in this case that the interaction $\gamma(t)$ is zero
outside these limits. We defer to a later section the
discussion of other values of $m$, $n$.
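The passage from the time-ordered double integral to the symmetric form (14) can be illustrated on a grid. The sketch below assumes a made-up pulse for $\gamma(t)$ that is negligible at the ends of the grid. It checks that the two forms give the same $G_{00}$, and that $|G_{00}|^2$ depends only on the Fourier component of $\gamma$ at the oscillator frequency, which follows from taking the real part of the exponent in (14).

```python
import numpy as np

omega = 2.0                                   # oscillator frequency (arbitrary test value)
t = np.linspace(-6.0, 6.0, 601)
h = t[1] - t[0]
gamma = np.exp(-t**2) * np.cos(1.5 * t)       # hypothetical pulse, illustration only

# exp(-i omega |t - s|) gamma(t) gamma(s) tabulated on the grid
phase = np.exp(-1j * omega * np.abs(t[:, None] - t[None, :]))
integrand = phase * gamma[:, None] * gamma[None, :]

# Ordered form (s < t); the diagonal is shared equally between the two orderings.
ordered = (np.sum(np.tril(integrand, k=-1)) + 0.5 * np.trace(integrand)) * h**2
g00_ordered = np.exp(-ordered / (2.0 * omega))

# Symmetric form of Eq. (14): integrate over all t and s and divide by 2.
symmetric = np.sum(integrand) * h**2
g00_symmetric = np.exp(-symmetric / (4.0 * omega))

# Probability of remaining in the ground state from the Fourier component at omega.
fourier = np.sum(gamma * np.exp(1j * omega * t)) * h
probability = np.exp(-abs(fourier) ** 2 / (2.0 * omega))

print(g00_ordered, g00_symmetric, probability)
```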

Since $G_{00}$ is simply an exponential, we can write it as
$\exp(iI)$, consider that the complete “action” for the
system of particles is $S = S_p + I$ and that one computes
transition elements with this “action” instead of $S_p$


fore depend on $q_0$, $q_j$, or $\gamma(t)$, being a function only of $t''-t'$.
[That it is actually $(2\pi i\omega^{-1}\sin\omega(t''-t'))^{-1/2}$ can be demonstrated
either by direct integration of the $y$ variables or by using some
normalizing property of the kernels $k$, for example that $G_{00}$ for
the case $\gamma=0$ must equal unity.] The expression for $Q$ given in
C on page 386 is in error, the quantities $q_0$ and $q_j$ should be
interchanged.

⁸ It is most convenient to define the state $\varphi_n$ with the phase
factor $\exp[-i\omega(n+\tfrac{1}{2})t']$ and the final state with the factor
$\exp[-i\omega(m+\tfrac{1}{2})t'']$ so that the results will not depend on the
particular times $t'$, $t''$ chosen.

(see C, Sec. 12). The functional $I$, which is given by

$$I = \frac{i}{4\omega} \int\!\!\int \exp(-i\omega|t-s|)\,\gamma(s)\gamma(t)\,ds\,dt \tag{15}$$

is complex, however; we shall speak of it as the complex
action. It describes the fact that the system at one
time can affect itself at a different time by means of a
temporary storage of energy in the oscillator. When
there are several independent oscillators with different
interactions, the effect, if they are all in the lowest state
at $t'$ and $t''$, is the product of their separate $G_{00}$ contributions.
Thus the complex action is additive, being
the sum of contributions like (15) for each of the
several oscillators.

4. VIRTUAL TRANSITIONS IN THE 
ELECTROMAGNETIC FIELD 

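We can now apply these results to eliminate the
transverse field oscillators of the Lagrangian (2). At
first we can limit ourselves to the case of purely virtual
transitions in the electromagnetic field, so that there is
no photon in the field at $t'$ and $t''$. That is, all of the
field oscillators are making transitions from ground
state to ground state.

The $\gamma_{\mathbf{K}}^{(r)}$ corresponding to each oscillator $q_{\mathbf{K}}^{(r)}$ is
found from the interaction term $L_I$ [Eq. (2b)], substituting
the value of $\mathbf{A}^{tr}(\mathbf{x})$ given in (1). There results,
for example,

$$\gamma_{\mathbf{K}}^{(1)} = (8\pi)^{1/2} \sum_n e_n (\mathbf{e}_1\cdot\dot{\mathbf{x}}_n)\cos(\mathbf{K}\cdot\mathbf{x}_n)$$
$$\gamma_{\mathbf{K}}^{(2)} = (8\pi)^{1/2} \sum_n e_n (\mathbf{e}_1\cdot\dot{\mathbf{x}}_n)\sin(\mathbf{K}\cdot\mathbf{x}_n) \tag{16}$$

the corresponding results for $\gamma_{\mathbf{K}}^{(3)}$, $\gamma_{\mathbf{K}}^{(4)}$ replace $\mathbf{e}_1$ by $\mathbf{e}_2$.

The complex action resulting from oscillator of
coordinate $q_{\mathbf{K}}^{(1)}$ is therefore

$$I_{\mathbf{K}}^{(1)} = \frac{8\pi i}{4k} \sum_n \sum_m \int_{t'}^{t''}\!\!\int_{t'}^{t''} e_n e_m \exp(-ik|t-s|)\,(\mathbf{e}_1\cdot\dot{\mathbf{x}}_n(t))\,(\mathbf{e}_1\cdot\dot{\mathbf{x}}_m(s))\,\cos(\mathbf{K}\cdot\mathbf{x}_n(t))\cos(\mathbf{K}\cdot\mathbf{x}_m(s))\,ds\,dt.$$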


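The term $I_{\mathbf{K}}^{(2)}$ exchanges the cosines for sines, so in
the sum $I_{\mathbf{K}}^{(1)} + I_{\mathbf{K}}^{(2)}$ the product of the two cosines,
$\cos A\cos B$ is replaced by $(\cos A\cos B + \sin A\sin B)$ or
$\cos(A-B)$. The terms $I_{\mathbf{K}}^{(3)} + I_{\mathbf{K}}^{(4)}$ give the same
result with $\mathbf{e}_2$ replacing $\mathbf{e}_1$. The sum $(\mathbf{e}_1\cdot\mathbf{V})(\mathbf{e}_1\cdot\mathbf{V}')
+ (\mathbf{e}_2\cdot\mathbf{V})(\mathbf{e}_2\cdot\mathbf{V}')$ is $(\mathbf{V}\cdot\mathbf{V}') - k^{-2}(\mathbf{K}\cdot\mathbf{V})(\mathbf{K}\cdot\mathbf{V}')$ since it is
the sum of the products of vector components in two
orthogonal directions, so that if we add the product in
the third direction (that of $\mathbf{K}$) we construct the complete
scalar product. Summing over all $\mathbf{K}$ then, since
$\sum_{\mathbf{K}} = \tfrac{1}{2}\int d^3\mathbf{K}/8\pi^3$ we find for the total complex action
of all of the transverse oscillators,

$$I_{tr} = i \sum_n \sum_m \int_{t'}^{t''} dt \int_{t'}^{t''} ds \int e_n e_m \exp(-ik|t-s|)\,[\dot{\mathbf{x}}_n(t)\cdot\dot{\mathbf{x}}_m(s) - k^{-2}(\mathbf{K}\cdot\dot{\mathbf{x}}_n(t))(\mathbf{K}\cdot\dot{\mathbf{x}}_m(s))]\,\cos(\mathbf{K}\cdot(\mathbf{x}_n(t)-\mathbf{x}_m(s)))\,d^3\mathbf{K}/8\pi^2 k. \tag{17}$$

This is to be added to $S_p + S_c$ to obtain the complete
action of the system with the oscillators removed.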
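The polarization sum used in arriving at (17), namely $(\mathbf{e}_1\cdot\mathbf{V})(\mathbf{e}_1\cdot\mathbf{V}') + (\mathbf{e}_2\cdot\mathbf{V})(\mathbf{e}_2\cdot\mathbf{V}') = \mathbf{V}\cdot\mathbf{V}' - k^{-2}(\mathbf{K}\cdot\mathbf{V})(\mathbf{K}\cdot\mathbf{V}')$, can be checked with a few lines of linear algebra. The vectors below are random stand-ins chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.normal(size=3)                  # propagation vector
k = np.linalg.norm(K)

# Build two unit vectors orthogonal to K and to each other.
trial = rng.normal(size=3)
e1 = np.cross(K, trial)
e1 /= np.linalg.norm(e1)
e2 = np.cross(K, e1) / k

V, Vp = rng.normal(size=3), rng.normal(size=3)   # stand-ins for the two velocities
lhs = (e1 @ V) * (e1 @ Vp) + (e2 @ V) * (e2 @ Vp)
rhs = V @ Vp - (K @ V) * (K @ Vp) / k**2
print(lhs, rhs)
```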

The term in $(\mathbf{K}\cdot\dot{\mathbf{x}}_n(t))(\mathbf{K}\cdot\dot{\mathbf{x}}_m(s))$ can be simplified
by integration by parts with respect to $t$ and with
respect to $s$ [note that $\exp(-ik|t-s|)$ has a discontinuous
slope at $t=s$, or break the integration up into
two regions]. One finds


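$$I_{tr} = R - I_c + I_{\mathrm{transient}} \tag{18}$$

where

$$R = -i \sum_n \sum_m \int_{t'}^{t''} dt \int_{t'}^{t''} ds \int e_n e_m \exp(-ik|t-s|)\,(1 - \dot{\mathbf{x}}_n(t)\cdot\dot{\mathbf{x}}_m(s))\,\cos\mathbf{K}\cdot(\mathbf{x}_n(t)-\mathbf{x}_m(s))\,d^3\mathbf{K}/8\pi^2 k \tag{19}$$

and

$$I_c = -\sum_n \sum_m \int_{t'}^{t''} dt \int e_n e_m \cos\mathbf{K}\cdot(\mathbf{x}_n(t)-\mathbf{x}_m(t))\,d^3\mathbf{K}/4\pi^2 k^2 \tag{20}$$

comes from the discontinuity in slope of $\exp(-ik|t-s|)$
at $t=s$. Since

$$\int \cos(\mathbf{K}\cdot\mathbf{R})\,d^3\mathbf{K}/4\pi^2 k^2 = \int_0^\infty (kr)^{-1}\sin(kr)\,dk/\pi = (2r)^{-1}$$

the term in $I_c$ just cancels the Coulomb interaction term
$S_c = \int L_c\,dt$. The term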


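$$I_{\mathrm{transient}} = -\sum_n \sum_m e_n e_m \int \frac{d^3\mathbf{K}}{4\pi^2 k^2} \Big\{ \int_{t'}^{t''} [\exp(-ik(t''-t))\cos\mathbf{K}\cdot(\mathbf{x}_n(t'')-\mathbf{x}_m(t))
+ \exp(-ik(t-t'))\cos\mathbf{K}\cdot(\mathbf{x}_n(t)-\mathbf{x}_m(t'))]\,dt
+ (2k)^{-1} i\,[\cos\mathbf{K}\cdot(\mathbf{x}_n(t'')-\mathbf{x}_m(t'')) + \cos\mathbf{K}\cdot(\mathbf{x}_n(t')-\mathbf{x}_m(t'))
- 2\exp(-ik(t''-t'))\cos\mathbf{K}\cdot(\mathbf{x}_n(t'')-\mathbf{x}_m(t'))] \Big\} \tag{21}$$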
is one which comes from the limits of integration at $t'$
and $t''$, and involves the coordinates of the particle at
either one of these times or the other. If $t'$ and $t''$ are
considered to be exceedingly far in the past and future,
there is no correlation to be expected between these
temporally distant coordinates and the present ones,
so the effects of $I_{\mathrm{transient}}$ will cancel out quantum
mechanically by interference. This transient was produced
by the sudden turning on of the interaction of
field and particles at $t'$ and its sudden removal at $t''$.
Alternatively we can imagine the charges to be turned
on after $t'$ adiabatically and turned off slowly before $t''$
(in this case, in the term $L_c$, the charges should also be
considered as varying with time). In this case, in the
limit, $I_{\mathrm{transient}}$ is zero. Hereafter we shall drop the
transient term and consider the range of integration of
$t$ to be from $-\infty$ to $+\infty$, imagining, if one needs a
definition, that the charges vary with time and vanish
in the direction of either limit.
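The radial integral used above to identify the instantaneous term with the Coulomb interaction, $\int_0^\infty (kr)^{-1}\sin(kr)\,dk/\pi = (2r)^{-1}$, converges only conditionally. The sketch below therefore assumes a convergence factor $\exp(-\eta k)$, which is not in the text, and compares the numerical integral with the closed form $\arctan(r/\eta)/\pi r$ of the damped integral, whose limit for $\eta\to 0$ is $(2r)^{-1}$.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

r = 1.7        # separation |x_n - x_m| (arbitrary test value)
eta = 0.05     # convergence factor exp(-eta k), an assumption added for the numerics

k = np.linspace(0.0, 600.0, 600001)
# sin(kr)/(kr) written with np.sinc, which is sin(pi x)/(pi x) and is finite at k = 0
integrand = np.sinc(k * r / np.pi) * np.exp(-eta * k)
numeric = trapezoid(integrand, k) / np.pi
damped_exact = np.arctan(r / eta) / (np.pi * r)   # closed form of the damped integral
coulomb = 1.0 / (2.0 * r)                         # the eta -> 0 limit quoted in the text

print(numeric, damped_exact, coulomb)
```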

To simplify $R$ we need the integral

$$J = \int \exp(-ik|t|)\cos(\mathbf{K}\cdot\mathbf{R})\,d^3\mathbf{K}/8\pi^2 k
= \int_0^\infty \exp(-ik|t|)\sin(kr)\,dk/2\pi r \tag{22}$$

where $r$ is the length of the vector $\mathbf{R}$. Now

$$\int_0^\infty \exp(-ikx)\,dk = \lim_{\epsilon\to 0}\,(-i(x-i\epsilon)^{-1}) = -ix^{-1} + \pi\delta(x) = \pi\delta_+(x)$$

where the equation serves to define $\delta_+(x)$ [as in II,
Eq. (3)]. Hence, expanding $\sin(kr)$ in exponentials find

$$J = -(4\pi r)^{-1}((|t|-r)^{-1} - (|t|+r)^{-1}) + (4ir)^{-1}(\delta(|t|-r) - \delta(|t|+r))
= -(2\pi)^{-1}(t^2-r^2)^{-1} + (2i)^{-1}\delta(t^2-r^2)
= -\tfrac{1}{2} i\,\delta_+(t^2-r^2) \tag{23}$$

where we have used the fact that

$$\delta(t^2-r^2) = (2r)^{-1}(\delta(|t|-r) + \delta(|t|+r))$$

and that $\delta(|t|+r) = 0$ since both $|t|$ and $r$ are necessarily
positive.
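The defining relation for $\delta_+$ can be made concrete by keeping the small quantity $\epsilon$ finite. The sketch below assumes specific values of $\epsilon$ and $x$, chosen only for illustration. It checks numerically that $\int_0^\infty \exp(-ikx-\epsilon k)\,dk = -i(x-i\epsilon)^{-1}$, and that the real part of this expression, a Lorentzian in $x$, has an area close to $\pi$, which is the origin of the $\pi\delta(x)$ term.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule; works for real or complex samples."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

eps, x = 0.3, 0.8                      # finite regulator and test point (arbitrary)
k = np.linspace(0.0, 120.0, 120001)
numeric = trapezoid(np.exp(-1j * k * x - eps * k), k)
closed_form = -1j / (x - 1j * eps)

# Real part eps / (x^2 + eps^2): its area over a wide range of x approaches pi.
grid = np.linspace(-200.0, 200.0, 400001)
area = trapezoid(eps / (grid**2 + eps**2), grid)

print(numeric, closed_form, area)
```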
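Substitution of these results into (19) gives finally,

$$R = -\tfrac{1}{2} \sum_n \sum_m \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} e_n e_m\,(1 - \dot{\mathbf{x}}_n(t)\cdot\dot{\mathbf{x}}_m(s))\,\delta_+((t-s)^2 - (\mathbf{x}_n(t)-\mathbf{x}_m(s))^2)\,dt\,ds. \tag{24}$$

The total complex action of the system is then¹⁰
$S_p + R$. Or, what amounts to the same thing, to obtain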

⁹ One can obtain the final result, that the total interaction is
just $R$, in a formal manner starting from the Hamiltonian from
which the longitudinal oscillators have not yet been eliminated.
There are for each $\mathbf{K}$ and cos or sin, four oscillators $q_{\mu\mathbf{K}}$ corresponding
to the three components of the vector potential ($\mu = 1$,
2, 3) and the scalar potential ($\mu = 4$). It must then be assumed
that the wave functions of the initial and final state of the $\mathbf{K}$
oscillators is the function $(k/\pi)\exp[-\tfrac{1}{2}k(q_{1\mathbf{K}}^2 + q_{2\mathbf{K}}^2 + q_{3\mathbf{K}}^2 - q_{4\mathbf{K}}^2)]$.
The wave function suggested here has only formal significance,
of course, because the dependence on $q_{4\mathbf{K}}$ is not square integrable,
and cannot be normalized. If each oscillator were assumed
actually in the ground state, the sign of the $q_{4\mathbf{K}}$ term would be
changed to positive, and the sign of the frequency in the contribution
of these oscillators would be reversed (they would have
negative energy).

¹⁰ The classical action for this problem is just $S_p + R'$ where $R'$
is the real part of the expression (24). In view of the generalization
of the Lagrangian formulation of quantum mechanics suggested
in Section 12 of C, one might have anticipated that $R$ would have
been simply $R'$. This corresponds, however, to boundary conditions
other than no quanta present in past and future. It is
harder to interpret physically. For a system enclosed in a light
tight box, however, it appears likely that both $R$ and $R'$ lead to
the same results.

transition amplitudes including the effects of the field
we must calculate the transition element of $\exp(iR)$:

$$\langle \chi_{t''} | \exp iR | \psi_{t'} \rangle_{S_p} \tag{25}$$

under the action $S_p$ of the particles, excluding interaction.
Expression (24) for $R$ must be considered to be
written in the usual manner as a Riemann sum and the
expression (25) interpreted as defined in C [Eq. (39)].
Expression (6) must be used for $\dot{\mathbf{x}}_n$ at time $t$.
Expression (25), with (24), then contains all the
effects of virtual quanta on a (at least non-relativistic)
system according to quantum electrodynamics. It contains
the effects to all orders in $e^2/\hbar c$ in a single expression.
If expanded in a power series in $e^2/\hbar c$, the various
terms give the expressions to the corresponding order
obtained by the diagrams and methods of II. We
illustrate this by an example in the next section.


5. EXAMPLE OF APPLICATION OF EXPRESSION (25) 


We shall not be much concerned with the non-relativistic
case here, as the relativistic case given below
is as simple and more interesting. It is, however, very
similar and at this stage it is worth giving an example
to show how expressions resulting from (25) are to be
interpreted according to the rules of C. For example,
consider the case of a single electron, coordinate $\mathbf{x}$,
either free or in an external given potential (contained
for simplicity in $S_p$, not in¹¹ $R$). Its interaction with
the field produces a reaction back on itself given by $R$
as in (24) but in which we keep only a single term
corresponding to $m=n$. Assume the effect of $R$ to be
small and expand $\exp(iR)$ as $1+iR$. Let us find the
amplitude at time $t''$ of finding the electron in a state
$\psi$ with no quanta emitted, if at time $t'$ it was in the
same state. It is


$$\langle \psi_{t''} | 1 + iR | \psi_{t'} \rangle_{S_p} = \langle \psi_{t''} | 1 | \psi_{t'} \rangle_{S_p} + i\,\langle \psi_{t''} | R | \psi_{t'} \rangle_{S_p}$$

where $\langle \psi_{t''} | 1 | \psi_{t'} \rangle_{S_p} = \exp[-iE(t''-t')]$ if $E$ is the
energy of the state, and

$$\langle \psi_{t''} | R | \psi_{t'} \rangle_{S_p} = -\tfrac{1}{2} e^2 \Big\langle \psi_{t''} \Big| \int\!dt\!\int\!ds\,(1 - \dot{\mathbf{x}}_t\cdot\dot{\mathbf{x}}_s)\,\delta_+((t-s)^2 - (\mathbf{x}_t-\mathbf{x}_s)^2) \Big| \psi_{t'} \Big\rangle_{S_p}. \tag{26}$$

Here $\mathbf{x}_s = \mathbf{x}(s)$, etc. In (26) we shall limit the range of
integrations by assuming $s<t$, and double the result.
The expression within the brackets $\langle\ \rangle_{S_p}$ on the
right-hand side of (26) can be evaluated by the methods
described in C [Eq. (29)]. An expression such as (26)
can also be evaluated directly in terms of the propagation
kernel $K(2, 1)$ [see I, Eq. (2)] for an electron
moving in the given potential.

¹¹ One can show from (25) how the correlated effect of many
atoms at a distance produces on a given system the effects of an
external potential. Formula (24) yields the result that this
potential is that obtained from Liénard and Wiechert by retarded
waves arising from the charges and currents resulting from the
distant atoms making transitions. Assume the wave functions
$\chi$ and $\psi$ can be split into products of wave functions for system
and distant atoms and expand $\exp(iR)$ assuming the effect of any
individual distant atom is small. Coulomb potentials arise even
from nearby particles if they are moving slowly.

The term $\dot{\mathbf{x}}_t\cdot\dot{\mathbf{x}}_s$ in the non-relativistic case produces
an interesting complication which does not have an
analog for the relativistic case with the Dirac equation.
We discuss it below, but for a moment consider in
further detail expression (26) but with the factor
$(1 - \dot{\mathbf{x}}_t\cdot\dot{\mathbf{x}}_s)$ replaced simply by unity.

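The kernel $K(2, 1)$ is defined and discussed in I.
From its definition as the amplitude that the electron
be found at $\mathbf{x}_2$ at time $t_2$, if at $t_1$ it was at $\mathbf{x}_1$, we have

$$K(\mathbf{x}_2, t_2; \mathbf{x}_1, t_1) = \langle \delta(\mathbf{x}-\mathbf{x}_2)_{t_2} | 1 | \delta(\mathbf{x}-\mathbf{x}_1)_{t_1} \rangle_{S_p} \tag{27}$$

that is, more simply $K(2, 1)$ is the sum of $\exp(iS_p)$ over
all paths which go from space time point 1 to 2.

In the integrations over all paths implied by the
symbol in (26) we can first integrate over all the $\mathbf{x}_i$
variables corresponding to times $t_i$ from $t'$ to $s$, not
inclusive, the result being a factor $K(\mathbf{x}_s, s; \mathbf{x}_{t'}, t')$ according
to (27). Next we integrate on the variables between
$s$ and $t$ not inclusive, giving a factor $K(\mathbf{x}_t, t; \mathbf{x}_s, s)$
and finally on those between $t$ and $t''$ giving
$K(\mathbf{x}_{t''}, t''; \mathbf{x}_t, t)$. Hence the left-hand term in (26)
excluding the $\dot{\mathbf{x}}_t\cdot\dot{\mathbf{x}}_s$ factor is

$$-e^2 \int_{t'}^{t''} dt \int_{t'}^{t} ds \int \psi^*(\mathbf{x}_{t''}, t'')\,K(\mathbf{x}_{t''}, t''; \mathbf{x}_t, t)\,\delta_+((t-s)^2 - (\mathbf{x}_t-\mathbf{x}_s)^2)\,K(\mathbf{x}_t, t; \mathbf{x}_s, s)\,K(\mathbf{x}_s, s; \mathbf{x}_{t'}, t')\,\psi(\mathbf{x}_{t'}, t')\,d^3\mathbf{x}_{t''}\,d^3\mathbf{x}_t\,d^3\mathbf{x}_s\,d^3\mathbf{x}_{t'} \tag{28}$$

which in improved notation and in the relativistic case
is essentially the result given in II.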

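We have made use of a special case of a principle
which may be stated more generally as

$$\langle \chi_{t''} | F(\mathbf{x}_1, t_1; \mathbf{x}_2, t_2; \cdots \mathbf{x}_k, t_k) | \psi_{t'} \rangle_{S_p}
= \int \chi^*(\mathbf{x}_{t''})\,K(\mathbf{x}_{t''}, t''; \mathbf{x}_1, t_1)\,K(\mathbf{x}_1, t_1; \mathbf{x}_2, t_2) \cdots K(\mathbf{x}_{k-1}, t_{k-1}; \mathbf{x}_k, t_k)\,K(\mathbf{x}_k, t_k; \mathbf{x}_{t'}, t')
\cdot F(\mathbf{x}_1, t_1; \mathbf{x}_2, t_2; \cdots \mathbf{x}_k, t_k)\,\psi(\mathbf{x}_{t'})\,d^3\mathbf{x}_{t''}\,d^3\mathbf{x}_1\,d^3\mathbf{x}_2 \cdots d^3\mathbf{x}_k\,d^3\mathbf{x}_{t'} \tag{29}$$

where $F$ is any function of the coordinate $\mathbf{x}_1$ at time $t_1$,
$\mathbf{x}_2$ at $t_2$ up to $\mathbf{x}_k$, $t_k$, and, it is important to notice, we
have assumed $t'' > t_1 > t_2 > \cdots > t_k > t'$.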

Expressions of higher order arising for example from 
R? are more complicated as there are quantities referring 
to several different times mixed up, but they all can be 
interpreted readily. One simply breaks up the ranges of 
integrations of the time variables into parts such that 
in each the order of time of each variable is definite.
One then interprets each part by formula (29).

As a simple example we may refer to the problem of
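the transition element

$$\Big\langle \chi_{t''} \Big| \int_{t'}^{t''} U(\mathbf{x}(t), t)\,dt \int_{t'}^{t''} V(\mathbf{x}(s), s)\,ds \Big| \psi_{t'} \Big\rangle_{S_p}$$

arising, say, in the cross term in $U$ and $V$ in an ordinary
second order perturbation problem (disregarding radiation)
with perturbation potential $U(\mathbf{x}, t) + V(\mathbf{x}, t)$. In
the integration on $s$ and $t$ which should include the
entire range of time for each, we can split the range of
$s$ into two parts, $s<t$ and $s>t$. In the first case, $s<t$,
the potential $V$ acts earlier than $U$, and in the other
range, vice versa, so that

$$\Big\langle \chi_{t''} \Big| \int_{t'}^{t''} U(\mathbf{x}(t), t)\,dt \int_{t'}^{t''} V(\mathbf{x}(s), s)\,ds \Big| \psi_{t'} \Big\rangle_{S_p}
= \int_{t'}^{t''} dt \int_{t'}^{t} ds \int \chi^*(\mathbf{x}_{t''})\,K(\mathbf{x}_{t''}, t''; \mathbf{x}_t, t)\,U(\mathbf{x}_t, t)\,K(\mathbf{x}_t, t; \mathbf{x}_s, s)\,V(\mathbf{x}_s, s)\,K(\mathbf{x}_s, s; \mathbf{x}_{t'}, t')\,\psi(\mathbf{x}_{t'})\,d^3\mathbf{x}_{t''}\,d^3\mathbf{x}_t\,d^3\mathbf{x}_s\,d^3\mathbf{x}_{t'}
+ \int_{t'}^{t''} dt \int_{t}^{t''} ds \int \chi^*(\mathbf{x}_{t''})\,K(\mathbf{x}_{t''}, t''; \mathbf{x}_s, s)\,V(\mathbf{x}_s, s)\,K(\mathbf{x}_s, s; \mathbf{x}_t, t)\,U(\mathbf{x}_t, t)\,K(\mathbf{x}_t, t; \mathbf{x}_{t'}, t')\,\psi(\mathbf{x}_{t'})\,d^3\mathbf{x}_{t''}\,d^3\mathbf{x}_s\,d^3\mathbf{x}_t\,d^3\mathbf{x}_{t'} \tag{30}$$

so that the single expression on the left is represented
by two terms analogous to the two terms required in
analyzing the Compton effect. It is in this way that
the several terms and their corresponding diagrams
corresponding to each process arise when an attempt
is made to represent the transition elements of single
expressions involving time integrals in terms of the
propagation kernels $K$.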
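The splitting into time-ordered pieces can be seen in a toy model where every path can be enumerated. The sketch below replaces the continuous coordinate by two discrete states and the short-time kernel by an arbitrary complex matrix (`kernel`); both are assumptions made only for illustration. It compares the brute-force sum over paths of $(\sum_t U(x_t))(\sum_s V(x_s))$ with the sum of ordered matrix products, one group for $s<t$ and one for $s>t$. On a discrete time grid there are also equal-time terms, which become a set of measure zero in the continuum limit.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n_states, n_steps = 2, 4               # tiny system so that all paths can be enumerated

# Hypothetical one-step kernel k[x_next, x_prev] and end-point wave functions.
kernel = rng.normal(size=(n_states, n_states)) + 1j * rng.normal(size=(n_states, n_states))
chi = rng.normal(size=n_states) + 1j * rng.normal(size=n_states)
psi = rng.normal(size=n_states) + 1j * rng.normal(size=n_states)
U = rng.normal(size=n_states)          # U(x): value of the first potential in each state
V = rng.normal(size=n_states)          # V(x): value of the second potential in each state

# Left side: sum over all paths of (sum_t U(x_t)) (sum_s V(x_s)) times the path amplitude.
path_sum = 0.0
for path in itertools.product(range(n_states), repeat=n_steps + 1):
    amplitude = np.conj(chi[path[-1]]) * psi[path[0]]
    for i in range(n_steps):
        amplitude *= kernel[path[i + 1], path[i]]
    path_sum += amplitude * sum(U[x] for x in path) * sum(V[x] for x in path)

def power(n):
    """Propagation over n time steps."""
    return np.linalg.matrix_power(kernel, n)

# Right side: products of kernels with U and V inserted in their order in time.
Ud, Vd = np.diag(U), np.diag(V)
ordered_sum = 0.0
for t in range(n_steps + 1):
    for s in range(n_steps + 1):
        if s < t:      # V acts earlier than U
            middle = power(n_steps - t) @ Ud @ power(t - s) @ Vd @ power(s)
        elif s > t:    # U acts earlier than V
            middle = power(n_steps - s) @ Vd @ power(s - t) @ Ud @ power(t)
        else:          # equal times, negligible in the continuum limit
            middle = power(n_steps - t) @ Ud @ Vd @ power(t)
        ordered_sum += np.conj(chi) @ middle @ psi

print(path_sum, ordered_sum)
```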

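It remains to study in more detail the term in (26)
arising from $\dot{\mathbf{x}}(t)\cdot\dot{\mathbf{x}}(s)$ in the interaction. The interpretation
of such expressions is considered in detail in C,
and we must refer to Eqs. (39) through (50) of that
paper for a more thorough analysis. A similar type of
term also arises in the Lagrangian formulation in
simpler problems, for example the transition element

$$\Big\langle \chi_{t''} \Big| \int_{t'}^{t''} \dot{\mathbf{x}}(t)\cdot\mathbf{A}(\mathbf{x}(t), t)\,dt \int_{t'}^{t''} \dot{\mathbf{x}}(s)\cdot\mathbf{B}(\mathbf{x}(s), s)\,ds \Big| \psi_{t'} \Big\rangle_{S_p}$$

arising say, in the cross term in $\mathbf{A}$ and $\mathbf{B}$ in a second-order
perturbation problem for a particle in a perturbing
vector potential $\mathbf{A}(\mathbf{x}, t) + \mathbf{B}(\mathbf{x}, t)$. The time
integrals must first be written as Riemannian sums, the
velocity (see (6)) being replaced by $\dot{\mathbf{x}}_i = \tfrac{1}{2}\epsilon^{-1}(\mathbf{x}_{i+1}-\mathbf{x}_i)
+ \tfrac{1}{2}\epsilon^{-1}(\mathbf{x}_i-\mathbf{x}_{i-1})$ so that we ask for the transition
element of

$$\sum_i \sum_j [\tfrac{1}{2}(\mathbf{x}_{i+1}-\mathbf{x}_i) + \tfrac{1}{2}(\mathbf{x}_i-\mathbf{x}_{i-1})]\cdot\mathbf{A}(\mathbf{x}_i, t_i)\;[\tfrac{1}{2}(\mathbf{x}_{j+1}-\mathbf{x}_j) + \tfrac{1}{2}(\mathbf{x}_j-\mathbf{x}_{j-1})]\cdot\mathbf{B}(\mathbf{x}_j, t_j). \tag{31}$$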
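In C it is shown that when converted to operator
notation the quantity $(x_{i+1}-x_i)/\epsilon$ is equivalent (nearly,
see below) to an operator,

$$(x_{i+1}-x_i)/\epsilon \rightarrow i(Hx - xH) \tag{32}$$

operating in order indicated by the time index $i$ (that
is after $x_l$’s for $l<i$ and before all $x_l$’s for $l>i$). In non-relativistic
mechanics $i(Hx - xH)$ is the momentum
operator $p_x$ divided by the mass $m$. Thus in (31) the
expression $[\tfrac{1}{2}(\mathbf{x}_{i+1}-\mathbf{x}_i) + \tfrac{1}{2}(\mathbf{x}_i-\mathbf{x}_{i-1})]\cdot\mathbf{A}(\mathbf{x}_i, t_i)$ becomes
$\epsilon(\mathbf{p}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{p})/2m$. Here again we must split the sum
into two regions $j<i$ and $j>i$ so the quantities in the
usual notation will operate in the right order such that
eventually (31) becomes identical with the right-hand
side of Eq. (30) but with $U(\mathbf{x}_t, t)$ replaced by the
operator

$$\frac{1}{2m}\Big(\frac{1}{i}\frac{\partial}{\partial\mathbf{x}_t}\cdot\mathbf{A}(\mathbf{x}_t, t) + \mathbf{A}(\mathbf{x}_t, t)\cdot\frac{1}{i}\frac{\partial}{\partial\mathbf{x}_t}\Big)$$

standing in the same place, and with the operator

$$\frac{1}{2m}\Big(\frac{1}{i}\frac{\partial}{\partial\mathbf{x}_s}\cdot\mathbf{B}(\mathbf{x}_s, s) + \mathbf{B}(\mathbf{x}_s, s)\cdot\frac{1}{i}\frac{\partial}{\partial\mathbf{x}_s}\Big)$$

standing in the place of $V(\mathbf{x}_s, s)$. The sums and factors
$\epsilon$ have now become $\int dt \int ds$.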

This is nearly but not quite correct, however, as there
is an additional term coming from the terms in the sum
corresponding to the special values $j=i$, $j=i+1$ and
$j=i-1$. We have tacitly assumed from the appearance
of the expression (31) that, for a given i, the contribution
from just three such special terms is of order $\epsilon^2$.
But this is not true. Although the expected contribution
of a term like $(x_{i+1}-x_i)(x_{j+1}-x_j)$ for $j\neq i$ is indeed of
order $\epsilon^2$, the expected contribution of $(x_{i+1}-x_i)^2$ is
$+i\epsilon m^{-1}$ [C, Eq. (50)], that is, of order $\epsilon$. In non-relativistic
mechanics the velocities are unlimited and
in very short times $\epsilon$ the amplitude diffuses a distance
proportional to the square root of the time. Making
use of this equation then we see that the additional
contribution from these terms is essentially

$$i\epsilon m^{-1}\sum_i \mathbf{A}(\mathbf{x}_i,t_i)\cdot\mathbf{B}(\mathbf{x}_i,t_i) = i m^{-1}\int \mathbf{A}(\mathbf{x}(t),t)\cdot\mathbf{B}(\mathbf{x}(t),t)\,dt$$

when summed on all i. This has the same effect as a
first-order perturbation due to a potential $\mathbf{A}\cdot\mathbf{B}/m$.
Added to the term involving the momentum operators
we therefore have an additional term$^{12}$


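$$\frac{i}{m}\int dt\int \chi^*(\mathbf{x}_{t''})\,K_0(\mathbf{x}_{t''},t'';\mathbf{x}_t,t)\,\mathbf{A}(\mathbf{x}_t,t)\cdot\mathbf{B}(\mathbf{x}_t,t)\,K_0(\mathbf{x}_t,t;\mathbf{x}_{t'},t')\,\psi(\mathbf{x}_{t'})\,d^3\mathbf{x}_{t''}\,d^3\mathbf{x}_t\,d^3\mathbf{x}_{t'}. \tag{33}$$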


In the usual Hamiltonian theory this term arises, of
course, from the term $\mathbf{A}^2/2m$ in the expansion of the
Hamiltonian

$$H = (2m)^{-1}(\mathbf{p}-\mathbf{A})^2 = (2m)^{-1}(\mathbf{p}^2-\mathbf{p}\cdot\mathbf{A}-\mathbf{A}\cdot\mathbf{p}+\mathbf{A}^2)$$

while the other term arises from the second-order action
of $\mathbf{p}\cdot\mathbf{A}+\mathbf{A}\cdot\mathbf{p}$. We shall not be interested in non-relativistic
quantum electrodynamics in detail. The
situation is simpler for Dirac electrons. For particles
satisfying the Klein-Gordon equation (discussed in
Appendix A) the situation is very similar to a four-dimensional
analog of the non-relativistic case given
here.


6. EXTENSION TO DIRAC PARTICLES


Expressions (24) and (25) and their proof can be
readily generalized to the relativistic case according to
the one electron theory of Dirac. We shall discuss the
hole theory later. In the non-relativistic case we began
with the proposition that the amplitude for a particle
to proceed from one point to another is the sum over
paths of $\exp(iS_p)$, that is, we have for example for a
transition element

$$\langle\chi_{t''}|1|\psi_{t'}\rangle = \lim_{\epsilon\to 0}\int\cdots\int \chi^*(\mathbf{x}_N)\,\Phi_p(\mathbf{x}_N,\mathbf{x}_{N-1},\ldots,\mathbf{x}_0)\,\psi(\mathbf{x}_0)\,d^3\mathbf{x}_0\,d^3\mathbf{x}_1\cdots d^3\mathbf{x}_N \tag{34}$$

where for $\exp(iS_p)$ we have written $\Phi_p$, that is more
precisely,

$$\Phi_p = \prod_i A^{-1}\exp iS(\mathbf{x}_{i+1},\mathbf{x}_i).$$

As discussed in C this form is related to the usual
form of quantum mechanics through the observation
that

$$(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon = A^{-1}\exp[iS(\mathbf{x}_{i+1},\mathbf{x}_i)] \tag{35}$$

where $(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon$ is the transformation matrix from a
representation in which x is diagonal at time $t_i$ to one
in which x is diagonal at time $t_{i+1}=t_i+\epsilon$ (so that it is
identical to $K_0(\mathbf{x}_{i+1},t_{i+1};\mathbf{x}_i,t_i)$ for the small time
interval $\epsilon$). Hence the amplitude for a given path can
also be written

$$\Phi_p = \prod_i(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon \tag{36}$$

for which form, of course, (34) is exact irrespective of
whether $(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon$ can be expressed in the simple form
(35).


For a Dirac electron the $(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon$ is a $4\times4$ matrix


12 The term corresponding to this for the self-energy expression
(26) would give an integral over $\delta_+((t-t)^2-(\mathbf{x}_t-\mathbf{x}_t)^2)$ which is
evidently infinite and leads to the quadratically divergent self-energy.
There is no such term for the Dirac electron, but there
is for Klein-Gordon particles. We shall not discuss the infinities
in this paper as they have already been discussed in II.
(or $4^N\times4^N$ if we deal with N electrons) but the expression
(34) with (36) is still correct (as it is in fact for
any quantum-mechanical system with a sufficiently
general definition of the coordinate $\mathbf{x}_i$). The product
(36) now involves operators; the order in which the
factors are to be taken is the order in which the terms
appear in time.

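For a Dirac particle in a vector and scalar potential
(times the electron charge e) $\mathbf{A}(\mathbf{x},t)$, $A_4(\mathbf{x},t)$, the
quantity $(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon^{(A)}$ is related to that of a free particle
to the first order in $\epsilon$ as

$$(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon^{(A)} = (\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon^{(0)}\exp[-i(\epsilon A_4(\mathbf{x}_i,t_i)-(\mathbf{x}_{i+1}-\mathbf{x}_i)\cdot\mathbf{A}(\mathbf{x}_i,t_i))]. \tag{37}$$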


This can be verified directly by substitution into the
Dirac equation.$^{13}$ It neglects the variation of $\mathbf{A}$ and $A_4$
with time and space during the short interval $\epsilon$. This
produces errors only of order $\epsilon^2$ in the Dirac case for
the expected square velocity $(\mathbf{x}_{i+1}-\mathbf{x}_i)^2/\epsilon^2$ during the
interval $\epsilon$ is finite (equaling the square of the velocity
of light) rather than being of order $1/\epsilon$ as in the non-relativistic
case. [This makes the relativistic case
somewhat simpler in that it is not necessary to define
the velocity as carefully as in (6); $(\mathbf{x}_{i+1}-\mathbf{x}_i)/\epsilon$ is
sufficiently exact, and no term analogous to (33) arises.]

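Thus $\Phi_p^{(A)}$ differs from that for a free particle, $\Phi_p^{(0)}$,
by a factor $\prod_i\exp -i(\epsilon A_4(\mathbf{x}_i,t_i)-(\mathbf{x}_{i+1}-\mathbf{x}_i)\cdot\mathbf{A}(\mathbf{x}_i,t_i))$
which in the limit can be written as

$$\exp\Big\{-i\int[A_4(\mathbf{x}(t),t)-\dot{\mathbf{x}}(t)\cdot\mathbf{A}(\mathbf{x}(t),t)]\,dt\Big\} \tag{38}$$

exactly as in the non-relativistic case.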

The case of a Dirac particle interacting with the
quantum-mechanical oscillators representing the field
may now be studied. Since the dependence of $\Phi_p^{(A)}$ on
$\mathbf{A}$, $A_4$ is through the same factor as in the non-relativistic
case, when $\mathbf{A}$, $A_4$ are expressed in terms of the
oscillator coordinates q, the dependence of $\Phi$ on the
oscillator coordinates q is unchanged. Hence the entire
analysis of the preceding sections which concern the
results of the integration over oscillator coordinates
can be carried through unchanged and the results will
be expression (25) with formula (24) for R. Expression
(25) is now interpreted as


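$$\langle\chi_{t''}|\exp iR|\psi_{t'}\rangle = \lim_{\epsilon\to0}\int \chi^*(\mathbf{x}_{t''}^{(1)},\mathbf{x}_{t''}^{(2)},\ldots)\,\prod_n\big(\Phi_{p,n}^{(0)}\,d^3\mathbf{x}_{t''}^{(n)}\,d^3\mathbf{x}_{t''-\epsilon}^{(n)}\cdots d^3\mathbf{x}_{t'}^{(n)}\big)\,\exp(iR)\,\psi(\mathbf{x}_{t'}^{(1)},\mathbf{x}_{t'}^{(2)},\ldots) \tag{39}$$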


13 Alternatively, note that Eq. (37) is exact for arbitrarily large
$\epsilon$ if the potential $A_\mu$ is constant. For if the potential in the Dirac
equation is the gradient of a scalar function, $A_\mu=\partial\chi/\partial x_\mu$, the
potential may be removed by replacing the wave function by
$\psi=\exp(-i\chi)\psi'$ (gauge transformation). This alters the kernel by
a factor $\exp[-i(\chi(2)-\chi(1))]$ owing to the change in the initial
and final wave functions. A constant potential $A_\mu$ is the gradient
of $\chi=A_\mu x_\mu$ and can be completely removed by this gauge
transformation, so that the kernel differs from that of a free particle
by the factor $\exp[-i(A_\mu x_{\mu,2}-A_\mu x_{\mu,1})]$ as in (37).
where $\Phi_{p,n}^{(0)}$, the amplitude for a particular path for
particle n, is simply the expression (36) where $(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon$
is the kernel $K_{0,n}(\mathbf{x}_{i+1}^{(n)},t_{i+1};\mathbf{x}_i^{(n)},t_i)$ for a free electron
according to the one electron Dirac theory, with
the matrices which appear operating on the spinor
indices corresponding to particle (n) and the order of
all operations being determined by the time indices.

For calculational purposes we can, as before, expand
R as a power series and evaluate the various terms in
the same manner as for the non-relativistic case. In
such an expansion the quantity $\dot{\mathbf{x}}(t)$ is replaced, as we
have seen in (32), by the operator $i(H\mathbf{x}-\mathbf{x}H)$, that is,
in this case by $\boldsymbol{\alpha}$ operating at the corresponding time.
There is no further complicated term analogous to (33)
arising in this case, for the expected value of $(x_{i+1}-x_i)^2$
is now of order $\epsilon^2$ rather than $\epsilon$.

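For example, for self-energy one sees that expression
(28) will be (with other terms coming from those with
$\dot{\mathbf{x}}(t)$ replaced by $\boldsymbol{\alpha}$ and with the usual $\beta$ in back of
each $K_0$ because of the definition of $K_0$ in relativity
theory)

$$\langle\chi_{t''}|R|\psi_{t'}\rangle_{\mathrm{S.E.}} = -e^2\int \chi^*(\mathbf{x}_{t''})\,K_0(\mathbf{x}_{t''},t'';\mathbf{x}_t,t)\,\beta\alpha_\mu\,\delta_+((t-s)^2-(\mathbf{x}_t-\mathbf{x}_s)^2)\,K_0(\mathbf{x}_t,t;\mathbf{x}_s,s)\,\beta\alpha_\mu\,K_0(\mathbf{x}_s,s;\mathbf{x}_{t'},t')\,\beta\psi(\mathbf{x}_{t'})\,d^3\mathbf{x}_{t''}\,d^3\mathbf{x}_t\,d^3\mathbf{x}_s\,d^3\mathbf{x}_{t'}\,dt\,ds, \tag{40}$$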


where $\alpha_4=1$, $\alpha_{1,2,3}=\alpha_{x,y,z}$, and a sum on the repeated
index $\mu$ is implied in the usual way: $a_\mu b_\mu = a_4b_4-a_1b_1-a_2b_2-a_3b_3$.
One can change $\beta\alpha_\mu$ to $\gamma_\mu$ and $\chi^*$ to $\bar{\chi}\beta$.
In this manner all of the rules referring to virtual
photons discussed in II are deduced; but with the
difference that $K_0$ is used instead of $K_+$ and we have
the Dirac one electron theory with negative energy
states (although we may have any number of such
electrons).


7. EXTENSION TO POSITRON THEORY 


Since in (39) we have an arbitrary number of elec- 
trons, we can deal with the hole theory in the usual 
manner by imagining that we have an infinite number 
of electrons in negative energy states.

On the other hand, in paper I on the theory of
positrons, it was shown that the results of the hole
theory in a system with a given external potential $A_\mu$
were equivalent to those of the Dirac one electron
theory if one replaced the propagation kernel, $K_0$, by a
different one, $K_+$, and multiplied the resultant amplitude
by a factor $C_v$ involving $A_\mu$. We must now see how
this relation, derived in the case of external potentials,
can also be carried over in electrodynamics to be
useful in simplifying expressions involving the infinite
sea of electrons.

To do this we study in greater detail the relation
between a problem involving virtual photons and one
involving purely external potentials. In using (25) we
shall assume in accordance with the hole theory that
the number of electrons is infinite, but that they all
have the same charge, e. Let the states $\psi_{t'}$, $\chi_{t''}$ represent
the vacuum plus perhaps a number of real electrons
in positive energy states and perhaps also some empty
negative energy states. Let us call the amplitude for
the transition in an external potential $B_\mu$, but excluding
virtual photons, $T_0[B]$, a functional of $B_\mu(1)$. We have
seen (38)


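$$T_0[B] = \langle\chi_{t''}|\exp iP|\psi_{t'}\rangle \tag{41}$$

where

$$P = -\sum_n\int[B_4(\mathbf{x}^{(n)}(t),t)-\dot{\mathbf{x}}^{(n)}(t)\cdot\mathbf{B}(\mathbf{x}^{(n)}(t),t)]\,dt$$

by (38). We can write this as

$$P = -\sum_n\int B_\mu(x_\nu^{(n)}(t))\,\dot{x}_\mu^{(n)}(t)\,dt$$

where $x_4(t)=t$ and $\dot{x}_4=1$, the other values of $\mu$ corresponding
to space variables. The corresponding amplitude
for the same process in the same potential, but
including all the virtual photons we may call

$$T_{e^2}[B] = \langle\chi_{t''}|\exp(iR)\exp(iP)|\psi_{t'}\rangle. \tag{42}$$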


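Now let us consider the effect on $T_{e^2}[B]$ of changing
the coupling $e^2$ of the virtual photons. Differentiating
(42) with respect to $e^2$, which appears only$^{14}$ in R, we find

$$dT_{e^2}[B]/d(e^2) = \Big\langle\chi_{t''}\Big|-\frac{i}{2}\sum_n\sum_m\int\!\!\int dt\,ds\,\dot{x}_\mu^{(n)}(t)\,\dot{x}_\mu^{(m)}(s)\,\delta_+\big((x_\nu^{(n)}(t)-x_\nu^{(m)}(s))^2\big)\exp i(R+P)\Big|\psi_{t'}\Big\rangle. \tag{43}$$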


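We can also study the first-order effect of a change
of $B_\mu$:

$$\delta T_{e^2}[B]/\delta B_\mu(1) = -i\Big\langle\chi_{t''}\Big|\sum_n\int dt\,\dot{x}_\mu^{(n)}(t)\,\delta^4(x_\alpha^{(n)}(t)-x_{\alpha,1})\exp i(R+P)\Big|\psi_{t'}\Big\rangle \tag{44}$$

where $x_{\alpha,1}$ is the field point at which the derivative with
respect to $B_\mu$ is taken$^{15}$ and the term (current density)
$-\sum_n\int dt\,\dot{x}_\mu^{(n)}(t)\,\delta^4(x_\alpha^{(n)}(t)-x_{\alpha,1})$ is just $\delta P/\delta B_\mu(1)$.
The function $\delta^4(x_\alpha^{(n)}-x_{\alpha,1})$ means $\delta(x_4^{(n)}-x_{4,1})$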

14 In changing the charge $e^2$ we mean to vary only the degree
to which virtual photons are important. We do not contemplate
changes in the influence of the external potentials. If one wishes,
as e is raised the strength of the potential is decreased proportionally
so that $B_\mu$, the potential times the charge e, is held
constant.

15 The functional derivative is defined such that if $T[B]$ is a
number depending on the functions $B_\mu(1)$, the first order variation
in T produced by a change from $B_\mu$ to $B_\mu+\Delta B_\mu$ is given by

$$T[B+\Delta B]-T[B] = \int(\delta T[B]/\delta B_\mu(1))\,\Delta B_\mu(1)\,d\tau_1,$$

the integral extending over all four-space $x_{\alpha,1}$.
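$\times\,\delta(x_3^{(n)}-x_{3,1})\,\delta(x_2^{(n)}-x_{2,1})\,\delta(x_1^{(n)}-x_{1,1})$, that is, $\delta(2,1)$
with $x_{\alpha,2}=x_\alpha^{(n)}(t)$. A second variation of T gives, by
differentiation of (44) with respect to $B_\nu(2)$,

$$\delta^2T_{e^2}[B]/\delta B_\mu(1)\,\delta B_\nu(2) = -\Big\langle\chi_{t''}\Big|\sum_n\sum_m\int dt\int ds\,\dot{x}_\mu^{(n)}(t)\,\dot{x}_\nu^{(m)}(s)\,\delta^4(x_\alpha^{(n)}(t)-x_{\alpha,1})\,\delta^4(x_\beta^{(m)}(s)-x_{\beta,2})\exp i(R+P)\Big|\psi_{t'}\Big\rangle.$$

Comparison of this with (43) shows that

$$dT_{e^2}[B]/d(e^2) = \tfrac12 i\int\!\!\int(\delta^2T_{e^2}[B]/\delta B_\mu(1)\,\delta B_\mu(2))\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2 \tag{45}$$

where $s_{12}^2=(x_{\mu,1}-x_{\mu,2})(x_{\mu,1}-x_{\mu,2})$.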

We now proceed to use this equation to prove the
validity of the rules given in II for electrodynamics.
This we do by the following argument. The equation
can be looked upon as a differential equation for $T_{e^2}[B]$.
It determines $T_{e^2}[B]$ uniquely if $T_0[B]$ is known. We
have shown it is valid for the hole theory of positrons.
But in I we have given formulas for calculating $T_0[B]$
whose correctness relative to the hole theory we have
there demonstrated. Hence we have shown that the
$T_{e^2}[B]$ obtained by solving (45) with the initial condition
$T_0[B]$ as given by the rules in I will be equal to
that given for the same problem by the second quantization
theory of the Dirac matter field coupled with
the quantized electromagnetic field. But it is evident
(the argument is given in the next paragraph) that the
rules$^{16}$ given in II constitute a solution in power series
in $e^2$ of the Eq. (45) [which for $e^2=0$ reduce to the
$T_0[B]$ given in I]. Hence the rules in II must give, to
each order in $e^2$, the matrix element for any process
that would be calculated by the usual theory of second
quantization of the matter and electromagnetic fields.
This is what we aimed to prove.

That the rules of II represent, in a power series
expansion, a solution of (45) is clear. For the rules
there given may be stated as follows: Suppose that we
have a process to order k in $e^2$ (i.e., having k virtual
photons) and order n in the external potential $B_\mu$.
Then, the matrix element for the process with one more
virtual photon and two less potentials is that obtained from

16 That is, of course, those rules of II which apply to the unmodified
electrodynamics of Dirac electrons. (The limitation
excluding real photons in the initial and final states is removed
in Sec. 9.) The same arguments clearly apply to nucleons interacting
via neutral vector mesons, vector coupling. Other couplings
require a minor extension of the argument. The modification
to the $(\mathbf{x}_{i+1}|\mathbf{x}_i)_\epsilon$, as in (37), produced by some couplings cannot
very easily be written without using operators in the exponents.
These operators can be treated as numbers if their order of operation
is maintained to be always their order in time. This idea
will be discussed and applied more generally in a succeeding paper.


the previous matrix by choosing from the n potentials a
pair, say $B_\mu(1)$ acting at 1 and $B_\nu(2)$ acting at 2, replacing
them by $ie^2\delta_{\mu\nu}\delta_+(s_{12}^2)$, adding the results for each way of
choosing the pair, and dividing by k+1, the present
number of photons. The matrix with no virtual photons
(k=0) being given to any n by the rules of I, this
permits terms to all orders in $e^2$ to be derived by
recursion. It is evident that the rule in italics is that of
II, and equally evident that it is a word expression of
Eq. (45). [The factor $\tfrac12$ in (45) arises since in integrating
over all $d\tau_1$ and $d\tau_2$ we count each pair twice. The
division by k+1 is required by the rules of II for,
there, each diagram is to be taken only once, while in
the rule given above we say what to do to add one
extra virtual photon to k others. But which one of the
k+1 is to be identified as the last photon added is
irrelevant. It agrees with (45) of course for it is canceled
on differentiating with respect to $e^2$ the factor $(e^2)^{k+1}$
for the (k+1) photons.]
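
The counting behind the division by k+1 can be checked with a small sketch. The function names below are illustrative and do not come from the text; the sketch only counts pairings and ignores the integrals and the factor $ie^2\delta_{\mu\nu}\delta_+(s_{12}^2)$ attached to each pair. It applies the recursive rule (choose a pair among the remaining potentials, divide by the present number of photons) and compares the weighted count with the number of distinct ways of joining 2k of n potentials into k unordered pairs, so that each diagram is indeed taken only once.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def pairings_by_rule(n, k):
    """Weighted number of terms produced by applying the recursive rule k times
    to n potentials: pick a pair, then divide by the present photon number."""
    def step(remaining, photons):
        if photons == k:
            return Fraction(1)
        total = Fraction(0)
        for pair in combinations(remaining, 2):
            rest = tuple(x for x in remaining if x not in pair)
            # One more virtual photon has been added; divide by the new count.
            total += step(rest, photons + 1) / (photons + 1)
        return total
    return step(tuple(range(n)), 0)


def distinct_pairings(n, k):
    """Number of ways to join 2k of n labelled potentials into k unordered pairs."""
    return factorial(n) // (factorial(n - 2 * k) * 2 ** k * factorial(k))


if __name__ == "__main__":
    for n, k in [(4, 1), (4, 2), (6, 3)]:
        print(n, k, pairings_by_rule(n, k), distinct_pairings(n, k))
```

Without the division the same set of k pairs would be reached in k! different orders; the product of the divisors 1, 2, ..., k removes exactly that overcounting.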


8. GENERALIZED FORMULATION OF QUANTUM 
ELECTRODYNAMICS 


The relation implied by (45) between the formal 
solution for the amplitude for a process in an arbitrary 
unquantized external potential to that in a quantized 
field appears to be of much wider generality. We shall 
discuss the relation from a more general point of view 
here (still limiting ourselves to the case of no photons 
in initial or final state). 

In earlier sections we pointed out that as a conse- 
quence of the Lagrangian form of quantum mechanics 
the aspects of the particles’ motions and the behavior 
of the field could be analyzed separately. What we did 
was to integrate over the field oscillator coordinates 
first. We could, in principle, have integrated over the 
particle variables first. That is, we first solve the 
problem with the action of the particles and their 
interaction with the field and then multiply by the 
exponential of the action of the field and integrate over 
all the field oscillator coordinates. (For simplicity of 
discussion let us put aside from detailed special con- 
sideration the questions involving the separation of the 
longitudinal and transverse parts of the field.2) Now 
the integral over the particle coordinates for a given 
process is precisely the integral required for the analysis 
of the motion of the particles in an unquantized po- 
tential. With this observation we may suggest a 
generalization to all types of systems. 

Let us suppose the formal solution for the amplitude 
for some given process with matter in an external 
potential B,(1) is some numerical quantity To. We 
mean matter in a more general sense now, for the 
motion of the matter may be described by the Dirac 
equation, or by the Klein-Gordon equation, or may 
involve charged or neutral particles other than electrons 
and positrons in any manner whatsoever. The quantity 
To depends of course on the potential function B,(1); 
that is, it is a functional To[B] of this potential. We 
assume we have some expression for it in terms of $B_\mu$
(exact, or to some desired degree of approximation in
the strength of the potential).

Then the answer $T_{e^2}[B]$ to the corresponding problem
in quantum electrodynamics is $T_0[A_\mu(1)+B_\mu(1)]$
$\times\exp(iS_0)$ summed over all possible distributions of
field $A_\mu(1)$, wherein $S_0$ is the action for the field,
$S_0 = -(8\pi e^2)^{-1}\sum_\mu\int((\partial A_\mu/\partial t)^2-(\nabla A_\mu)^2)\,d^3\mathbf{x}\,dt$, the sum
on $\mu$ carrying the usual minus sign for space components.

If $F[A]$ is any functional of $A_\mu(1)$ we shall represent
by ${}_0|F[A]|_0$ this superposition of $F[A]\exp(iS_0)$ over
distributions of $A_\mu$ for the case in which there are no
photons in initial or final state. That is, we have

$$T_{e^2}[B] = {}_0|T_0[A+B]|_0. \tag{46}$$

The evaluation of ${}_0|F[A]|_0$ directly from the definition
of the operation ${}_0|\ |_0$ is not necessary. We can
give the result in another way. We first note that the
operation is linear,

$${}_0|F_1[A]+F_2[A]|_0 = {}_0|F_1[A]|_0+{}_0|F_2[A]|_0 \tag{47}$$

so that if F is represented as a sum of terms each term
can be analyzed separately. We have studied essentially
the case in which $F[A]$ is an exponential function. In
fact, what we have done in Section 4 may be repeated
with slight modification to show that

$${}_0\Big|\exp\Big(-i\int j_\mu(1)A_\mu(1)\,d\tau_1\Big)\Big|_0 = \exp\Big(-\tfrac12 ie^2\int\!\!\int j_\mu(1)\,j_\mu(2)\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2\Big) \tag{48}$$

where $j_\mu(1)$ is an arbitrary function of position and
time for each value of $\mu$.

Although this gives the evaluation of ${}_0|\ |_0$ for only
a particular functional of $A_\mu$, the appearance of the
arbitrary function $j_\mu(1)$ makes it sufficiently general to
permit the evaluation for any other functional. For it
is to be expected that any functional can be represented
as a superposition of exponentials with different functions
$j_\mu(1)$ (by analogy with the principle of Fourier
integrals for ordinary functions). Then, by (47), the
result of the operation is the corresponding superposition
of expressions equal to the right-hand side of (48)
with the various j's substituted for $j_\mu$.

In many applications $F[A]$ can be given as a power
series in $A_\mu$:

$$F[A] = f_0+\int f_\mu(1)A_\mu(1)\,d\tau_1+\int\!\!\int f_{\mu\nu}(1,2)\,A_\mu(1)A_\nu(2)\,d\tau_1\,d\tau_2+\cdots \tag{49}$$


where $f_0$, $f_\mu(1)$, $f_{\mu\nu}(1,2)\cdots$ are known numerical functions
independent of $A_\mu$. Then by (47)

$${}_0|F[A]|_0 = f_0+\int f_\mu(1)\,{}_0|A_\mu(1)|_0\,d\tau_1+\int\!\!\int f_{\mu\nu}(1,2)\,{}_0|A_\mu(1)A_\nu(2)|_0\,d\tau_1\,d\tau_2+\cdots \tag{50}$$


where we set ${}_0|1|_0=1$ (from (48) with $j_\mu=0$). We can
work out expressions for the successive powers of $A_\mu$
by differentiating both sides of (48) successively with
respect to $j_\mu$ and setting $j_\mu=0$ in each derivative. For
example, the first variation (derivative) of (48) with
respect to $j_\mu(3)$ gives

$${}_0\Big|-iA_\mu(3)\exp\Big(-i\int j_\nu(1)A_\nu(1)\,d\tau_1\Big)\Big|_0 = -ie^2\int\delta_+(s_{31}^2)\,j_\mu(1)\,d\tau_1\,\exp\Big(-\tfrac12 ie^2\int\!\!\int j_\nu(1)\,j_\nu(2)\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2\Big). \tag{51}$$

Setting $j_\mu=0$ gives

$${}_0|A_\mu(3)|_0 = 0.$$


Differentiating (51) again with respect to $j_\nu(4)$ and
setting $j_\nu=0$ shows

$${}_0|A_\mu(3)A_\nu(4)|_0 = ie^2\delta_{\mu\nu}\delta_+(s_{34}^2) \tag{52}$$

and so on for higher powers. These results may be
substituted into (50). Clearly therefore when $T_0[B+A]$
in (46) is expanded in a power series and the successive
terms are computed in this way, we obtain the results
given in II.

It is evident that (46), (47), (48) imply that $T_{e^2}[B]$
satisfies the differential equation (45) and conversely
(45) with the definition (46) implies (47) and (48). For
if $T_0[B]$ is an exponential

$$T_0[B] = \exp\Big(-i\int j_\mu(1)B_\mu(1)\,d\tau_1\Big) \tag{53}$$

we have from (46), (48) that

$$T_{e^2}[B] = \exp\Big[-\tfrac12 ie^2\int\!\!\int j_\mu(1)\,j_\mu(2)\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2\Big]\exp\Big[-i\int j_\nu(1)B_\nu(1)\,d\tau_1\Big]. \tag{54}$$

Direct substitution of this into Eq. (45) shows it to be a
solution satisfying the boundary condition (53). Since the
differential equation (45) is linear, if $T_0[B]$ is a superposition
of exponentials, the corresponding superposition
of solutions (54) is also a solution.
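
The direct substitution can be illustrated numerically in a finite-dimensional analog. In the sketch below the space-time points are replaced by three discrete labels, the integrals by sums, and $\delta_+(s_{12}^2)\,d\tau_1 d\tau_2$ by an arbitrary symmetric complex matrix `D`; all of these, together with the function names and the sample numbers, are assumptions made for illustration only. The two sides of the analog of (45) are compared by central finite differences for the exponential solution (54).

```python
import cmath


def T(e2, B, j, D):
    """Finite-dimensional analog of Eq. (54): T = exp(-i e2 j.D.j / 2) exp(-i j.B)."""
    n = len(j)
    quad = sum(j[a] * D[a][b] * j[b] for a in range(n) for b in range(n))
    lin = sum(j[a] * B[a] for a in range(n))
    return cmath.exp(-0.5j * e2 * quad) * cmath.exp(-1j * lin)


def d_de2(e2, B, j, D, h=1e-5):
    """Central-difference derivative of T with respect to e2."""
    return (T(e2 + h, B, j, D) - T(e2 - h, B, j, D)) / (2 * h)


def d2_dB(e2, B, j, D, a, b, h=1e-3):
    """Central-difference second derivative of T with respect to B[a] and B[b]."""
    def shifted(sa, sb):
        Bs = list(B)
        Bs[a] += sa * h
        Bs[b] += sb * h
        return T(e2, Bs, j, D)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)


def both_sides(e2, B, j, D):
    """Left and right sides of the discrete analog of Eq. (45)."""
    n = len(j)
    left = d_de2(e2, B, j, D)
    right = 0.5j * sum(D[a][b] * d2_dB(e2, B, j, D, a, b)
                       for a in range(n) for b in range(n))
    return left, right


# Arbitrary sample values (hypothetical, chosen only to exercise the check).
j = [0.7, -0.4, 0.2]
B = [0.3, 0.1, -0.5]
D = [[0.5 + 0.2j, 0.1 - 0.3j, 0.0 + 0.1j],
     [0.1 - 0.3j, -0.2 + 0.4j, 0.3 + 0.0j],
     [0.0 + 0.1j, 0.3 + 0.0j, 0.6 - 0.1j]]
left, right = both_sides(0.3, B, j, D)

if __name__ == "__main__":
    print(left, right)
```

The two numbers agree to within the finite-difference error, which is the discrete counterpart of the statement that (54) solves (45) with the boundary condition (53).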
Many of the formal representations of the matter
system (such as that of second quantization of Dirac 
electrons) represent the interaction with a fixed po- 
tential in a formal exponential form such as the left- 
hand side of (48), except that j,(1) is an operator 
instead of a numerical function. Equation (48) may 
still be used if care is exercised in defining the order of 
the operators on the right-hand side. The succeeding 
paper will discuss this in more detail. 

Equation (45) or its solution (46), (47), (48) constitutes
a very general and convenient formulation of the
laws of quantum electrodynamics for virtual processes.
Its relativistic invariance is evident if it is assumed that
the unquantized theory giving $T_0[B]$ is invariant. It
has been proved to be equivalent to the usual formulation
for Dirac electrons and positrons (for Klein-Gordon
particles see Appendix A). It is suggested that it is of
wide generality. It is expressed in a form which has
meaning even if it is impossible to express the matter
system in Hamiltonian form; in fact, it only requires
the existence of an amplitude for fixed potentials which
obeys the principle of superposition of amplitudes. If
$T_0[B]$ is known in power series in B, calculations of
$T_{e^2}[B]$ in a power series of $e^2$ can be made directly
using the italicized rule of Sec. 7. The limitation to
virtual quanta is removed in the next section.

On the other hand, the formulation is unsatisfactory
because for situations of importance it gives divergent
results, even if $T_0[B]$ is finite. The modification proposed
in II of replacing $\delta_+(s_{12}^2)$ in (45), (48) by $f_+(s_{12}^2)$
is not satisfactory owing to the loss of the theorems of
conservation of energy or probability discussed in II at
the end of Sec. 6. There is the additional difficulty in
positron theory that even $T_0[B]$ is infinite to begin
with (vacuum polarization). Computational ways of
avoiding these troubles are given in II and in the references
of footnote 2.


9. CASE OF REAL PHOTONS 


The case in which there are real photons in the initial
or the final state can be worked out from the beginning
in the same manner.$^{17}$ We first consider the case of a
system interacting with a single oscillator. From this
result the generalization will be evident. This time we
shall calculate the transition element between an initial
state in which the particle is in state $\psi_{t'}$ and the
oscillator is in its nth eigenstate (i.e., there are n photons
in the field) to a final state with particle in $\chi_{t''}$, oscillator
in mth level. As we have already discussed, when the
coordinates of the oscillator are eliminated the result
is the transition element $\langle\chi_{t''}|G_{mn}|\psi_{t'}\rangle$ where

$$G_{mn} = \int\!\!\int\varphi_m^*(q_j)\,k(q_j,t'';q_0,t')\,\varphi_n(q_0)\,dq_0\,dq_j \tag{11}$$

where $\varphi_m$, $\varphi_n$ are the wave functions$^{8}$ for the oscillator


17 For an alternative method starting directly from the formula 
(24) for virtual photons, see Appendix B. 
in state m, n and k is given in (12). The $G_{mn}$ can be
evaluated most easily by calculating the generating
function

$$g(X,Y) = \sum_m\sum_n G_{mn}X^mY^n(m!\,n!)^{-1/2} \tag{55}$$

for arbitrary X, Y. If expression (11) is substituted in
the left-hand side of (55), the expression can be simplified
by use of the generating function relation for the
eigenfunctions$^{8}$ of the harmonic oscillator

$$\sum_n\varphi_n(q_0)\,Y^n(n!)^{-1/2} = (\omega/\pi)^{1/4}\exp(-\tfrac12 i\omega t')\,\exp\tfrac12\big[\omega q_0^2-(Y\exp[-i\omega t']-(2\omega)^{1/2}q_0)^2\big].$$
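
The generating function relation can be checked numerically with a short sketch. It assumes, as the phase factors in the relation suggest, that $\varphi_n$ denotes the normalized oscillator eigenfunction including its time factor, $\varphi_n(q,t) = (\omega/\pi)^{1/4}(2^n n!)^{-1/2}H_n(\omega^{1/2}q)\exp(-\tfrac12\omega q^2)\exp(-i(n+\tfrac12)\omega t)$ with $\hbar=1$ and unit mass; this convention and the function names are assumptions, not taken from the text.

```python
import cmath
import math


def series_side(q, t, Y, omega, nmax=60):
    """Sum over n of phi_n(q, t) Y^n / sqrt(n!), truncated at nmax terms."""
    x = math.sqrt(omega) * q
    envelope = (omega / math.pi) ** 0.25 * math.exp(-0.5 * omega * q * q)
    h_prev, h = 0.0, 1.0   # normalized Hermite values H_n / sqrt(2^n n!) for n = -1, 0
    coeff = 1.0 + 0j       # running value of Y^n / sqrt(n!)
    total = 0j
    for n in range(nmax):
        phase = cmath.exp(-1j * (n + 0.5) * omega * t)
        total += envelope * h * phase * coeff
        # Recurrence H_{n+1} = 2x H_n - 2n H_{n-1}, rewritten for normalized values.
        h_prev, h = h, x * math.sqrt(2.0 / (n + 1)) * h - math.sqrt(n / (n + 1)) * h_prev
        coeff *= Y / math.sqrt(n + 1)
    return total


def closed_form(q, t, Y, omega):
    """Right-hand side of the generating function relation."""
    z = Y * cmath.exp(-1j * omega * t) - math.sqrt(2 * omega) * q
    prefactor = (omega / math.pi) ** 0.25 * cmath.exp(-0.5j * omega * t)
    return prefactor * cmath.exp(0.5 * (omega * q * q - z * z))


if __name__ == "__main__":
    print(series_side(0.4, 0.7, 0.6 + 0.2j, 1.3))
    print(closed_form(0.4, 0.7, 0.6 + 0.2j, 1.3))
```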


Using a similar expansion for the $\varphi_m^*$ one is left with
the exponential of a quadratic function of $q_0$ and $q_j$.
The integration on $q_0$ and $q_j$ is then easily performed
to give

$$g(X,Y) = G_{00}\exp(XY+i\beta^*X+i\beta Y) \tag{56}$$
from which expansion in powers of X and Y and
comparison to (11) gives the final result

$$G_{mn} = G_{00}(m!\,n!)^{-1/2}\sum_r\frac{m!}{(m-r)!\,r!}\,\frac{n!}{(n-r)!\,r!}\,r!\,(i\beta^*)^{m-r}(i\beta)^{n-r} \tag{57}$$

where $G_{00}$ is given in (14) and

$$\beta = (2\omega)^{-1/2}\int_{t'}^{t''}\gamma(t)\exp(-i\omega t)\,dt,\qquad \beta^* = (2\omega)^{-1/2}\int_{t'}^{t''}\gamma(t)\exp(+i\omega t)\,dt, \tag{58}$$

and the sum on r is to go from 0 to m or to n, whichever
is the smaller. (The sum can be expressed as a Laguerre
polynomial but there is no advantage in this.)
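
As a consistency check between (56) and (57), the sketch below resums the double series (55) using the coefficients of (57) and compares it with the closed form (56), with $G_{00}$ divided out. The function names and sample values are illustrative; $\beta$ and $\beta^*$ are passed as independent numbers so that no assumption about $\gamma(t)$ is needed.

```python
import cmath
from math import comb, factorial, sqrt


def g_ratio(m, n, beta, beta_star):
    """G_mn / G_00 as given by Eq. (57)."""
    total = 0j
    for r in range(min(m, n) + 1):
        weight = comb(m, r) * comb(n, r) * factorial(r)
        total += weight * (1j * beta_star) ** (m - r) * (1j * beta) ** (n - r)
    return total / sqrt(factorial(m) * factorial(n))


def generating_sum(X, Y, beta, beta_star, cutoff=30):
    """Truncated double series of Eq. (55), in units of G_00."""
    total = 0j
    for m in range(cutoff):
        for n in range(cutoff):
            term = g_ratio(m, n, beta, beta_star) * X ** m * Y ** n
            total += term / sqrt(factorial(m) * factorial(n))
    return total


def generating_closed(X, Y, beta, beta_star):
    """Closed form of Eq. (56), in units of G_00."""
    return cmath.exp(X * Y + 1j * beta_star * X + 1j * beta * Y)


if __name__ == "__main__":
    b = 0.5 - 0.3j
    print(generating_sum(0.3 + 0.1j, -0.2 + 0.4j, b, b.conjugate()))
    print(generating_closed(0.3 + 0.1j, -0.2 + 0.4j, b, b.conjugate()))
```

The special cases `g_ratio(0, 1, ...)` and `g_ratio(1, 0, ...)` reduce to $i\beta$ and $i\beta^*$, the one-photon absorption and emission factors.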
Formula (57) is readily understandable. Consider
first a simple case of absorption of one photon. Initially
we have one photon and finally none. The amplitude
for this is the transition element of $G_{01}=i\beta G_{00}$ or
$\langle\chi_{t''}|i\beta G_{00}|\psi_{t'}\rangle$. This is the same as would result if we
asked for the transition element for a problem in
which all photons are virtual but there was present a
perturbing potential $-(2\omega)^{-1/2}\gamma(t)\exp(-i\omega t)$ and we
required the first-order effect of this potential. Hence
photon absorption is like the first order action of a
potential varying in time as $\gamma(t)\exp(-i\omega t)$, that is, with
a positive frequency (i.e., the sign of the coefficient of t
in the exponential corresponds to positive energy).
The amplitude for emission of one photon involves
$G_{10}=i\beta^*G_{00}$, which is the same result except that the
potential has negative frequency. Thus we begin by
interpreting $i\beta^*$ as the amplitude for emission of one
photon, $i\beta$ as the amplitude for absorption of one.
Next for the general case of n photons initially and
m finally we may understand (57) as follows. We first
neglect Bose statistics and imagine the photons as
individual distinct particles. If we start with n and end
with m this process may occur in several different ways.
The particle may absorb in total n−r of the photons
and the final m photons will represent r of the photons
which were present originally plus m−r new photons
emitted by the particle. In this case the n−r which are
to be absorbed may be chosen from among the original
n in $n!/(n-r)!\,r!$ different ways, and each contributes
a factor $i\beta$, the amplitude for absorption of a photon.
Which of the m−r photons from among the m are
emitted can be chosen in $m!/(m-r)!\,r!$ different ways
and each photon contributes a factor $i\beta^*$ in amplitude.
The initial r photons which do not interact with the
particle can be re-arranged among the final r in r! ways.
We must sum over the alternatives corresponding to
different values of r. Thus the form of $G_{mn}$ can be
understood. The remaining factor $(m!)^{-1/2}(n!)^{-1/2}$ may be
interpreted as saying that in computing probabilities
(which therefore involves the square of $G_{mn}$) the
photons may be considered as independent but that if
m are actually equal the statistical weight of each of
the states which can be made by rearranging the m
equal photons is only 1/m!. This is the content of Bose
statistics; that m equal particles in a given state
represents just one state, i.e., has statistical weight
unity, rather than the m! statistical weight which
would result if it is imagined that the particles and
states can be identified and rearranged in m! different
ways. This holds for both the initial and final states of
course. From this rule about the statistical weights of
states the derivation of the blackbody distribution
law follows.

The actual electromagnetic field is represented as a 
host of oscillators each of which behaves independently 
and produces its own factor such as Grn. Initial or final 
states may also be linear combinations of states in 
which one or another oscillator is excited. The results 
for this case are of course the corresponding linear
combination of transition elements. 

For photons of a given direction of polarization and
for sin or cos waves the explicit expressions for $\beta$ can be
obtained directly from (58) by substituting the formulas
(16) for the $\gamma$'s for the corresponding oscillator. It is
more convenient to use the linear combination corresponding
to running waves. Thus we find the amplitude
for absorption of a photon of momentum $\mathbf{K}$, frequency
$k=(\mathbf{K}\cdot\mathbf{K})^{1/2}$, polarized in direction $\mathbf{e}$ is given by including
a factor i times

$$\beta_{\mathbf{K},\mathbf{e}} = (4\pi)^{1/2}(2k)^{-1/2}\sum_n e_n\int\exp(-ikt)\exp(i\mathbf{K}\cdot\mathbf{x}_n(t))\,\mathbf{e}\cdot\dot{\mathbf{x}}_n(t)\,dt \tag{59}$$

in the transition element (25). The density of states in
momentum space is now $(2\pi)^{-3}d^3\mathbf{K}$. The amplitude
for emission is just i times the complex conjugate of
this expression, or what amounts to the same thing,
the same expression with the sign of the four vector $k_\mu$
reversed. Since the factor (59) is exactly the first-order
effect of a vector potential

$$\mathbf{A}^{PH} = (2\pi/k)^{1/2}\,\mathbf{e}\exp(-i(kt-\mathbf{K}\cdot\mathbf{x}))$$

of the corresponding classical wave, we have derived
the rules for handling real photons discussed in II.

We can express this directly in terms of the quantity
$T_{e^2}[B]$, the amplitude for a given transition without
emission of a photon. What we have said is that the
amplitude for absorption of just one photon whose classical
wave form is $A_\mu^{PH}(1)$ (time variation $\exp(-ikt_1)$
corresponding to positive energy k) is proportional to
the first order (in $\epsilon$) change produced in $T_{e^2}[B]$ on
changing B to $B+\epsilon A^{PH}$. That is, more exactly,

$$\int(\delta T_{e^2}[B]/\delta B_\mu(1))\,A_\mu^{PH}(1)\,d\tau_1 \tag{60}$$

is the amplitude for absorption by the particle system
of one photon, $A^{PH}$. (A superposition argument shows
the expression to be valid not only for plane waves,
but for spherical waves, etc., as given by the form of
$A^{PH}$.) The amplitude for emission is the same expression
but with the sign of the frequency reversed in $A^{PH}$.
The amplitude that the system absorbs two photons
with waves $A_\mu^{PH1}$ and $A_\nu^{PH2}$ is obtained from the next
derivative,

$$\int\!\!\int(\delta^2T_{e^2}[B]/\delta B_\mu(1)\,\delta B_\nu(2))\,A_\mu^{PH1}(1)\,A_\nu^{PH2}(2)\,d\tau_1\,d\tau_2,$$

the same expression holding for the absorption of one
and emission of the other, or emission of both, depending
on the sign of the time dependence of $A^{PH1}$ and $A^{PH2}$.
Larger photon numbers correspond to higher derivatives,
absorption of $l_1$, emission of $l_2$, requiring the
$(l_1+l_2)$th derivative. When two or more of the photons
are exactly the same (e.g., $A^{PH1}=A^{PH2}$) the same
expression holds for the amplitude that $l_1$ are absorbed
by the system while $l_2$ are emitted. However, the
statement that initially n of a kind are present and m
of this kind are present finally does not imply $l_1=n$
and $l_2=m$. It is possible that only $n-r=l_1$ were
absorbed by the system and $m-r=l_2$ emitted, and that
r remained from initial to final state without interaction.
This term is weighed by the combinatorial coefficient
$(m!\,n!)^{-1/2}\binom{m}{r}\binom{n}{r}\,r!$ and summed over the possibilities
for r as explained in connection with (57). Thus once
the amplitude for virtual processes is known, that for
real photon processes can be obtained by differentiation.
It is possible, of course, to deal with situations in
which the electromagnetic field is not in a definite state
after the interaction. For example, we might ask for
the total probability of a given process, such as a
scattering, without regard for the number of photons
emitted. This is done of course by squaring the amplitude
for the emission of m photons of a given kind and
summing on all m. Actually the sums and integrations
over the oscillator momenta can usually easily be
performed analytically. For example, the amplitude,
starting from vacuum and ending with m photons of a
given kind, is by (56) just

$$G_{m0} = (m!)^{-1/2}G_{00}(i\beta^*)^m. \tag{61}$$

The square of the amplitude summed on m requires
the product of two such expressions (the $\gamma(t)$ in the $\beta$
of one and in the other will have to be kept separately)
summed on m:

$$\sum_m G_{m0}^*G_{m0}' = \sum_m G_{00}^*G_{00}'(m!)^{-1}\beta^m(\beta'^*)^m = G_{00}^*G_{00}'\exp(\beta\beta'^*).$$

In the resulting expression the sum over all oscillators 
is easily done. Such expressions can be of use in the 
analysis in a direct manner of problems of line width, 
of the Bloch-Nordsieck infra-red problem, and of sta- 
tistical mechanical problems, but no such applications 
will be made here. 
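The summation on $m$ can be checked numerically. The following is a minimal sketch, not taken from the paper: it forms the ratio $\sum_m G_{m0}^*G_{m0}'/(G_{00}^*G_{00}')$ from (61) and compares it with $\exp(\beta\beta'^*)$. The function name, the sample values of $\beta$, $\beta'$ and the cutoff `m_max` are assumptions made for illustration.

```python
import cmath
import math

def summed_product(beta, beta_prime, m_max=60):
    """Sum over m of conj(G_m0) * G'_m0, divided by conj(G_00) * G'_00, using (61)."""
    total = 0j
    for m in range(m_max + 1):
        norm = math.sqrt(math.factorial(m))
        g = (1j * beta.conjugate()) ** m / norm              # G_m0 / G_00
        g_prime = (1j * beta_prime.conjugate()) ** m / norm  # G'_m0 / G'_00
        total += g.conjugate() * g_prime
    return total

if __name__ == "__main__":
    beta, beta_prime = 0.4 + 0.2j, -0.3 + 0.5j
    print(summed_product(beta, beta_prime))
    print(cmath.exp(beta * beta_prime.conjugate()))
```

The two printed numbers should agree to rounding error, since each term of the sum reduces to $(\beta\beta'^*)^m/m!$.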

The author appreciates his opportunities to discuss 
these matters with Professor H. A. Bethe and Professor
J. Ashkin, and the help of Mr. M. Baranger with the 
manuscript. 


APPENDIX A. THE KLEIN-GORDON EQUATION 


In this Appendix we describe a formulation of the equations
for a particle of spin zero which was first used to obtain the rules
given in II for such particles. The complete physical significance
of the equations has not been analyzed thoroughly so that it may
be preferable to derive the rules directly from the second quantization
formulation of Pauli and Weisskopf. This can be done in a
manner analogous to the derivation of the rules for the Dirac
equation given in I or from the Schwinger-Tomonaga formulation*
in a manner described, for example, by Rohrlich.* The formulation
given here is therefore not necessary for a description of spin
zero particles but is given only for its own interest as an alternative
to the formulation of second quantization.

We start with the Klein-Gordon equation 


$$(i\,\partial/\partial x_\mu - A_\mu)^2\,\psi = m^2\psi \tag{1A}$$


for the wave function $\psi$ of a particle of mass $m$ in a given external
potential $A_\mu$. We shall try to represent this in a manner analogous
to the formulation of quantum mechanics in C. That is, we try
to represent the amplitude for a particle to get from one point to
another as a sum over all trajectories of an amplitude $\exp(iS)$
where $S$ is the classical action for a given trajectory. To maintain
the relativistic invariance in evidence the idea suggests itself of
describing a trajectory in space-time by giving the four variables
$x_\mu(u)$ as functions of some fifth parameter $u$ (rather than expressing
$x_1$, $x_2$, $x_3$ in terms of $x_4$). As we expect to represent paths which
may reverse themselves in time (to represent pair production,
etc. as in I) this is certainly a more convenient representation,
for all four functions $x_\mu(u)$ may be considered as functions of a
parameter $u$ (somewhat analogous to proper time) which increases
as we go along the trajectory, whether the trajectory is proceeding
forward ($dx_4/du>0$) or backward ($dx_4/du<0$) in time.19 We shall
then have a new type of wave function $\varphi(x, u)$, a function of five
variables, $x$ standing for the four $x_\mu$. It gives the amplitude for
arrival at point $x_\mu$ with a certain value of the parameter $u$. We
shall suppose that this wave function satisfies the equation

$$i\,\partial\varphi/\partial u = -\tfrac{1}{2}\,(i\,\partial/\partial x_\mu - A_\mu)^2\,\varphi \tag{2A}$$

which is seen to be analogous to the time-dependent Schrödinger
equation, $u$ replacing the time and the four coordinates of space-time
$x_\mu$ replacing the usual three coordinates of space.


18 F. Rohrlich (to be published).

19 The physical ideas involved in such a description are discussed
in detail by Y. Nambu, Prog. Theor. Phys. 5, 82 (1950). An
equation of type (2A) extended to the case of Dirac electrons has
been studied by V. Fock, Physik Zeits. Sowjetunion 12, 404
(1937).

Since the potentials $A_\mu(x)$ are functions only of coordinates $x_\mu$
and are independent of $u$, the equation is separable in $u$ and we
can write a special solution in the form $\varphi=\exp(\tfrac{1}{2}im^2u)\,\psi(x)$ where
$\psi(x)$, a function of the coordinates $x_\mu$ only, satisfies (1A) and the
eigenvalue $\tfrac{1}{2}m^2$ conjugate to $u$ is related to the mass $m$ of the
particle. Equation (2A) is therefore equivalent to the Klein-Gordon
Eq. (1A) provided we ask in the end only for the solution
of (1A) corresponding to the eigenvalue $\tfrac{1}{2}m^2$ for the quantity
conjugate to $u$.
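The separation can be checked by direct substitution. The short worked step below is an added illustration, writing $\varphi$ for the five-variable wave function and using only (1A) and (2A):

```latex
\begin{align*}
\varphi(x,u) &= e^{\frac{1}{2} i m^2 u}\,\psi(x), \\
i\,\frac{\partial \varphi}{\partial u}
  &= i\cdot\tfrac{1}{2}\, i m^2\,\varphi = -\tfrac{1}{2} m^2\,\varphi, \\
-\tfrac{1}{2}\Bigl(i\,\frac{\partial}{\partial x_\mu} - A_\mu\Bigr)^{2}\varphi
  &= -\tfrac{1}{2}\,e^{\frac{1}{2} i m^2 u}
     \Bigl(i\,\frac{\partial}{\partial x_\mu} - A_\mu\Bigr)^{2}\psi
   = -\tfrac{1}{2} m^2\,\varphi \quad \text{by (1A)}.
\end{align*}
```

The two sides of (2A) therefore agree exactly when $\psi$ satisfies (1A), the $u$-dependence entering only through the phase factor.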

We may now proceed to represent Eq. (2A) in Lagrangian form
in general and without regard to this eigenvalue condition. Only
in the final solutions need we apply the eigenvalue condition.
That is, if we have some special solution $\varphi(x, u)$ of (2A) we can
select that part corresponding to the eigenvalue $\tfrac{1}{2}m^2$ by calculating

$$\psi(x)=\int_{-\infty}^{\infty}\exp(-\tfrac{1}{2}im^2u)\,\varphi(x, u)\,du$$

and thereby obtain a solution $\psi$ of Eq. (1A).

Since (2A) is so closely analogous to the Schrödinger equation,
it is easily written in the Lagrangian form described in C, simply
by working by analogy. For example if $\varphi(x, u)$ is known at one
value of $u$ its value at a slightly larger value $u+\epsilon$ is given by

$$\varphi(x, u+\epsilon)=\int \exp\Bigl\{-\frac{i\epsilon}{2}\Bigl[\frac{(x_\mu-x_\mu')^2}{\epsilon^2}+\frac{(x_\mu-x_\mu')}{\epsilon}\bigl(A_\mu(x)+A_\mu(x')\bigr)\Bigr]\Bigr\}\,\varphi(x', u)\,d^4\tau_{x'}\,(2\pi i\epsilon)^{-3/2}(-2\pi i\epsilon)^{-1/2} \tag{3A}$$

where $(x_\mu-x_\mu')^2$ means $(x_\mu-x_\mu')(x_\mu-x_\mu')$, $d^4\tau_{x'}=dx_1'\,dx_2'\,dx_3'\,dx_4'$,
and the sign of the normalizing factor is changed for the $x_4$
component since the component has the reversed sign in its
quadratic coefficient in the exponential, in accordance with our
summation convention $a_\mu b_\mu=a_4b_4-a_1b_1-a_2b_2-a_3b_3$. Equation
(3A), as can be verified readily as described in C, Sec. 6, is equivalent
to first order in $\epsilon$, to Eq. (2A). Hence, by repeated use of this
equation the wave function at $u_0=n\epsilon$ can be represented in terms
of that at $u=0$ by:


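$$\varphi(x_{\mu,n}, u_0)=\int \exp\Bigl\{-\frac{i\epsilon}{2}\sum_{i=1}^{n}\Bigl[\Bigl(\frac{x_{\mu,i}-x_{\mu,i-1}}{\epsilon}\Bigr)^{2}+\epsilon^{-1}(x_{\mu,i}-x_{\mu,i-1})\bigl(A_\mu(x_i)+A_\mu(x_{i-1})\bigr)\Bigr]\Bigr\}\,\varphi(x_{\mu,0}, 0)\prod_{i=0}^{n-1}\bigl(d^4\tau_i/4\pi^2\epsilon^2 i\bigr). \tag{4A}$$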


That is, roughly, the amplitude for getting from one point to
another with a given value of $u_0$ is the sum over all trajectories
of $\exp(iS)$ where

$$S=-\int_0^{u_0}\bigl[\tfrac{1}{2}(dx_\mu/du)^2+(dx_\mu/du)A_\mu(x)\bigr]\,du, \tag{5A}$$

when sufficient care is taken to define the quantities, as in C.
This completes the formulation for particles in a fixed potential
but a few words of description may be in order.
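The normalizing factors of (3A) and (4A) can be compared numerically. The sketch below is an added check, not part of the paper: it assumes the per-component factors are $(2\pi i\epsilon)^{-1/2}$ for each space component and $(-2\pi i\epsilon)^{-1/2}$ for $x_4$, taken on the principal branch, and compares their product with $1/(4\pi^2\epsilon^2 i)$. The function names are illustrative.

```python
import math

def product_of_component_factors(eps):
    """Three space-like factors times one time-like factor with reversed sign."""
    space = (2j * math.pi * eps) ** -1.5   # (2 pi i eps)^(-3/2), principal branch
    time = (-2j * math.pi * eps) ** -0.5   # (-2 pi i eps)^(-1/2), principal branch
    return space * time

def combined_factor(eps):
    """The single factor 1 / (4 pi^2 eps^2 i) appearing per step in (4A)."""
    return 1.0 / (4 * math.pi ** 2 * eps ** 2 * 1j)

if __name__ == "__main__":
    for eps in (0.01, 0.5, 2.0):
        print(eps, product_of_component_factors(eps), combined_factor(eps))
```

On the principal branch the phases $e^{-3i\pi/4}$ and $e^{+i\pi/4}$ combine to $e^{-i\pi/2}=1/i$, which is the origin of the single $i$ in the denominator.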

In the first place in the special case of a free particle we can
define a kernel $k^{(0)}(x, u_0; x', 0)$ for arrival from $x_\mu'$, 0 to $x_\mu$ at $u_0$
as the sum over all trajectories between these points of
$\exp-i\int_0^{u_0}\tfrac{1}{2}(dx_\mu/du)^2\,du$. Then for this case we have

$$\varphi(x, u_0)=\int k^{(0)}(x, u_0; x', 0)\,\varphi(x', 0)\,d^4\tau_{x'} \tag{6A}$$

and it is easily verified that $k^{(0)}$ is given by

$$k^{(0)}(x, u_0; x', 0)=(4\pi^2u_0^2\, i)^{-1}\exp-i(x_\mu-x_\mu')^2/2u_0 \tag{7A}$$

for $u_0>0$ and by 0, by definition, for $u_0<0$. The corresponding
kernel of importance when we select the eigenvalue $\tfrac{1}{2}m^2$ is20

$$2iI_+(x, x')=\int_{-\infty}^{\infty}k^{(0)}(x, u_0; x', 0)\exp(-\tfrac{1}{2}im^2u_0)\,du_0=\int_0^{\infty}du_0\,(4\pi^2u_0^2\, i)^{-1}\exp-\tfrac{1}{2}i\bigl(m^2u_0+u_0^{-1}(x_\mu-x_\mu')^2\bigr) \tag{8A}$$

(the last extends only from $u_0=0$ since $k^{(0)}$ is zero for negative $u_0$)
which is identical to the $I_+$ defined in II.21 This may be seen
readily by studying the Fourier transform, for the transform of
the integrand on the right-hand side is

$$\int(4\pi^2u_0^2\, i)^{-1}\exp(ip\cdot x)\exp-\tfrac{1}{2}i(m^2u_0+x_\mu^2/u_0)\,d^4\tau_x=\exp-\tfrac{1}{2}iu_0(m^2-p_\mu^2)$$

so that the $u_0$ integration gives for the transform of $I_+$ just
$1/(p_\mu^2-m^2)$ with the pole defined exactly as in II. Thus we are
automatically representing the positrons as trajectories with the
time sense reversed.
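The final $u_0$ integration can be illustrated numerically. This is a rough sketch under stated assumptions rather than anything from the paper: $m^2$ is given a negative imaginary part (the parameter `damping`, chosen here to be large so that a crude trapezoid rule converges), and the integral of $\exp-\tfrac{1}{2}iu_0(m^2-p_\mu^2)$ is compared with $2i/(p_\mu^2-m^2+i\,\mathrm{damping})$, that is, $2i$ times the transform of $I_+$. All names and numerical settings are illustrative.

```python
import cmath

def u0_integral(p_squared, mass, damping=1.0, u_max=80.0, steps=80_000):
    """Trapezoid estimate of the integral of exp(-(i/2) u0 (m^2 - p^2)) over 0..u_max.

    `damping` is an assumed negative imaginary part added to m^2 so that the
    integrand decays; the upper limit u_max stands in for infinity.
    """
    a = mass ** 2 - p_squared - 1j * damping
    h = u_max / steps
    total = 0.5 * (1.0 + cmath.exp(-0.5j * a * u_max))
    for k in range(1, steps):
        total += cmath.exp(-0.5j * a * k * h)
    return total * h

def closed_form(p_squared, mass, damping=1.0):
    """2i / (p^2 - m^2 + i*damping): 2i times the transform of I_+."""
    return 2j / (p_squared - mass ** 2 + 1j * damping)

if __name__ == "__main__":
    print(u0_integral(4.0, 1.0), closed_form(4.0, 1.0))
```

The sign of the imaginary part is what fixes the treatment of the pole; letting `damping` tend to zero corresponds to the prescription referred to in the text.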

If $\Phi^{(0)}[x(u)]=\exp-\tfrac{1}{2}i\int_0^{u_0}(dx_\mu/du)^2\,du$ is the amplitude for a
given trajectory $x_\mu(u)$ for a free particle, then the amplitude in
a potential is

$$\Phi^{(A)}[x(u)]=\Phi^{(0)}[x(u)]\exp-i\int_0^{u_0}(dx_\mu/du)A_\mu(x)\,du. \tag{9A}$$

If desired this may be studied by perturbation methods by
expanding the exponential in powers of $A_\mu$.

For interpretation, the integral in (9A) must be written as a
Riemann sum, and if a perturbation expansion is made, care must
be taken with the terms quadratic in the velocity, for the effect
of $(x_{\mu,i+1}-x_{\mu,i})(x_{\nu,i+1}-x_{\nu,i})$ is not of order $\epsilon^2$ but is $-i\delta_{\mu\nu}\epsilon$. The
"velocity" $dx_\mu/du$ becomes the momentum operator $p_\mu=+i\,\partial/\partial x_\mu$
operating half before and half after $A_\mu$, just as in the non-relativistic
Schrödinger equation discussed in Sec. 5. Furthermore, in
exactly the same manner as in that case, but here in four dimensions,
a term quadratic in $A_\mu$ arises in the second-order perturbation
terms from the coincidence of two velocities for the same
value of $u$.
As an example, the kernel $k^{(A)}(x, u_0; x', 0)$ for proceeding from
$x_\mu'$, 0 to $x_\mu$, $u_0$ in a potential $A_\mu$ differs from $k^{(0)}$ to first order in
$A_\mu$ by a term

$$-i\int k^{(0)}(x, u_0; y, u)\,\tfrac{1}{2}\bigl(p_\mu A_\mu(y)+A_\mu(y)p_\mu\bigr)\,k^{(0)}(y, u; x', 0)\,d^4\tau_y\,du$$

the $p_\mu$ here meaning $+i\,\partial/\partial y_\mu$. The kernel of importance on
selecting the eigenvalue $\tfrac{1}{2}m^2$ is obtained by multiplying this by
$\exp(-\tfrac{1}{2}im^2u_0)$ and integrating $u_0$ from 0 to $\infty$. The kernel
$k^{(0)}(x, u_0; y, u)$ depends only on $u'=u_0-u$, and the integrals on
$u$ and $u_0$, $\int_0^\infty du_0\int_0^{u_0}du\,\exp(-\tfrac{1}{2}im^2u_0)\cdots$, can be written, on interchanging
the order of integration and changing variables to $u$
and $u'$, $\int_0^\infty du\int_0^\infty du'\,\exp(-\tfrac{1}{2}im^2(u+u'))\cdots$. Now the integral on
$u'$ converts $k^{(0)}(x, u_0; y, u)$ to $2iI_+(x, y)$ by (8A), while that on $u$
converts $k^{(0)}(y, u; x', 0)$ to $2iI_+(y, x')$, so the result becomes

$$\int 2iI_+(x, y)\,(p_\mu A_\mu+A_\mu p_\mu)\,I_+(y, x')\,d^4\tau_y$$

as expected. The same principle works to any order so that the
rules for a single Klein-Gordon particle in external potentials
given in II, Section 9, are deduced.
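The interchange of integrations used in this argument can be written out explicitly. The derivation below is an added illustration; $f(u')$ and $g(u)$ are shorthand introduced here for the two free kernels regarded as functions of $u'=u_0-u$ and of $u$ alone.

```latex
\begin{align*}
\int_0^\infty du_0 \int_0^{u_0} du\; e^{-\frac{1}{2} i m^2 u_0}\, f(u_0-u)\, g(u)
 &= \int_0^\infty du \int_u^\infty du_0\; e^{-\frac{1}{2} i m^2 u_0}\, f(u_0-u)\, g(u) \\
 &= \int_0^\infty du \int_0^\infty du'\; e^{-\frac{1}{2} i m^2 (u+u')}\, f(u')\, g(u) \\
 &= \Bigl[\int_0^\infty e^{-\frac{1}{2} i m^2 u'} f(u')\,du'\Bigr]
    \Bigl[\int_0^\infty e^{-\frac{1}{2} i m^2 u} g(u)\,du\Bigr].
\end{align*}
```

Each bracket has the form of the integral in (8A), which is how the two factors $2iI_+$ arise.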

20 The factor $2i$ in front of $I_+$ is simply to make the definition
of $I_+$ here agree with that in I and II. In II it operates with
$p\cdot A+A\cdot p$ as a perturbation. But the perturbation coming from
(3A) in a natural way by expansion of the exponential is
$-\tfrac{1}{2}i(p\cdot A+A\cdot p)$.

21 Expression (8A) is closely related to Schwinger's parametric
integral representation of these functions. For example, (8A)
becomes formula (45) of F. Dyson, Phys. Rev. 75, 486 (1949) for
$\Delta_F=\Delta^{(1)}-2i\bar\Delta=2iI_+$ if $(2\alpha)^{-1}$ is substituted for $u_0$.


The transition to quantum electrodynamics is simple for in
(5A) we already have a transition amplitude represented as a
sum (over trajectories, and eventually $u_0$) of terms, in each of
which the potential appears in exponential form. We may make
use of the general relation (54). Hence, for example, one finds
for the case of no photons in the initial and final states, in the
presence of an external potential $B_\mu$, the amplitude that a particle
proceeds from $(x_\mu', 0)$ to $(x_\mu, u_0)$ is the sum over all trajectories
of the quantity

$$\exp-i\Bigl[\tfrac{1}{2}\int_0^{u_0}\Bigl(\frac{dx_\mu}{du}\Bigr)^{2}du+\int_0^{u_0}\frac{dx_\mu}{du}B_\mu(x(u))\,du+\frac{e^2}{2}\int_0^{u_0}\!\!\int_0^{u_0}\frac{dx_\nu(u)}{du}\frac{dx_\nu(u')}{du'}\,\delta_+\bigl((x_\mu(u)-x_\mu(u'))^2\bigr)\,du\,du'\Bigr]. \tag{10A}$$


This result must be multiplied by $\exp(-\tfrac{1}{2}im^2u_0)$ and integrated
on $u_0$ from zero to infinity to express the action of a Klein-Gordon
particle acting on itself through virtual photons. The integrals
are interpreted as Riemann sums, and if perturbation expansions
are made, the necessary care is taken with the terms quadratic
in velocity. When there are several particles (other than the
virtual pairs already included) one uses a separate $u$ for each,
and writes the amplitude for each set of trajectories as the exponential
of $-i$ times

$$\tfrac{1}{2}\sum_n\int_0^{u_0^{(n)}}\Bigl(\frac{dx_\mu^{(n)}}{du}\Bigr)^{2}du+\sum_n\int_0^{u_0^{(n)}}\frac{dx_\mu^{(n)}}{du}B_\mu(x^{(n)}(u))\,du+\frac{e^2}{2}\sum_n\sum_m\int_0^{u_0^{(n)}}\!\!\int_0^{u_0^{(m)}}\frac{dx_\nu^{(n)}(u)}{du}\frac{dx_\nu^{(m)}(u')}{du'}\,\delta_+\bigl((x_\mu^{(n)}(u)-x_\mu^{(m)}(u'))^2\bigr)\,du\,du', \tag{11A}$$

where $x_\mu^{(n)}(u)$ are the coordinates of the trajectory of the $n$th
particle. The solution should depend on the $u_0^{(n)}$ as
$\exp(-\tfrac{1}{2}im^2\sum_n u_0^{(n)})$.

Actually, knowledge of the motion of a single charge implies a
great deal about the behavior of several charges. For a pair
which eventually may turn out to be a virtual pair may appear
in the short run as two "other particles." As a virtual pair, that
is, as the reverse section of a very long and complicated single
track we know its behavior by (10A). We can assume that such a
section can be looked at equally well, for a limited duration at
least, as being due to other unconnected particles. This then
implies a definite law of interaction of particles if the self-action
(10A) of a single particle is known. (This is similar to the relation
of real and virtual photon processes discussed in detail in Appendix
B.) It is possible that a detailed analysis of this could show that
(10A) implied that (11A) was correct for many particles. There
is even reason to believe that the law of Bose-Einstein statistics
and the expression for contributions from closed loops could be
deduced by following this argument. This has not yet been
analyzed completely, however, so we must leave this formulation
in an incomplete form. The expression for closed loops should
come out to be $C_v=\exp+L$ where $L$, the contribution from a
single loop, is

$$L=2\int_0^{\infty}l(u_0)\exp(-\tfrac{1}{2}im^2u_0)\,du_0/u_0$$

where $l(u_0)$ is the sum over all trajectories which close on themselves
($x_\mu(u_0)=x_\mu(0)$) of $\exp(iS)$ with $S$ given in (5A), and a
final integration $d\tau_{x(0)}$ on $x_\mu(0)$ is made. This is equivalent to
putting

$$l(u_0)=\int\bigl(k^{(A)}(x, u_0; x, 0)-k^{(0)}(x, u_0; x, 0)\bigr)\,d^4\tau_x.$$

The term $k^{(0)}$ is subtracted only to simplify convergence problems
(as adding a constant independent of $A_\mu$ to $L$ has no effect).


22 The form (10A) suggests another interesting possibility for
avoiding the divergences of quantum electrodynamics in this
case. The divergences arise from the $\delta_+$ function when $u=u'$.
We might restrict the integration in the double integral such that
$|u-u'|>\delta$ where $\delta$ is some finite quantity, very small compared
with $m^{-2}$. More generally, we could keep the region $u=u'$ from
contributing by including in the integrand a factor $F(u-u')$
where $F(x)\to 1$ for $x$ large compared to some $\delta$, and $F(0)=0$ (e.g.,
$F(x)$ acts qualitatively like $1-\exp(-x^2\delta^{-2})$). (Another way might
be to replace $u$ by a discontinuous variable, that is, we do not
use the limit in (4A) as $\epsilon\to 0$ but set $\epsilon=\delta$.) The idea is that two
interactions would contribute very little in amplitude if they
followed one another too rapidly in $u$. It is easily verified that
this makes the otherwise divergent integrals finite. But whether
the resulting formulas make good physical sense is hard to see.
The action of a potential would now depend on the value of $u$ so
that Eq. (2A), or its equivalent, would not be separable in $u$ so
that $\tfrac{1}{2}m^2$ would no longer be a strict eigenvalue for all disturbances.
High energy potentials could excite states corresponding to other
eigenvalues, possibly thereby corresponding to other masses. This
note is meant only as a speculation, for not enough work has
been done in this direction to make sure that a reasonably physical
theory can be developed along these lines. (What little work has
been done was not promising.) Analogous modifications can also
be made for Dirac electrons.


APPENDIX B. THE RELATION OF REAL AND 
VIRTUAL PROCESSES 


If one has a general formula for all virtual processes he should
be able to find the formulas and states involved in real processes.
That is to say, we should be able to deduce the formulas of Section
9 directly from the formulation (24), (25) (or its generalized
equivalent such as (46), (48)) without having to go all the way
back to the more usual formulation. We discuss this problem here.

That this possibility exists can be seen from the consideration
that what looks like a real process from one point of view may
appear as a virtual process occurring over a more extended time.

For example, if we wish to study a given real process, such as
the scattering of light, we can, if we wish, include in principle the
source, scatterer, and eventual absorber of the scattered light in
our analysis. We may imagine that no photon is present initially,
and that the source then emits light (the energy coming say from
kinetic energy in the source). The light is then scattered and
eventually absorbed (becoming kinetic energy in the absorber).
From this point of view the process is virtual; that is, we start
with no photons and end with none. Thus we can analyze the
process by means of our formula for virtual processes, and obtain
the formulas for real processes by attempting to break the analysis
into parts corresponding to emission, scattering, and absorption.23

To put the problem in a more general way, consider the amplitude
for some transition from a state empty of photons far in the
past (time $t'$) to a similar one far in the future ($t=t''$). Suppose
the time interval to be split into three regions $a$, $b$, $c$ in some
convenient manner, so that region $b$ is an interval $t_2>t>t_1$ around
the present time that we wish to study. Region $a$, ($t_1>t>t'$),
precedes $b$, and $c$, ($t''>t>t_2$), follows $b$. We want to see how it
comes about that the phenomena during $b$ can be analyzed by a
study of transitions $g_{ji}(b)$ between some initial state $i$ at time $t_1$
(which no longer need be photon-free), to some other final state $j$
at time $t_2$. The states $i$ and $j$ are members of a large class which
we will have to find out how to specify. (The single index $i$ is
used to represent a large number of quantum numbers, so that
different values of $i$ will correspond to having various numbers of
various kinds of photons in the field, etc.) Our problem is to
represent the over-all transition amplitude, $g(a, b, c)$, as a sum
over various values of $i$, $j$ of a product of three amplitudes,


$$g(a, b, c)=\sum_i\sum_j g_{0j}(c)\,g_{ji}(b)\,g_{i0}(a); \tag{1B}$$


first the amplitude that during the interval $a$ the vacuum state
makes transition to some state $i$, then the amplitude that during
$b$ the transition to $j$ is made, and finally in $c$ the amplitude that
the transition from $j$ to some photon-free state 0 is completed.

The mathematical problem of splitting $g(a, b, c)$ is made definite
by the further condition that $g_{ji}(b)$ for given $i$, $j$ must not involve
the coordinates of the particles for times corresponding to regions
$a$ or $c$, $g_{i0}(a)$ must involve those only in region $a$, and $g_{0j}(c)$ only
in $c$.


23 The formulas for real processes deduced in this way are
strictly limited to the case in which the light comes from sources
which are originally dark, and that eventually all light emitted is
absorbed again. We can only extend it to the case for which these
restrictions do not hold by hypothesis, namely, that the details
of the scattering process are independent of these characteristics
of the light source and of the eventual disposition of the scattered
light. The argument of the text gives a method for discovering
formulas for real processes when no more than the formula for
virtual processes is at hand. But with this method belief in the
general validity of the resulting formulas must rest on the physical
reasonableness of the above-mentioned hypothesis.

To become acquainted with what is involved, suppose first that
we do not have a problem involving virtual photons, but just the
transition of a one-dimensional Schrödinger particle going in a
long time interval from, say, the origin $o$ to the origin $o$, and ask
what states $i$ we shall need for intermediary time intervals. We
must solve the problem (1B) where $g(a, b, c)$ is the sum over all
trajectories going from $o$ at $t'$ to $o$ at $t''$ of $\exp iS$ where $S=\int L\,dt$.
The integral may be split into three parts $S=S_a+S_b+S_c$ corresponding
to the three ranges of time. Then $\exp(iS)=\exp(iS_c)\cdot\exp(iS_b)\cdot\exp(iS_a)$
and the separation (1B) is accomplished by
taking for $g_{i0}(a)$ the sum over all trajectories lying in $a$ from $o$ to
some end point $x_{t_1}$ of $\exp(iS_a)$, for $g_{ji}(b)$ the sum over trajectories
in $b$ of $\exp(iS_b)$ between end points $x_{t_1}$ and $x_{t_2}$, and for $g_{0j}(c)$ the
sum of $\exp(iS_c)$ over the section of the trajectory lying in $c$ and
going from $x_{t_2}$ to $o$. Then the sum on $i$ and $j$ can be taken to be
the integrals on $x_{t_1}$, $x_{t_2}$ respectively. Hence the various states $i$
can be taken to correspond to particles being at various coordinates
$x$. (Of course any other representation of the states in the
sense of Dirac's transformation theory could be used equally well.
Which one, whether coordinate, momentum, or energy level
representation, is of course just a matter of convenience and we
cannot determine that simply from (1B).)
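The splitting (1B) for a particle without photons can be illustrated on a toy lattice. The sketch below is an added example under loud assumptions: positions are restricted to a few lattice sites, time to six discrete steps (two in each of the regions $a$, $b$, $c$), and the one-step action is an arbitrary made-up function. It only demonstrates that a sum of $\exp(iS)$ over all paths factors into sums over the intermediate positions at $t_1$ and $t_2$ when $S$ is additive.

```python
import cmath
import itertools

N_STATES = 3   # hypothetical number of lattice positions
N_STEPS = 6    # two time steps in each of the regions a, b, c

def step_amplitude(x_new, x_old):
    """Toy one-step factor exp(i * S_step); this action is an arbitrary choice."""
    return cmath.exp(1j * (0.5 * (x_new - x_old) ** 2 + 0.3 * x_new))

def path_sum(start, end, n_steps):
    """Sum of exp(iS) over all lattice paths from start to end in n_steps steps."""
    total = 0j
    for middle in itertools.product(range(N_STATES), repeat=n_steps - 1):
        path = (start, *middle, end)
        amplitude = 1 + 0j
        for x_old, x_new in zip(path, path[1:]):
            amplitude *= step_amplitude(x_new, x_old)
        total += amplitude
    return total

def factored_sum():
    """Right-hand side of (1B): sum over states i (at t_1) and j (at t_2)."""
    total = 0j
    for i in range(N_STATES):
        for j in range(N_STATES):
            g_c = path_sum(j, 0, 2)   # region c: from j back to the origin
            g_b = path_sum(i, j, 2)   # region b: from i to j
            g_a = path_sum(0, i, 2)   # region a: from the origin to i
            total += g_c * g_b * g_a
    return total

if __name__ == "__main__":
    print(path_sum(0, 0, N_STEPS), factored_sum())
```

Here the states $i$, $j$ are simply lattice positions; as the text notes, any other representation would serve equally well.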

We can consider next the problem including virtual photons.
That is, $g(a, b, c)$ now contains an additional factor $\exp(iR)$
where $R$ involves a double integral $\int\int$ over all time. Those
parts of the index $i$ which correspond to the particle states can
be taken in the same way as though $R$ were absent. We study
now the extra complexities in the states produced by splitting
the $R$. Let us first (solely for simplicity of the argument) take
the case that there are only two regions $a$, $c$ separated by time
$t_0$ and try to expand


$$g(a, c)=\sum_i g_{0i}(c)\,g_{i0}(a).$$


The factor $\exp(iR)$ involves $R$ as a double integral which can be
split into three parts $\int_a\int_a+\int_c\int_c+\int_a\int_c$ for the first of
which both $t$, $s$ are in $a$, for the second both are in $c$, for the third
one is in $a$ the other in $c$. Writing $\exp(iR)$ as $\exp(iR_{cc})\exp(iR_{aa})\cdot\exp(iR_{ac})$
shows that the factors $R_{cc}$ and $R_{aa}$ produce no new
problems for they can be taken bodily into $g_{0i}(c)$ and $g_{i0}(a)$
respectively. However, we must disentangle the variables which
are mixed up in $\exp(iR_{ac})$.

The expression for $R_{ac}$ is just twice (24) but with the integral
on $s$ extending over the range $a$ and that for $t$ extending over $c$.
Thus $\exp(iR_{ac})$ contains the variables for times in $a$ and in $c$ in
a quite complicated mixture. Our problem is to write $\exp(iR_{ac})$
as a sum over possibly a vast class of states $i$ of the product of
two parts, like $h_i'(c)h_i(a)$, each of which involves the coordinates
in one interval alone.

This separation may be made in many different ways, corresponding
to various possible representations of the state of the
electromagnetic field. We choose a particular one. First we can
expand the exponential, $\exp(iR_{ac})$, in a power series, as
$\sum_n i^n(n!)^{-1}(R_{ac})^n$. The states $i$ can therefore be subdivided into
subclasses corresponding to an integer $n$ which we can interpret
as the number of quanta in the field at time $t_0$. The amplitude
for the case $n=0$ clearly just involves $\exp(iR_{aa})$ and $\exp(iR_{cc})$ in
the way that it should if we interpret these as the amplitudes for
regions $a$ and $c$, respectively, of making a transition between a
state of zero photons and another state of zero photons.

Next consider the case $n=1$. This implies an additional factor
in the transitional element; the factor $R_{ac}$. The variables are still
mixed up. But an easy way to perform the separation suggests
itself. Namely, expand the $\delta_+((t-s)^2-(\mathbf{x}_n(t)-\mathbf{x}_m(s))^2)$ in $R_{ac}$ as
a Fourier integral as


$$i\int\exp(-ik|t-s|)\exp(-i\mathbf{K}\cdot(\mathbf{x}_n(t)-\mathbf{x}_m(s)))\,d^3\mathbf{K}/4\pi^2k.$$

For the exponential can be written immediately as a product of
$\exp+i(\mathbf{K}\cdot\mathbf{x}_m(s))$, a function only of coordinates for times $s$ in $a$
(suppose $s<t$), and $\exp-i\mathbf{K}\cdot\mathbf{x}_n(t)$ (a function only of coordinates
during interval $c$). The integral on $d^3\mathbf{K}$ can be symbolized as a
sum over states $i$ characterized by the value of $\mathbf{K}$. Thus the
state with $n=1$ must be further characterized by specifying a
vector $\mathbf{K}$, interpreted as the momentum of the photon. Finally
the factor $(1-\dot{\mathbf{x}}_n(t)\cdot\dot{\mathbf{x}}_m(s))$ in $R_{ac}$ is simply the sum of four parts
each of which is already split (namely 1, and each of the three
components in the vector scalar product). Hence each photon of
momentum $\mathbf{K}$ must still be characterized by specifying it as one
of four varieties; that is, there are four polarizations.24 Thus in
trying to represent the effect of the past $a$ on the future $c$ we are
led to invent photons of four polarizations and characterized by
a propagation vector $\mathbf{K}$.

The term for a given polarization and value of $\mathbf{K}$ (for $n=1$)
is clearly just $-\beta_c\beta_a^*$ where the $\beta_a$ is defined in (59) but with the
time integral extending just over region $a$, while $\beta_c$ is the same
expression with the integration over region $c$. Hence the amplitude
for transition during interval $a$ from a state with no quanta to a
state with one in a given state of polarization and momentum is
calculated by inclusion of an extra factor $i\beta_a^*$ in the transition
element. Absorption in region $c$ corresponds to a factor $i\beta_c$.

We next turn to the case $n=2$. This requires analysis of $R_{ac}^2$.
The $\delta_+$ can be expanded again as a Fourier integral, but for each
of the two $\delta_+$ in $\tfrac{1}{2}R_{ac}^2$ we have a value of $\mathbf{K}$ which may be different.
Thus we say, we have two photons, one of momentum $\mathbf{K}$ and one
momentum $\mathbf{K}'$ and we sum over all values of $\mathbf{K}$ and $\mathbf{K}'$. (Similarly
each photon is characterized by its own independent polarization
index.) The factor $\tfrac{1}{2}$ can be taken into account neatly by asserting
that we count each possible pair of photons as constituting just
one state at time $t_0$. Then the $\tfrac{1}{2}$ arises for the sum over all $\mathbf{K}$, $\mathbf{K}'$
(and polarizations) counts each pair twice. On the other hand, for
the terms representing two identical photons ($\mathbf{K}=\mathbf{K}'$) of like
polarization, the $\tfrac{1}{2}$ cannot be so interpreted. Instead we invent
the rule that a state of two like photons has statistical weight $\tfrac{1}{2}$
as great as that calculated as though the photons were different.
This, generalized to $n$ identical photons, is the rule of Bose
statistics.
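The counting argument for $n=2$ amounts to a regrouping of a squared sum. The sketch below is an added illustration: the list `terms` stands for the one-photon contributions $-\beta_c\beta_a^*$ for a finite, made-up set of photon kinds (momentum and polarization), and the two functions compare $\tfrac{1}{2}(\sum_K e_K)^2$ with a sum over unordered pairs in which identical pairs carry weight $\tfrac{1}{2}$.

```python
def half_square(terms):
    """(1/2) * (sum over kinds of e_K)^2: the n = 2 term of the power series."""
    return 0.5 * sum(terms) ** 2

def pair_states(terms):
    """The same quantity regrouped as a sum over unordered pairs of photon kinds."""
    total = 0j
    count = len(terms)
    for k in range(count):
        # Two identical photons: statistical weight 1/2
        total += 0.5 * terms[k] * terms[k]
        for k_other in range(k + 1, count):
            # A pair of different photons counted as a single state
            total += terms[k] * terms[k_other]
    return total

if __name__ == "__main__":
    sample = [0.3 - 0.1j, -0.2 + 0.4j, 0.05 + 0.25j]   # hypothetical e_K values
    print(half_square(sample), pair_states(sample))
```

The agreement is purely algebraic; the physical content lies in reading the regrouped terms as amplitudes for two-photon states.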

The higher values of $n$ offer no problem. The $1/n!$ is interpreted
combinatorially for different photons, and as a statistical factor
when some are identical. For example, for all $n$ identical one
obtains a factor $(n!)^{-1}(-\beta_c\beta_a^*)^n$ so that $(n!)^{-1/2}(i\beta_a^*)^n$ can be
interpreted as the amplitude for emission (from no initial photons)
of $n$ identical photons, in complete agreement with (61) for $G_{m0}$.

To obtain the amplitude for transitions in which neither the
initial nor the final state is empty of photons we must consider
the more general case of the division into three time regions (1B).
This time we see that the factor which involves the coordinates
in an entangled manner is $\exp i(R_{ab}+R_{bc}+R_{ac})$. It is to be
expanded in the form $\sum_i\sum_j h_j''(c)\,h_{ji}'(b)\,h_i(a)$. Again the expansion
in power series and development in Fourier series with a
polarization sum will solve the problem. Thus the exponential is
$\sum_r\sum_{l_1}\sum_{l_2}(iR_{ac})^r(iR_{ab})^{l_1}(iR_{bc})^{l_2}(l_1!)^{-1}(l_2!)^{-1}(r!)^{-1}$. Now the $R$ are
written as Fourier series, one of the terms containing $l_1+l_2+r$
variables $\mathbf{K}$. Since $l_1+r$ involve $a$, $l_2+r$ involve $c$ and $l_1+l_2$
involve $b$, this term will give the amplitude that $l_1+r$ photons
are emitted during the interval $a$, of those $l_1$ are absorbed during
$b$ but the remaining $r$, along with $l_2$ new ones emitted during $b$ go
on to be absorbed during the interval $c$. We have therefore
$n=l_1+r$ photons in the state at time $t_1$ when $b$ begins, and $m=l_2+r$
at $t_2$ when $b$ is over. They each are characterized by momentum
vectors and polarizations. When these are different the factors
$(l_1!)^{-1}(l_2!)^{-1}(r!)^{-1}$ are absorbed combinatorially. When some are
equal we must invoke the rule of the statistical weights. For
example, suppose all $l_1+l_2+r$ photons are identical. Then
$R_{ab}=i\beta_b\beta_a^*$, $R_{bc}=i\beta_c\beta_b^*$, $R_{ac}=i\beta_c\beta_a^*$ so that our sum is

$$\sum_{l_1}\sum_{l_2}\sum_r(l_1!\,l_2!\,r!)^{-1}(i\beta_c)^{l_2+r}(i\beta_b^*)^{l_2}(i\beta_b)^{l_1}(i\beta_a^*)^{l_1+r}.$$

Putting $m=l_2+r$, $n=l_1+r$, this is the sum on $n$ and $m$ of

$$(i\beta_c)^m(m!)^{-1/2}\Bigl[\sum_r(m!\,n!)^{1/2}\bigl((m-r)!\,(n-r)!\,r!\bigr)^{-1}(i\beta_b^*)^{m-r}(i\beta_b)^{n-r}\Bigr](n!)^{-1/2}(i\beta_a^*)^n.$$

The last factor we have seen is the amplitude for emission of $n$
photons during interval $a$, while the first factor is the amplitude
for absorption of $m$ during $c$. The sum is therefore the factor for
transition from $n$ to $m$ identical photons, in accordance with (57).
We see the significance of the simple generating function (56).


24 Usually only two polarizations transverse to the propagation
vector $\mathbf{K}$ are used. This can be accomplished by a further rearrangement
of terms corresponding to the reverse of the steps
leading from (17) to (19). We omit the details here as it is well-known
that either formulation gives the same results. See II,
Section 8.
We have therefore found rules for real photons in terms of
those for virtual. The real photons are a way of representing and
keeping track of those aspects of the past behavior which may
influence the future.

If one starts from a theory involving an arbitrary modification
of the direct interaction $\delta_+$ (or in more general situations) it is
possible in this way to discover what kinds of states and physical
entities will be involved if one tries to represent in the present all
the information needed to predict the future. With the Hamiltonian
method, which begins by assuming such a representation,
it is difficult to suggest modifications of a general kind, for one
cannot formulate the problem without having a complete representation
of the characteristics of the intermediate states, the
particles involved in interaction, etc. It is quite possible (in the
author's opinion, it is very likely) that we may discover that in
nature the relation of past and future is so intimate for short
durations that no simple representation of a present may exist.
In such a case a theory could not find expression in Hamiltonian
form.

An exactly similar analysis can be made just as easily starting
with the general forms (46), (48). Also a coordinate representation
of the photons could have been used instead of the familiar
momentum one. One can deduce the rules (60), (61). Nothing
essentially different is involved physically, however, so we shall
not pursue the subject further here. Since they imply all the
rules for real photons, Eqs. (46), (47), (48) constitute a compact
statement of all the laws of quantum electrodynamics. But they
give divergent results. Can the result after charge and mass
renormalization also be expressed to all orders in $e^2/\hbar c$ in a simple
way?


APPENDIX C. DIFFERENTIAL EQUATION FOR
ELECTRON PROPAGATION 


An attempt has been made to find a differential wave equation
for the propagation of an electron interacting with itself, analogous
to the Dirac equation, but containing terms representing the
self-action. Neglecting all effects of closed loops, one such equation
has been found, but not much has been done with it. It is reported
here for whatever value it may have.

An electron acting upon itself is, from one point of view, a
complex system of a particle and a field of an indefinite number
of photons. To find a differential law of propagation of such a
system we must ask first what quantities known at one instant
will permit the calculation of these same quantities an instant
later. Clearly, a knowledge of the position of the particle is not
enough. We should need to specify: (1) the amplitude that the
electron is at $x$ and there are no photons in the field, (2) the
amplitude the electron is at $x$ and there is one photon of such
and such a kind in the field, (3) the amplitude there are two
photons, etc. That is, a series of functions of ever increasing
numbers of variables. Following this view, we shall be led to
the wave equation of the theory of second quantization.

We may also take a different view. Suppose we know a quantity
$\Phi_{e^2}[B, x]$, a spinor function of $x_\mu$, and functional of $B_\mu(1)$, defined
as the amplitude that an electron arrives at $x_\mu$ with no photon in
the field when it moves in an arbitrary external unquantized
potential $B_\mu(1)$. We allow the electron also to interact with itself,
but $\Phi_{e^2}$ is the amplitude at a given instant that there happens
to be no photons present. As we have seen, a complete knowledge
of this functional will also tell us the amplitude that the electron
arrives at $x$ and there is just one photon, of form $A_\mu^{PH}(1)$ present.
It is, from (60), $\int(\delta\Phi_{e^2}[B, x]/\delta B_\mu(1))\,A_\mu^{PH}(1)\,d\tau_1$.

Higher numbers of photons correspond to higher functional
derivatives of $\Phi_{e^2}$. Therefore, $\Phi_{e^2}[B, x]$ contains all the information
requisite for describing the state of the electron-photon
system, and we may expect to find a differential equation for it.
Actually it satisfies ($\nabla=\gamma_\mu\,\partial/\partial x_\mu$, $\mathbf{B}=\gamma_\mu B_\mu$),


$$(i\nabla-m)\Phi_{e^2}[B, x]=\mathbf{B}(x)\Phi_{e^2}[B, x]+ie^2\gamma_\mu\int\delta_+(s_{x1}^2)\,\bigl(\delta\Phi_{e^2}[B, x]/\delta B_\mu(1)\bigr)\,d\tau_1 \tag{1C}$$


as may be seen from a physical argument.25 The operator $(i\nabla-m)$
operating on the $x$ coordinate of $\Phi_{e^2}$ should equal, from Dirac's
equation, the changes in $\Phi_{e^2}$ as we go from one position $x$ to a
neighboring position due to the action of vector potentials. The
term $\mathbf{B}(x)\Phi_{e^2}$ is the effect of the external potential. But $\Phi_{e^2}$ may
also change for at the first position $x$ we may have had a photon
present (amplitude that it was emitted at another point 1 is
$\delta\Phi_{e^2}/\delta B_\mu(1)$) which was absorbed at $x$ (amplitude photon released
at 1 gets to $x$ is $\delta_+(s_{x1}^2)$ where $s_{x1}^2$ is the squared invariant distance
from 1 to $x$) acting as a vector potential there (factor $\gamma_\mu$). Effects
of vacuum polarization are left out.


25 Its general validity can also be demonstrated mathematically
from (45). The amplitude for arriving at $x$ with no photons in the
field with virtual photon coupling $e^2$ is a transition amplitude.
It must, therefore, satisfy (45) with $T_{e^2}[B]=\Phi_{e^2}[B, x]$ for any $x$.
Hence show that the quantity

$$C_{e^2}[B, x]=(i\nabla-m-\mathbf{B}(x))\Phi_{e^2}[B, x]-ie^2\gamma_\mu\int\delta_+(s_{x1}^2)\,\bigl(\delta\Phi_{e^2}[B, x]/\delta B_\mu(1)\bigr)\,d\tau_1$$

also satisfies Eq. (45) by substituting $C_{e^2}[B, x]$ for $T_{e^2}[B]$ in
(45) and using the fact that $\Phi_{e^2}[B, x]$ satisfies (45). Hence if
$C_0[B, x]=0$ then $C_{e^2}[B, x]=0$ for all $e^2$. But $C_{e^2}[B, x]=0$ means
that $\Phi_{e^2}[B, x]$ satisfies (1C). Therefore, that solution $\Phi_{e^2}[B, x]$
of (45) which also satisfies $(i\nabla-m-\mathbf{B}(x))\Phi_0[B, x]=0$ (the propagation
of a free electron without virtual photons) is a solution of
(1C) as we wished to show. Equation (1C) may be more convenient
than (45) for some purposes for it does not involve
differentiation with respect to the coupling constant, and is more
analogous to a wave equation.

Expansion of the solution of (1C) in a power series in $B$ and $e^2$,
starting from a free particle solution for a single electron, produces
a series of terms which agree with the rules of II for action of
potentials and virtual photons to various orders. It is another
matter to use such an equation for the practical solution of a
problem to all orders in $e^2$. It might be possible to represent the
self-energy problem as the variational problem for $m$, stemming
from (1C). The $\delta_+$ will first have to be modified to obtain a
convergent result.

We are not in need of the general solution of (1C). (In fact,
we have it in (46), (48) in terms of the solution $T_0[B] = \Phi_0[B, x]$
of the ordinary Dirac equation $(i\nabla - m)\,\Phi_0[B, x] = B\,\Phi_0[B, x]$.)
The general solution is too complicated, for complete knowledge
of the motion of a self-acting electron in an arbitrary potential is
essentially all of electrodynamics (because of the kind of relation
of real and virtual processes discussed for photons in Appendix B,
extended also to real and virtual pairs). Furthermore, it is easy
to see that other quantities also satisfy (1C). Consider a system
of many electrons, and single out some one for consideration,
supposing all the others go from some definite initial state $i$ to
some definite final state $f$. Let $\Phi_{e^2}[B, x]$ be the amplitude that
the special electron arrives at $x$, there are no photons present,
and the other electrons go from $i$ to $f$ when there is an external
potential $B_\mu$ present (which $B_\mu$ also acts on the other electrons).
Then $\Phi_{e^2}$ also satisfies (1C). Likewise the amplitude with closed
loops (all other electrons go vacuum to vacuum) also satisfies
(1C), including all vacuum polarization effects. The various
problems correspond to different assumptions as to the dependence
of $\Phi_{e^2}[B, x]$ on $B_\mu$ in the limit of zero $e^2$. The Eq. (1C) without
further boundary conditions is probably too general to be useful.
An alteration in the notation used to indicate the order of operation of noncommuting quantities is
suggested. Instead of the order being defined by the position on the paper, an ordering subscript is introduced
so that $A_s B_{s'}$ means $AB$ or $BA$ depending on whether $s$ exceeds $s'$ or vice versa. Then $A_s$ can be handled as
though it were an ordinary numerical function of $s$. An increase in ease of manipulating some operator
expressions results. Connection to the theory of functionals is discussed in an appendix. Illustrative
applications to quantum mechanics are made. In quantum electrodynamics it permits a simple formal
understanding of the interrelation of the various present day theoretical formulations.

The operator expression of the Dirac equation is related to the author's previous description of positrons.
An attempt is made to interpret the operator ordering parameter in this case as a fifth coordinate variable
in an extended Dirac equation. Fock's parametrization, discussed in an appendix, seems to be easier to
interpret.

In the last section a summary of the numerical constants appearing in formulas for transition
probabilities is given.


In this paper we suggest an alteration in the mathematical
notation for handling operators. This new
notation permits a considerable increase in the ease of
manipulation of complicated expressions involving
operators. No results which are new are obtained in
this way, but it does permit one to relate various
formulas of operator algebra in quantum mechanics in
a simpler manner than is often available. In particular,
it is applied to quantum electrodynamics to permit an
easier way of seeing the relationships among the conventional
formulations, that of Schwinger and Tomonaga,¹
and that of the author.² These relationships have already
been discussed by many people, particularly Dyson.³
The connection was shown by means of a re-ordering of
operators in each term of a perturbation power series.
Here, the same end is achieved in much the same way
without having to resort to such an expansion.

It is felt, in the face of daily experimental surprises 
for meson theory, that it might be worth while to spend 
one’s time expressing electrodynamics in every physical 
and mathematical way possible. There may be some 
hope that a thorough understanding of electrodynamics 
might give a clue as to the possible structure of the 
more complete theory to which it is an approximation. 
This is one reason that this paper is published, even 
though it is little more than a mathematical re-expres- 
sion of old material. A second reason is the desire to 
describe a mathematical method which may be useful
in other fields. 

The mathematics is not completely satisfactory. No 
attempt has been made to maintain mathematical rigor. 


* Absent on leave at the University of Brazil, Rio de Janeiro,
Brazil.

1 See J. Schwinger, Phys. Rev. 76, 790 (1949), and S. Tomonaga,
Phys. Rev. 74, 224 (1948), where additional references to previous
work may be found.

2 The author's previous papers will hereafter be designated as
follows: R. P. Feynman, Revs. Modern Phys. 20, 367 (1948) (C);
Phys. Rev. 76, 749 (1949) (I); Phys. Rev. 76, 769 (1949) (II); and
Phys. Rev. 80, 440 (1950) (III).

3 F. J. Dyson, Phys. Rev. 75, 486, 1736 (1949).


The excuse is not that it is expected that rigorous dem- 
onstrations can be easily supplied. Quite the contrary, 
it is believed that to put the present methods on a 
rigorous basis may be quite a difficult task, beyond the 
abilities of the author. 

The mathematical ideas are described and are illus- 
trated with simple applications to quantum mechanics, 
in the first four sections. Some possible mathematical
relations between the operator calculus described here
and the theory of functionals are described in Appendix
A, with further specific mathematical applications in
Appendixes B and C. Section 5, and more particularly
Secs. 6 to 9, apply specifically to quantum electrodynamics
and may be omitted without loss by those
whose interest is limited to mathematical questions.
The use of a fifth variable to parametrize the Dirac
equation is discussed in Secs. 8 and 9. An alternative
procedure due to V. Fock⁴ appears in Appendix D.
Section 10 gives a summary of the rules for computing 
matrix elements. 


1. DESCRIPTION OF THE NOTATION 


The order of operation of operators is conventionally
represented by the position in which the operators are
written on the paper. Thus, the product $AB$ of two
operators $A$ and $B$ is to be distinguished from the
product in reverse order $BA$. The algebra of operators
is noncommutative, so that all of the ordinary algebra,
calculus, and analysis with ordinary numbers becomes
of small utility for operators. Thus, for a single operator,
$\alpha$, ordinary functions of this operator, such as $A = \exp\alpha$,
can be defined, for example, by power series. These
functions obey the rules of ordinary analysis even
though $\alpha$ is an operator. But if another operator $\beta$ is
introduced with which $\alpha$ does not commute, the question
of functions of the two variables $\alpha$, $\beta$ is beset with
commutation difficulties and the simplest theorems of
analysis are lost. For example, if $B = \exp\beta$, it is not true


4 V. Fock, Physik. Z. Sowjetunion 12, 404 (1937).




AN OPERATOR CALCULUS 


that $BA$, that is, $\exp\beta\, \exp\alpha$, is equal to $\exp(\beta + \alpha)$.
Thus, the law of addition of exponents fails. Consequently,
the principles of elementary calculus are no
longer operative in a simple way. For example, expand
$\exp(\alpha + \beta)$ to first order in $\beta$, assuming $\beta$ small. The
zero-order term is $\exp\alpha$, but the first-order term is
neither $\beta \exp\alpha$ nor $(\exp\alpha)\beta$ nor the average of the two.
From the theory of time-dependent perturbations in
quantum mechanics we learn that it is

$$\exp(\alpha + \beta) = \exp\alpha + \int_0^1 \exp[(1 - s)\alpha]\, \beta\, \exp(s\alpha)\, ds + \cdots. \tag{1}$$


The appearance of the integral in this analytic result
seems surprising, and its derivation does not indicate
clearly how to differentiate or expand other functions
of $\alpha + \beta$. Further, the simple integral on $s$ is not easy
to perform, although the results can be given in several
ways as power series. That the integral cannot be done
in a general fashion is clearly due to a weakness of
notation, for in a representation in which $\alpha$ is diagonal
with eigenvalues $\alpha_n$ we can of course verify directly the
usual result,

$$(\exp(\alpha + \beta))_{mn} = (\exp\alpha_m)\,\delta_{mn} + (\exp\alpha_m - \exp\alpha_n)\,\beta_{mn}\,(\alpha_m - \alpha_n)^{-1} + \cdots \tag{2}$$

of the perturbation theory of stationary states.

We shall change the usual notation of the theory of
operators and indicate the order in which operators are
to operate by a different device. We attach an index to
the operator with the rule that the operator with higher
index operates later. Thus, $BA$ may be written $B_1 A_0$
or $A_0 B_1$. The order no longer depends on the position
on the paper, so that all of the ordinary processes of
analysis may be applied as though $A_0$ and $B_1$ were
commuting numbers. It is only at the end of a calculation,
when the quantities are to be interpreted as
operators, that the indices 0 and 1 are of importance if
one wishes to reconvert an expression to the usual
notation. Thus, if $A = \exp\alpha$ and $B = \exp\beta$, we can now
safely write $BA = \exp(\alpha_0 + \beta_1)$, as there is only one way
to interpret the latter expression. Other analytic
processes then become valid. For example,

$$\exp(\alpha_0 + \beta_1) = 1 + (\alpha_0 + \beta_1) + \tfrac{1}{2}(\alpha_0 + \beta_1)^2 + \cdots = 1 + \alpha + \beta + \tfrac{1}{2}(\alpha^2 + 2\beta\alpha + \beta^2) + \cdots$$

in the conventional notation. For on squaring

$$(\alpha_0 + \beta_1)^2 = \alpha_0^2 + 2\alpha_0\beta_1 + \beta_1^2,$$

we must interpret the quantity $\alpha_0\beta_1$ as $\beta\alpha$ in accordance
with our convention. The quantity $\beta_1^2$ alone (that is,
not multiplied by any other expression with an index,
such as $\alpha_0$) is simply $\beta^2$, since the index is no longer
necessary to define the order of operations, there being
only one operator in the term.

The notation is to be extended so that the index need
not be integral, for example, $A_{-1/2} B_{3.1} = BA$, since
$3.1 > -\tfrac{1}{2}$, and in general $A_s B_{s'} = BA$ if $s' > s$ and $AB$ if
$s > s'$ and is undefined if $s = s'$.
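
The index rule can be illustrated with a short sketch. It assumes operators are represented by small square matrices stored as nested lists; the helper names `matmul` and `ordered_product` and the sample matrices are hypothetical and chosen only for illustration.

```python
from functools import reduce


def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ordered_product(factors):
    """Rewrite a product of indexed operators in conventional notation.

    `factors` is a list of (matrix, index) pairs written in any order.
    The operator with the higher index operates later, so in positional
    notation it stands further to the left.
    """
    indices = [s for _, s in factors]
    if len(set(indices)) != len(indices):
        raise ValueError("product is undefined when two indices coincide")
    by_decreasing_index = sorted(factors, key=lambda f: f[1], reverse=True)
    return reduce(matmul, [m for m, _ in by_decreasing_index])


A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]

# A_{-1/2} B_{3.1} and B_{3.1} A_{-1/2} both mean BA.
assert ordered_product([(A, -0.5), (B, 3.1)]) == matmul(B, A)
assert ordered_product([(B, 3.1), (A, -0.5)]) == matmul(B, A)
```

Sorting by index is all that the reconversion to positional notation amounts to for a simple product; the harder disentangling problems arise when the indices are integrated over.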

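How can we work with an expression such as
$\exp(\alpha + \beta)$ so as to free the $\alpha$ and $\beta$ of their noncommutative
aspects and thus utilize the theory of functions
for rearranging the expressions? Take a quantity $N$ very
large and write

$$\exp(\alpha + \beta) \approx \left[\exp\frac{1}{N}(\alpha + \beta)\right]^N \approx \left[1 + \frac{1}{N}(\alpha + \beta)\right]^N = \left[1 + \frac{1}{N}(\alpha + \beta)\right]\left[1 + \frac{1}{N}(\alpha + \beta)\right] \cdots \left[1 + \frac{1}{N}(\alpha + \beta)\right] \quad \text{for } N \text{ factors.}$$

In each factor we replace $\alpha + \beta$ by $\alpha_i + \beta_i$, where $i$ is an
index running to $N$, and write

$$\exp(\alpha + \beta) = \lim_{N\to\infty} \prod_{i=1}^{N} \left[1 + \frac{1}{N}(\alpha_i + \beta_i)\right] = \lim_{N\to\infty} \exp\left[\frac{1}{N}\sum_{i=1}^{N} (\alpha_i + \beta_i)\right],$$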

where the last expression is written in accordance with
the new convention that the index $i$ controls the order
of operation. (The ambiguity arising from $\alpha_i$ and $\beta_i$
with the same index can only cause trouble in a product
$\alpha_i\beta_i$, and such products are of vanishing importance as
$N \to \infty$.) More simply, calling $s = i/N$, we can take the
limit and write

$$\exp(\alpha + \beta) = \exp\left[\int_0^1 (\alpha_s + \beta_s)\, ds\right]. \tag{3}$$

That this is valid is, of course, evident, since we could
call $\alpha_s + \beta_s = \gamma_s$, with $\gamma$ a definite operator operating at
order $s$, so that

$$\exp\left(\int_0^1 \gamma_s\, ds\right) = \exp\left(\int_0^1 \gamma\, ds\right),$$

for in this expression the order index is unnecessary,
only one operator $\gamma$ being involved. The integral is
just $\gamma$,

$$\int_0^1 \gamma_s\, ds = \gamma \int_0^1 ds = \gamma,$$

since $\gamma$ does not now depend on $s$. Therefore, Eq. (3)
is trivial as it stands; but what is not trivial is the fact
that the right-hand side of Eq. (3) may be manipulated
just as though $\alpha_s$ and $\beta_s$ were numerical functions of $s$,
with the assurance that now the order of operations
will always be automatically specified by the index. For
example, from Eq. (3) we have the legitimate relation


$$\exp(\alpha + \beta) = \exp\left(\int_0^1 \alpha_s\, ds\right) \exp\left(\int_0^1 \beta_s\, ds\right). \tag{4}$$


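As an example, showing that such manipulations do not
destroy the validity of equations, consider the term of
first order in both $\alpha$ and $\beta$ on both sides of Eq. (4).
Expanding the left side as $1 + (\alpha + \beta) + \tfrac{1}{2}(\alpha + \beta)^2 + \cdots$,
we see that the term in question is $\tfrac{1}{2}(\alpha\beta + \beta\alpha)$, while
expansion of the right side gives for the corresponding
term $(\int_0^1 \alpha_s\, ds)(\int_0^1 \beta_s\, ds)$. This can be simplified by
being written as

$$\int_0^1\!\!\int_0^1 \alpha_s\, ds\, \beta_{s'}\, ds' = \int_0^1\!\!\int_0^s \alpha_s \beta_{s'}\, ds'\, ds + \int_0^1\!\!\int_s^1 \alpha_s \beta_{s'}\, ds'\, ds.$$

In the first integral we have $s > s'$, so that $\alpha_s\beta_{s'}$ is equal
to $\alpha\beta$, while in the second $s < s'$, so it is $\beta\alpha$. Hence, there
results $\alpha\beta \int_0^1\!\int_0^s ds'\, ds + \beta\alpha \int_0^1\!\int_s^1 ds'\, ds$; thus on performing
the integrations we find finally

$$\left(\int_0^1 \alpha_s\, ds\right)\left(\int_0^1 \beta_s\, ds\right) = \tfrac{1}{2}(\alpha\beta + \beta\alpha).$$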


This process of rearranging the form of expressions
involving operators ordered by indices so that they
may be written in conventional form we shall call
disentangling the operators. The process is not always
easy to perform and, in fact, is the central problem
of this operator calculus. As a second example of disentangling,
consider the problem of expanding Eq. (4)
to the first order in $\beta$. It is evidently

$$\exp(\alpha + \beta) = \exp\left(\int_0^1 \alpha_s\, ds\right) + \exp\left(\int_0^1 \alpha_{s'}\, ds'\right) \int_0^1 \beta_s\, ds + \cdots. \tag{5}$$


The first term is simply $\exp\alpha$, for $\alpha_s$ is independent of $s$,
as there is no other operator with which $\alpha_s$ does not
commute in this term. The next is the integral over $s$
of $\exp(\int_0^1 \alpha_{s'}\, ds')\,\beta_s$. In the integral on $s'$ we can split
the range, according to whether $s' < s$ or $s' > s$:
$\exp(\int_s^1 \alpha_{s'}\, ds')\, \exp(\int_0^s \alpha_{s'}\, ds')\, \beta_s$. The $\alpha_{s'}$ in the first
factor acts after the $\beta_s$ and is otherwise independent of
$s'$, while the $\alpha_{s'}$ in the second factor is to act before
the $\beta_s$. Hence, if we write these factors respectively
after and before the $\beta_s$ and imply the usual convention,
the $\alpha_{s'}$ will be independent of $s'$ in the range 0 to $s$ and
we may perform the integral. Hence, the result is the
integral on $s$ of $\exp[(1 - s)\alpha]\, \beta\, \exp(s\alpha)$ in agreement
with Eq. (1).
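
The agreement with Eq. (1) can be checked numerically. The sketch below is not part of the original argument: it assumes 2×2 real matrices for $\alpha$ and $\beta$, uses a truncated power series for the exponential, and compares a midpoint-rule estimate of the integral with a central finite difference of $\exp(\alpha + \epsilon\beta)$. All function names and sample values are hypothetical.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(c, X):
    return [[c * x for x in row] for row in X]


def expm(X, terms=30):
    """Matrix exponential by its power series (adequate for small norms)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, X))
        result = add(result, term)
    return result


def first_order_term(alpha, beta, steps=1000):
    """Midpoint-rule estimate of the integral appearing in Eq. (1)."""
    n = len(alpha)
    total = [[0.0] * n for _ in range(n)]
    for j in range(steps):
        s = (j + 0.5) / steps
        piece = matmul(expm(scale(1.0 - s, alpha)),
                       matmul(beta, expm(scale(s, alpha))))
        total = add(total, scale(1.0 / steps, piece))
    return total


def finite_difference(alpha, beta, eps=1e-4):
    """Central difference of exp(alpha + eps*beta) with respect to eps."""
    plus = expm(add(alpha, scale(eps, beta)))
    minus = expm(add(alpha, scale(-eps, beta)))
    return scale(1.0 / (2 * eps), add(plus, scale(-1.0, minus)))


def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


alpha = [[0.3, 1.0], [0.2, -0.4]]
beta = [[0.0, 0.5], [-0.7, 0.1]]
integral = first_order_term(alpha, beta)
derivative = finite_difference(alpha, beta)
print(max_abs_diff(integral, derivative))
```

The two estimates should agree to within the discretization error of the midpoint rule and the finite difference.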
Incidentally, by applying new subscripts in another
way the term may be also written as $\int_0^1 \exp[(1 - s)\alpha_2]\, \beta_1\, \exp(s\alpha_0)\, ds$,
in which case the integral may be immediately
performed to give

$$\left[(d/d\epsilon) \exp(\alpha + \epsilon\beta)\right]_{\epsilon=0} = \exp\left(\int_0^1 \alpha_{s'}\, ds'\right) \int_0^1 \beta_s\, ds = \int_0^1 \exp[(1 - s)\alpha]\, \beta\, \exp(s\alpha)\, ds = (\exp\alpha_2 - \exp\alpha_0)(\alpha_2 - \alpha_0)^{-1} \beta_1. \tag{6}$$

All the four expressions are equivalent as has been
shown, but only the first and third are in a form in
which the operators are disentangled so that the conventional
expressions may be used. In the representation
in which $\alpha$ is diagonal, it should be evident that
the matrix element of the last expression is that given
in Eq. (2).
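
As an illustration of the last remark, the following sketch evaluates the divided-difference matrix elements of Eq. (2) for a diagonal $\alpha$ and compares them with a finite difference of $\exp(\alpha + \epsilon\beta)$. The handling of coinciding eigenvalues by the limit $\exp\alpha_m$ is an assumption added here, and the names and sample values are hypothetical.

```python
import math


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(X, terms=30):
    """Matrix exponential by its power series (adequate for small norms)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, X)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result


def first_order_elements(a, beta):
    """First-order term of Eq. (2) when alpha = diag(a)."""
    n = len(a)
    out = [[0.0] * n for _ in range(n)]
    for m in range(n):
        for k in range(n):
            if a[m] == a[k]:
                # Assumed limit of the divided difference for equal eigenvalues.
                factor = math.exp(a[m])
            else:
                factor = (math.exp(a[m]) - math.exp(a[k])) / (a[m] - a[k])
            out[m][k] = factor * beta[m][k]
    return out


def finite_difference(a, beta, eps=1e-4):
    """Central difference of exp(diag(a) + eps*beta) with respect to eps."""
    n = len(a)

    def shifted(sign):
        return [[(a[i] if i == j else 0.0) + sign * eps * beta[i][j]
                 for j in range(n)] for i in range(n)]

    plus, minus = expm(shifted(+1)), expm(shifted(-1))
    return [[(p - q) / (2 * eps) for p, q in zip(rp, rq)]
            for rp, rq in zip(plus, minus)]


a = [0.3, -0.4, 1.1]
beta = [[0.2, 0.5, -0.1], [0.5, 0.0, 0.3], [-0.1, 0.3, -0.6]]
formula = first_order_elements(a, beta)
numeric = finite_difference(a, beta)
error = max(abs(x - y) for rx, ry in zip(formula, numeric)
            for x, y in zip(rx, ry))
print(error)
```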

Any operator function of $\alpha + \beta$ can, by replacing
$\alpha + \beta$ by $\int_0^1 \alpha_s\, ds + \int_0^1 \beta_s\, ds$, be manipulated in a manifold
of ways, many of which lead to useful formulas.
In a like manner, more complicated operator expressions
can be rewritten using ordering indices. They may then
be manipulated using all of the results of ordinary
analysis.

A word about notation: Inasmuch as in mathematics
and physics there are already many uses of the subscript
notation, very often we shall write $\alpha(s)$ for $\alpha_s$.
In a sense, $\alpha(s)$ is a function of $s$, namely, in the sense
that although the operator $\alpha$ may be definite, its order
of operation is not, so that the operator plus a prescription
of where it is to operate, $\alpha(s)$, is a function of $s$.
Furthermore, there will be many cases in which the
operator actually depends explicitly on the parameter
of order. In this case we should have strictly to write
$\alpha_s(s)$ but will omit the subscript when no ambiguity
will result from the change.

We may remark in a general sense about the mathematical
character of our expressions. Given an expression
such as $\int_0^1 \beta(s)\, ds$, we are not concerned with
evaluating the integral, for the quantity when separated
from other factors with which it might be multiplied is
incompletely defined. Thus, although $\int_0^1 \beta_s\, ds$ standing
alone is equivalent simply to $\beta$, this is far from true
when $\int_0^1 \beta_s\, ds$ is multiplied by other expressions such as
$\exp \int_0^1 \alpha_s\, ds$. Thus, we must consider the complete
expression as a complete functional of the argument
functions $\alpha(s)$, $\beta(s)$, etc. With each such functional we
are endeavoring to associate an operator. The operator
depends on the functional in a complex way (the
operator is a functional of a functional) so that, for
example, the operator corresponding to the product of
two functionals is not (in general) the simple product
of the operators corresponding to the separate factors.
(The corresponding statement equating the sum of two
functionals and the sum of the corresponding operators
is true, however.) Hence, we can consider the most
complex expressions involving a number of operators
$M$, $N$, as described by functionals $F[M(s), N(s)\cdots]$ of
the argument functions $M(s)$, $N(s)\cdots$ $(= M_s, N_s\cdots)$.
For each functional we are to find the corresponding
operator in some simple form;⁵ that is, we wish to disentangle
the functional. One fact we know is that any
analytic rearrangement may be performed which leaves
the value of the functional unchanged for arbitrary
$M(s)$, $N(s)\cdots$ considered as ordinary numerical functions.
Besides, there are a few special operations which
we may perform on $F[M(s), N(s)\cdots]$, to disentangle
the expressions, which are valid only because the
functional does represent an operator according to our
rules. These special operations (such as extracting an
exponential factor discussed in Sec. 3) are, of course,
proper to the new calculus; and our powers of analysis
in this field will increase as we develop more of them.


2. APPLICATIONS IN QUANTUM MECHANICS 


The wave equation $i\,\partial\psi/\partial t = H\psi$ determines the wave
function $\psi(t_2)$ at time $t_2$ in terms of that at time $t_1$, $\psi(t_1)$.
In fact, they are related by a unitary transformation
$\psi(t_2) = \Omega(t_2, t_1)\psi(t_1)$. The unitary operator $\Omega(t_2, t_1)$ can
be expressed as $\Omega(t_2, t_1) = \exp(-i(t_2 - t_1)H)$ in the case
that $H$ is independent of the time. In spite of the
simple appearance of the analytic form of $\Omega$ in terms
of $H$, little has been done except formally with this
expression for the reasons outlined in the previous
section. We may readily re-express it as

$$\Omega(t_2, t_1) = \exp\left[-i\int_{t_1}^{t_2} H_t\, dt\right]$$

and may then find the expression easy to utilize.
Further, if $H$ is an explicit function of the time $H(t)$,
we can consider the $\Omega$ to be developed as a large number
of small unitary transformations in succession, so that
we have directly

$$\Omega(t_2, t_1) = \exp\left[-i\int_{t_1}^{t_2} H_t(t)\, dt\right]. \tag{7}$$

Hereafter in this section we shall make the convention
that time is the ordering parameter and simply write
$H(t)$ for $H_t(t)$.
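
The reading of Eq. (7) as a succession of small unitary transformations can be sketched numerically. The example below is only an illustration under stated assumptions: a two-level system, a hypothetical driven hamiltonian `H_driven`, a midpoint evaluation of $H(t)$ in each step, and a power-series matrix exponential. Later factors are multiplied on the left, which is the positional form of "higher index operates later".

```python
import math


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(c, X):
    return [[c * x for x in row] for row in X]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm(X, terms=30):
    """Matrix exponential by its power series (adequate for small norms)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, X))
        result = add(result, term)
    return result


def time_ordered_exponential(H, t1, t2, steps=500):
    """Approximate Eq. (7) as a product of small unitary steps."""
    omega = identity(len(H(t1)))
    dt = (t2 - t1) / steps
    for j in range(steps):
        t = t1 + (j + 0.5) * dt
        step = expm(scale(-1j * dt, H(t)))
        omega = matmul(step, omega)  # later times stand to the left
    return omega


def dagger(X):
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]


def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


sigma_x = [[0.0, 1.0], [1.0, 0.0]]
sigma_z = [[1.0, 0.0], [0.0, -1.0]]


def H_const(t):
    return sigma_z


def H_driven(t):
    return add(sigma_z, scale(math.cos(t), sigma_x))


omega_const = time_ordered_exponential(H_const, 0.0, 1.0)
omega_driven = time_ordered_exponential(H_driven, 0.0, 1.0)
```

For the time-independent case the product collapses to $\exp(-i(t_2 - t_1)H)$, and in both cases the result remains unitary to rounding error.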

We can use this expression to derive many results in
quantum theory. Thus, if $H(t)$ can be written as the
sum of two parts $H^{(0)}(t)$ and $U(t)$, we have

$$\Omega(t_2, t_1) = \exp\left(-i\int_{t_1}^{t_2} H^{(0)}(t)\, dt\right) \exp\left(-i\int_{t_1}^{t_2} U(t)\, dt\right). \tag{8}$$

5 This point of view is discussed in further detail in Appendix A.

If $H^{(0)}$ is simple and $U$ is small, an expansion in powers
of $U$ is simple. We call $\Omega$ the operator corresponding
to the hamiltonian $H^{(0)} + U$ and $\Omega^{(0)}$ that corresponding
to $H^{(0)}$. The first-order difference of $\Omega$ and $\Omega^{(0)}$ is

$$-i\int_{t_1}^{t_2} U(t)\, dt\, \exp\left[-i\int_{t_1}^{t_2} H^{(0)}(t')\, dt'\right],$$

which may be disentangled as

$$-i\int_{t_1}^{t_2} \exp\left[-i\int_{t}^{t_2} H^{(0)}(t')\, dt'\right] U(t) \exp\left[-i\int_{t_1}^{t} H^{(0)}(t')\, dt'\right] dt = -i\int_{t_1}^{t_2} \Omega^{(0)}(t_2, t)\, U(t)\, \Omega^{(0)}(t, t_1)\, dt, \tag{9}$$

as explained in connection with Eq. (6). This is a
standard result of time-dependent perturbation theory.

As a second example consider the perturbation term
of first order in $U$ and in $V$ arising from the hamiltonian
$H^{(0)}(t) + U(t) + V(t)$. It is

$$-\int_{t_1}^{t_2} U(t')\, dt' \int_{t_1}^{t_2} V(t'')\, dt''\, \exp\left[-i\int_{t_1}^{t_2} H^{(0)}(t)\, dt\right].$$

In order to disentangle this, we can break the $t''$
integral into two regions, $t'' < t'$ and $t'' > t'$. The term
arising from the first region has $V$ operating before $U$,
while the reverse is true for the other region. (The
integral on $t$ for each region is then divided into three
parts determined by the relation of $t$ to $t'$, $t''$.) Thus,
the term becomes, when disentangled, the sum of two
terms:

$$-\int_{t_1}^{t_2}\!\!\int_{t_1}^{t'} \Omega^{(0)}(t_2, t')\, U(t')\, \Omega^{(0)}(t', t'')\, V(t'')\, \Omega^{(0)}(t'', t_1)\, dt''\, dt' - \int_{t_1}^{t_2}\!\!\int_{t'}^{t_2} \Omega^{(0)}(t_2, t'')\, V(t'')\, \Omega^{(0)}(t'', t')\, U(t')\, \Omega^{(0)}(t', t_1)\, dt''\, dt'. \tag{10}$$

This is the way that the various terms corresponding to
the different diagrams arise in quantum electrodynamics
when an attempt is made to calculate explicitly a single
operator expression arising in perturbation theory.

The results here are very similar to those derived 
from the lagrangian form of quantum mechanics as in 
III. Here we have the advantage of being able to use 
the more familiar operator concepts and to work in 
greater generality from the start. For it is not necessary 
that H be restricted to coordinate and momentum 
operators only. Equations (7) and (8) are correct for 
any H; for example, one containing creation and anni- 
hilation operators of second quantization, or Dirac 
matrices, etc. 

The connection of these formulas to those given in I
is simple. $K(x_2, t_2; x_1, t_1)$ is just a coordinate integral
kernel representation of the operator $\Omega(t_2, t_1)$ so that,
for example, Eq. (9) gives directly

$$-i\int_{t_1}^{t_2}\!\!\int K^{(0)}(x_2, t_2; x, t)\, U(x, t)\, K^{(0)}(x, t; x_1, t_1)\, dx\, dt,$$

the expression (9) of I, while Eq. (10) translates immediately
into the expression (30) of III.

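As another type of application, consider two interacting
systems whose hamiltonian is $H^{(a)} + H^{(b)}
+ U[x^{(a)}, x^{(b)}]$, where $H^{(a)}$ involves operators of
system $(a)$ only, $H^{(b)}$ involves only those of system $(b)$,
and $U$ involves both. Then we may ask for the amplitude,
if at $t_1$ system $(a)$ is in state $\psi_1$ and system $(b)$ in
$\phi_1$, that at $t_2$ they are in $\psi_2$, $\phi_2$. This is the matrix
element

$$m = \left(\phi_2\psi_2\left| \exp\left(-i\int_{t_1}^{t_2} H^{(a)}(t)\, dt - i\int_{t_1}^{t_2} H^{(b)}(t)\, dt - i\int_{t_1}^{t_2} U[x^{(a)}(t), x^{(b)}(t)]\, dt\right) \right|\psi_1\phi_1\right). \tag{11}$$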


But this may be split into two problems. We may first
find the matrix element

$$T^{(a)}[x^{(b)}(t)] = \left(\psi_2\left| \exp\left(-i\int_{t_1}^{t_2} H^{(a)}(t)\, dt - i\int_{t_1}^{t_2} U[x^{(a)}(t), x^{(b)}(t)]\, dt\right) \right|\psi_1\right) \tag{12}$$

for the system $(a)$ alone, considering that in the interaction
potential $U[x^{(a)}, x^{(b)}]$, all operators referring to
$(b)$ are arbitrary numerical functions of $t$. (We have
been writing as though $U$ depends on $(b)$ only through
the coordinate, $x^{(b)}$; but the same method applies if it
is also a function of momentum, or spin, or other
operators on system $(b)$.) Then the matrix element $T^{(a)}$
depends on the function $x^{(b)}(t)$. As we indicate, it is a
functional of $x^{(b)}(t)$. The final answer, $m$, is then a
matrix element $(\phi_2|M|\phi_1)$ between the states $\phi_1$ and $\phi_2$;

$$m = \left(\phi_2\left| \exp\left(-i\int_{t_1}^{t_2} H^{(b)}(t)\, dt\right) T^{(a)}[x^{(b)}(t)] \right|\phi_1\right), \tag{13}$$

wherein now the quantities $x^{(b)}(t)$ are considered as
ordered operators operating relative to each other and
to $H^{(b)}(t)$ in accordance with the time parametrization.

In this way we can analyze one part of a pair of
interacting systems without having yet analyzed the
other. The influence of $a$ on $b$ is completely contained in
the operator functional $T^{(a)}[x^{(b)}(t)]$. This separation
may be useful in analysis of the theory of measurement
and of quantum statistical mechanics. It is the possibility
of such a separation which exists also in the case²
of the lagrangian form of quantum mechanics, C,
which makes that form useful in analyzing the quantum
properties of the electromagnetic field. We may therefore
expect that with the present operator notation it
should be equally easy to make this analysis. That this
is indeed true we show by example further on. Since
this, the main advantage of the lagrangian form, can be
so easily managed with the new notation for operators,
this may well take the place of the lagrangian form in
many applications. It is in some ways a more powerful
and general form than the lagrangian. It is not restricted
to the nonrelativistic mechanics in any way. A possible
advantage of the other form at present might be a
slight increase in anschaulichkeit offered for the interpretation
of the nonrelativistic quantum mechanics.


3. DISENTANGLING AN EXPONENTIAL FACTOR


There is one theorem which is very useful in disen- 
tangling operator expressions. We shall give it in this 
section. Suppose we have several operators M, N, etc. 
(which may also be functions of time, or more generally, 
the ordering parameter s), which are ordered in some 
way. 

Let us say the functional $F[M(s), N(s)\cdots]$ defines
the ordered operator. Now suppose we replace $M(s)$ by
$M'(s) = U^{-1}M(s)U$, $N(s)$ by $N'(s) = U^{-1}N(s)U$, etc., where
$U$ is some constant operator. Then, as is well known,
in $F[M'(s), N'(s)\cdots]$ in any product of successive
operators, such as $M'(s + ds)N'(s)$, the $UU^{-1}$ cancel out
in between (that is, $M'N' = U^{-1}MUU^{-1}NU = U^{-1}MNU$,
etc.), so that there results

$$F[M'(s), N'(s)\cdots] = U^{-1}F[M(s), N(s)\cdots]\,U,$$

where the $U$'s are written to operate in the correct
order. (If we wish to be more specific, we can imagine
the range of the ordering parameter to be $s = 0$ to 1
and write the right-hand side as $U_1^{-1} F\, U_0$.)

This is a simple rewriting of a well-known theorem
of equivalence transformations. However, a much more
interesting case is that in which $U(s)$ is actually a function
of the ordering parameter. That is, we contemplate
performing different transformations on the operators
$M(s)$ depending on the value of $s$ at which they are to
operate. Then in a product of successive operators such
as $M'(s + ds)N'(s)$, where

$$M'(s + ds) = U^{-1}(s + ds)\, M(s + ds)\, U(s + ds) \tag{14}$$

(operating in the order indicated by the position of $U^{-1}$,
$M$ and $U$) and $N'(s) = U^{-1}(s)N(s)U(s)$, the factors
$U(s + ds)$ and $U^{-1}(s)$ will not cancel out, but we will
find the operator $U(s + ds)U^{-1}(s)$ operating between
times $s$ and $s + ds$, say at $s + \tfrac{1}{2}ds$. If we assume $U$ continuous,
we can imagine $U(s)$ differs from $U(s + ds)$ to
the first order in $ds$, and hence that $U(s + ds)U^{-1}(s)$
equals to first order in $ds$:

$$U(s + ds)\,U^{-1}(s) = 1 + P(s)\, ds,$$
where $P(s)$ is an operator defined by this relation in the
limit $ds \to 0$. We may write this relation

$$dU(s)/ds = P(s)\,U(s) \tag{15}$$

with positional ordering. Hence, between $s$ and $s + ds$
there should operate an additional factor $1 + P(s)\, ds$,
which for convenience we may write, valid to first order,
as $\exp[P(s + \tfrac{1}{2}ds)\, ds]$. The $s + \tfrac{1}{2}ds$ in $P(s + \tfrac{1}{2}ds)$ will
automatically locate the factor in the correct order.
But there is a factor of this kind appearing between the
operators for each value of $s$, or multiplying the factors
all together, we obtain the net factor $\exp \int_0^1 P(s)\, ds$,
the product becoming a sum, or integral in the exponent.
Hence, we have the general theorem:


$$F[M'(s), N'(s)\cdots] = U^{-1}(1)\, F[M(s), N(s)\cdots]\, \exp\left(\int_0^1 P(s)\, ds\right) U(0),$$

where

$$M'(s) = U^{-1}(s)\, M(s)\, U(s) \tag{16}$$

and

$$U(s) = \exp\left[\int_0^s P(s')\, ds'\right] U(0), \tag{17}$$

this last coming from integrating Eq. (15). We shall use
the theorem by writing it in the form

$$\exp\left(\int_0^1 P(s)\, ds\right) F[M(s), N(s)\cdots] = U(1)\, F[M'(s), N'(s)\cdots]\, U^{-1}(0), \tag{18}$$

in which form it serves as a rule for disentangling an
exponential factor from another expression. A word of
caution is necessary in reading Eqs. (18), (16), and (17),
for three different notations are used in the expressions.
In Eq. (18) the new ordered notation is used in its
complete form; for example, the $s$ in $\exp \int_0^1 P(s)\, ds$
gives the order in which the $P$ is to operate relative to
the $M$, $N$ of the functional $F$ which it multiplies. In
Eq. (16), however, all the operators are to operate at $s$,
but the relative order in $M'$ of $U$, $M$, and $U^{-1}$ is as
given by the usual position convention. Finally, Eq.
(17) would be less ambiguous if it were replaced by the
differential equation (15). For in the solution (17), the $s'$
are to bear no relation to the $s$ in Eq. (16) or Eq. (18).
The operator $U(s)$ is to be computed from $P$ by Eq.
(17) first, then the whole operator $U(s)$ is to operate in
Eq. (16), and then in Eq. (18) at the position $s$.

We shall use this theorem in several applications
related to quantum electrodynamics. Most particularly,
we shall find a certain special case useful enough to
warrant special mention. It is the case that $P(s)$ is of
the form $\alpha(s)P_s$, where $\alpha(s)$ is a simple numerical
function, and $P_s$ is an operator whose form does not
depend on $s$ but whose order of operation does. Then
if we call $a(s) = \int_0^s \alpha(s')\, ds'$, so that $a(s)$ is also a numerical
function, Eq. (17) gives $U(s) = \exp[a(s)P_s]\, U(0)$.
We shall further choose to specialize $U(0) = 1$. (The
more general case corresponds to a final simple constant
equivalence transformation (14) with $U(0)$.) Then our
theorem may be written

$$\exp\left[\int_0^1 \alpha(s)\, P_s\, ds\right] F[M(s), N(s)\cdots] = \exp\left[P_1 \int_0^1 \alpha(s)\, ds\right] F[M'(s), N'(s)\cdots], \tag{19}$$

where

$$M'(s) = \exp\left[-P_s \int_0^s \alpha(s')\, ds'\right] M(s)\, \exp\left[+P_s \int_0^s \alpha(s')\, ds'\right]. \tag{20}$$

Further, since this theorem with Eq. (20) substituted
into Eq. (19) is valid when $\alpha(s)$ is an arbitrary numerical
function, it is also true if $\alpha(s)$ is any ordered operator
$\alpha(s)$ commuting with $P$ for all $s$, provided that in all
expressions involving $\alpha$, the parameter $s$ or $s'$ is consistently
interpreted as giving the order in which the
$\alpha$ operates.⁶

The mathematical proof of the theorems (18) and 
(19) offered here is admittedly very sketchy; but since 
the theorems are true, it should not be hard to supply 
them with more satisfactory demonstrations (see 
Appendix A for an alternative demonstration). 

There are a number of other interesting relations
which we may derive from Eq. (19), but which we shall
not need in this paper. One is included here because it
has been found useful in certain other applications. If
$\alpha(s)$ is considered infinitesimal in Eqs. (19) and (20),
expansion in first order in $\alpha$ gives the following result
(or differentiate each side with respect to $\alpha(t)$ and set
$\alpha(s) = 0$),

$$P_t F[M(s)] = P_1 F[M(s)] - \int_t^1 (PM - MP)_s\, \delta F/\delta M(s)\, ds$$

(if we assume $F$ can be represented by a functional
having a derivative $\delta F/\delta M(s)$). We have taken $F$ to
depend only on one operator $M(s)$, but the generalization
is clear. Here, $(PM - MP)_s$ in conventional ordering
is $PM(s) - M(s)P$ and is considered to act as an entity
at $s$. The differential form

$$(dP_t/dt)\, F[M(s)] = (PM - MP)_t\, \delta F/\delta M(t) \tag{21}$$

is also useful.
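
The intermediate steps of this expansion may be sketched as follows. This is a reconstruction under stated assumptions, not part of the original text: $a(s) = \int_0^s \alpha(s')\,ds'$ as in the special case above, $F$ is assumed to possess a functional derivative, and the step function $\theta$ is notation introduced here only for the illustration.

```latex
% First order in \alpha on each side of Eq. (19), with M'(s) taken from Eq. (20).
\begin{align*}
\text{left side:}\quad & \int_0^1 \alpha(s)\, P_s\, F[M(s)]\, ds, \\
M'(s) &= M(s) - a(s)\,(PM - MP)_s + O(\alpha^2), \\
\text{right side:}\quad & a(1)\, P_1 F[M(s)]
   - \int_0^1 a(s)\,(PM - MP)_s\, \frac{\delta F}{\delta M(s)}\, ds . \\
\intertext{Since $\delta a(s)/\delta\alpha(t) = \theta(s - t)$, differentiating both
first-order expressions with respect to $\alpha(t)$ gives}
P_t F[M(s)] &= P_1 F[M(s)]
   - \int_t^1 (PM - MP)_s\, \frac{\delta F}{\delta M(s)}\, ds .
\end{align*}
```

A further derivative with respect to $t$ removes the $P_1$ term and leaves the integrand evaluated at $s = t$, which is the differential form.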


6 To simplify such descriptions, in a situation involving two sets
of operators, any one of the first set commuting with any one of
the second, it is often convenient to generalize to the use of two
different ordering parameters, one for the first set and one for the
second.


4. THE INTERACTION REPRESENTATION


As a first simple direct application of our theorem
consider again the perturbation problem (8) of computing
the operator

$$\Omega(t_2, t_1) = \exp\left[-i\int_{t_1}^{t_2} H^{(0)}(t)\, dt\right] \exp\left[-i\int_{t_1}^{t_2} U(t)\, dt\right]. \tag{8}$$

If we suppose the properties of $H^{(0)}$ to be known and
simple, the right side of Eq. (8) may be disentangled by
means of our theorem (18). We consider $-iH^{(0)}(t)$ as
an operator $P(s)$ and

$$\exp\left[-i\int_{t_1}^{t_2} U(t)\, dt\right]$$

as the functional $F$ from which the

$$\exp\left[-i\int_{t_1}^{t_2} H^{(0)}(t)\, dt\right]$$

is to be disentangled. Hence, a direct application of Eq.
(18) gives

$$\exp\left[-i\int_{t_1}^{t_2} H^{(0)}(t)\, dt\right] \exp\left[-i\int_{t_1}^{t_2} U(t)\, dt\right] = S(t_2)\, \exp\left[-i\int_{t_1}^{t_2} U'(t)\, dt\right] S^{-1}(t_1), \tag{22}$$

where

$$S(t) = \exp\left[-i\int_a^t H^{(0)}(t')\, dt'\right]$$

(the lower limit $a$ used in defining $S$ is arbitrary; it may
be taken as $t_1$ so $S(t_1) = 1$ if that is convenient), and

$$U'(t) = S^{-1}(t)\, U(t)\, S(t) \tag{23}$$

(operating in positional order). If we take matrix
elements not between states $\psi_1$ and $\psi_2$ but between
$\psi_1' = S^{-1}(t_1)\psi_1$ and $\psi_2' = S^{-1}(t_2)\psi_2$, we may call the
$\Omega$-matrix $\Omega'$ and omit the $S(t_2)$ and $S^{-1}(t_1)$ factors in
Eq. (22). These new time-dependent states $\psi'$ are
evidently states that would give rise to $\psi_1$ at $t_1$ and $\psi_2$
at $t_2$ (from some fixed reference time $a$) if the perturbation
were not acting. Then the time-dependent perturbation
theory simply comes to evaluating

$$\Omega'(t_2, t_1) = \exp\left(-i\int_{t_1}^{t_2} U'(t)\, dt\right). \tag{24}$$

Expansion in power series, substitution of $U'$ from Eq.
(23), and use of the relation $\Omega^{(0)}(t', t'') = S(t')\,S^{-1}(t'')$
leads immediately to the formulas (9) and (10), so that
Eqs. (23) and (24) give the simplest form to the time-dependent
perturbation theory. Of course, the same
results may be obtained by a unitary transformation in
the conventional way. Ordinarily, result (24) is not
written in this way, for it involves the time convention
on the ordering of the operators. (It is usually expressed
as a differential equation for $\Omega'$.) If the perturbation $U$
represents an interaction between some systems described
by $H^{(0)}$, the reduction of Eq. (8) to Eq. (24)
is called passing to the interaction representation.


5. SYSTEM COUPLED TO AN HARMONIC OSCILLATOR 


As a further example of the use of the notation we 
solve completely the problem of a particle or system of 
particles coupled linearly to an harmonic oscillator. 
This problem in greater generality is the main problem 
of quantum electrodynamics. It has been thoroughly 
studied in III, but we solve it again as an illustration 
of the new notation. Let the hamiltonian of the combined 
system be 

$$H = H_p(t) + H_{\rm osc} - \Gamma(t)\,q,$$

where $H_{\rm osc}$ is the hamiltonian of the oscillator alone, 

$$H_{\rm osc} = \tfrac{1}{2}(p^2 + \omega^2 q^2),$$

where $p$ is the momentum conjugate to $q$, the coordinate 
of the oscillator. Further, $H_p$, which may depend explicitly 
on time, is the hamiltonian of the particles, and 
$\Gamma$ may contain any operators pertaining to the particles 
as well as possibly being an explicit function of the 
time. We ask for the matrix element for finding the 
particles in state $\chi_{t''}$ and the oscillator in some eigenstate 
$m$ at time $t''$, if at a previous time $t'$ the particles 
are in state $\chi_{t'}$ and the oscillator in its $n$th eigenstate. 
It is the matrix element 

$$\left(\chi_{t''}\varphi_m\left|\,\exp\left\{-i\int_{t'}^{t''}\left[H_p(t)+H_{\rm osc}(t)-\Gamma(t)q(t)\right]dt\right\}\right|\chi_{t'}\varphi_n\right)$$

using the time ordering convention. As already discussed 
in Sec. 2, this can be considered as the matrix 
element between states $\chi_{t'}$ and $\chi_{t''}$ of the matrix 

$$M = \exp\left(-i\int_{t'}^{t''} H_p(t)\,dt\right)G_{mn}, \tag{25}$$

where $G_{mn}$ (the analog of $T$ of Sec. 2), a functional of 
$\Gamma(t)$, serves to define the net effect on the particles of 
their interaction with the oscillator. Calculation of $G_{mn}$ 
means evaluating 

$$G_{mn} = \left(\varphi_m\left|\,\exp\left[-i\int_{t'}^{t''} H_{\rm osc}(t)\,dt\right]\exp\left[i\int_{t'}^{t''}\Gamma(t)q(t)\,dt\right]\right|\varphi_n\right) \tag{26}$$

in a general way as a functional of $\Gamma(t)$. We are to 
consider $\Gamma(t)$ here as a simple numerical function, and 
only later utilize the fact that it is an operator involving 
the particles when we go to evaluate the matrix $M$, 
Eq. (25), between the particle states $\chi_{t'}$ and $\chi_{t''}$. The 
evaluation of $G_{mn}$ for an arbitrary numerical function 
$\Gamma(t)$ may be performed in a variety of ways. One is by 
the lagrangian form of quantum mechanics given 
explicitly in III, Sec. 3, with the sole difference (which 
is unessential for this part of the problem) that there 
$\Gamma(t)$ was called $\gamma(t)$ and was a functional of the coordinates 
$x(t)$ of the particles, while here we see we are in 
a more general position as $\Gamma(t)$ may be a functional of 
any ordered operators referring to the particles. We 
find, for example [III, Eq. (14)], 

$$G_{00} = \exp\left\{-\frac{1}{2\omega}\int_{t'}^{t''}\!\!\int_{t'}^{t} \exp[-i\omega(t-s)]\,\Gamma(t)\Gamma(s)\,ds\,dt\right\}. \tag{27}$$


The same result may also be obtained by a direct solution 
of the Schrödinger equation for the forced oscillator. 
The great advantage of the operator notation is to allow 
this formal solution for an oscillator forced by an 
arbitrary potential function $\Gamma(t)$ to be equally useful 
when the oscillator is actually in interaction with a 
quantum-mechanical system! 
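
As a numerical illustration of Eq. (27), the following sketch evaluates the exponent of $G_{00}$ for a given numerical forcing function by a midpoint rule. The function name `g00_exponent`, the step count, and the constant test forcing $\Gamma(t)=1$ are assumptions made for this illustration only; they do not appear in the original text.

```python
import cmath


def g00_exponent(gamma, omega, t_start, t_end, steps=2000):
    """Midpoint-rule estimate of the exponent in Eq. (27):
    -(1/(2*omega)) * int_{t'}^{t''} dt int_{t'}^{t} ds
        exp(-i*omega*(t - s)) * gamma(t) * gamma(s).
    `gamma` is an ordinary numerical function of time (illustrative)."""
    h = (t_end - t_start) / steps
    inner = 0j  # running integral of gamma(s) * exp(+i*omega*s) over full cells
    outer = 0j
    for k in range(steps):
        t = t_start + (k + 0.5) * h
        cell = gamma(t) * cmath.exp(1j * omega * t) * h
        inner_at_t = inner + 0.5 * cell  # inner integral up to the cell midpoint
        outer += gamma(t) * cmath.exp(-1j * omega * t) * inner_at_t * h
        inner += cell
    return -outer / (2.0 * omega)


# Constant forcing, omega = 1, on the interval [0, 2]
numeric = g00_exponent(lambda t: 1.0, 1.0, 0.0, 2.0)
closed_form = -0.5 * ((1 - cmath.exp(-2j)) - 2j)
print(abs(numeric - closed_form))
```

For constant forcing the double integral can be done by hand, which gives the closed form used for comparison above; $G_{00}$ itself is the exponential of the returned value.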

Thus, we have the answer for $G_{mn}$ in III, Eq. (57), 
using $\Gamma$ for $\gamma$. It is, however, interesting to see how this 
expression for $G_{mn}$ could be worked out directly using 
the methods of the ordered operator calculus. We want 
to disentangle the operator 

$$G = \exp\left(-i\int_{t'}^{t''} H_{\rm osc}\,dt\right)\exp\left(i\int_{t'}^{t''}\Gamma(t)\,q(t)\,dt\right). \tag{26'}$$

Let us call, in the usual way, $Q^* = (\tfrac12\omega)^{1/2}(q - i\omega^{-1}p)$ and 
$Q = (\tfrac12\omega)^{1/2}(q + i\omega^{-1}p)$ the creation and annihilation 
operators. They satisfy the commutation relation 

$$QQ^* - Q^*Q = 1. \tag{28}$$

In terms of them $H_{\rm osc} = \tfrac12\omega(Q^*Q + QQ^*)$ and 

$$q = (2\omega)^{-1/2}(Q + Q^*). \tag{29}$$

Now as a first step we pass to the interaction representation 
(Sec. 4). We use the theorem (20) with $P_s = -iH_{\rm osc}$, 
$\alpha(s)=1$, to disentangle the $\exp[-i\int H_{\rm osc}(t)\,dt]$ factor, 
obtaining 

$$G = S(t'')\exp\left\{i(2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,[Q'(t)+Q'^*(t)]\,dt\right\}S^{-1}(t'),$$

where $Q'(t) = S^{-1}(t)\,Q\,S(t)$, $Q'^*(t) = S^{-1}(t)\,Q^*S(t)$, and 
$S(t) = \exp(-itH_{\rm osc})$. By redefining the wave functions 
so they contain $S(t)$, or, for eigenstates, the energy 
factors $\exp(-iE_n t)$ for the free oscillator, we can 
eliminate the $S(t'')$ and $S^{-1}(t')$ and need merely calculate 
the matrix element of 

$$G' = \exp\left\{i(2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,[Q'(t)+Q'^*(t)]\,dt\right\}.$$

From Eq. (28) we readily calculate that⁷ 

$$Q'(t) = Qe^{-i\omega t} \quad\text{and}\quad Q'^*(t) = Q^*e^{+i\omega t}, \tag{30}$$

so that the problem becomes the disentanglement of 

$$G' = \exp\left[i(2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,e^{+i\omega t}\,Q_t^*\,dt\right]\exp\left[i(2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,e^{-i\omega t}\,Q_t\,dt\right].$$

We shall find it most convenient to disentangle this into 
a form in which all the annihilation operators operate 
first, and then come the creation operators (since the 
$n$th state cannot suffer more than $n$ annihilation 
operators, $Q^{n+1}\varphi_n$ vanishing, the expression will be 
easy to evaluate in this form). To this end let us use 
theorem (19) again, this time with $P=Q^*$, 

$$\alpha(s) = i(2\omega)^{-1/2}\,\Gamma(s)\,e^{+i\omega s}.$$

Calling, temporarily, 

$$A(t) = i(2\omega)^{-1/2}\int_{t'}^{t}\Gamma(s)\,e^{+i\omega s}\,ds,$$

we find 

$$G' = \exp[A(t'')\,Q^*_{t''}]\,\exp\left[i(2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,e^{-i\omega t}\,Q''_t\,dt\right],$$

where 

$$Q''(t) = \exp[-A(t)Q^*]\;Q\;\exp[+A(t)Q^*]. \tag{31}$$

The commutation relation (28) here gives⁸ 

$$Q''(t) = Q_t + A(t), \tag{32}$$

so that 

$$G' = \exp[A(t'')\,Q^*_{t''}]\,\exp\left\{i(2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,e^{-i\omega t}\,[Q_t + A(t)]\,dt\right\}.$$

In the last factor $Q_t$ can be replaced by $Q_{t'}$ since $t$ is in 
any case less than $t''$, so that all the $Q_t$'s come before 
the $Q^*_{t''}$, and $Q_t$ need not be ordered relative to itself 
as it is a constant operator. Hence, we may, with a 
slight rewriting (for example, $i\beta^*$ for $A(t'')$), write 

$$G' = \exp(i\beta^*Q^*_{t''})\,\exp(i\beta Q_{t'})\,G_{00} \tag{33}$$


7 For, $Q'(t)=\exp(+itH_{\rm osc})\,Q\exp(-itH_{\rm osc})$ implies $dQ'/dt = iS^{-1}(t)(H_{\rm osc}Q - QH_{\rm osc})S(t) = -i\omega Q'(t)$, since $H_{\rm osc}Q - QH_{\rm osc} = -\omega Q$ by 
Eq. (28). Thus, since $Q'(0)=Q$, one obtains $Q'(t)=Qe^{-i\omega t}$. 

8 For, differentiation of the expression for $Q''(t)$ gives $dQ''(t)/dt = -A'(t)\exp(-A(t)Q^*)(Q^*Q - QQ^*)\exp(+A(t)Q^*) = A'(t)$ by Eq. 
(28), so integration gives $Q''(t) = A(t) + Q$, since $Q''(t') = Q$, inasmuch 
as $A(t')=0$. 

with $G_{00}$ equal to $\exp\left[i(2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,e^{-i\omega t}A(t)\,dt\right]$ and 
therefore identical to Eq. (27), and with 

$$\beta = (2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,e^{-i\omega t}\,dt, \qquad \beta^* = (2\omega)^{-1/2}\int_{t'}^{t''}\Gamma(t)\,e^{+i\omega t}\,dt, \tag{34}$$

just as in III, Eq. (58). The operator $G'$ is now completely 
disentangled. Its matrix element between $n$ and 
$m$ we call $G_{mn}$. The matrix element may be evaluated 
by ordinary methods, since the $t'$ and $t''$ in $Q$ and $Q^*$, 
respectively, in Eq. (33) are unnecessary if the positional 
notation is used. That the element for $n=0$, 
$m=0$ is just what we call $G_{00}$ is evident, for if $\exp(i\beta Q)$ 
be expanded as $1 + i\beta Q - \tfrac12\beta^2Q^2 + \cdots$ and the result applied 
to $\varphi_0$, all the terms beyond the first give zero for 
$Q\varphi_0 = 0$. Thus, this exponential may effectively be 
replaced by unity. Likewise, the second can be replaced 
by unity for $\varphi_0^*Q^* = 0$. 

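The case of more general values of $m$, $n$, may be 
worked out by writing 

$$\varphi_n = (n!)^{-1/2}\,Q^{*n}\varphi_0, \tag{35}$$

so that 

$$G_{mn} = \left(\varphi_0\left|\,(m!)^{-1/2}(n!)^{-1/2}\,Q^m\,e^{i\beta^*Q^*}e^{i\beta Q}\,Q^{*n}\right|\varphi_0\right)G_{00}. \tag{36}$$

Then, since $e^{i\beta Q}Q^* = (Q^*+i\beta)\,e^{i\beta Q}$ (as in Eqs. (31), (32)), 
repetition $n$ times gives $e^{i\beta Q}Q^{*n} = (Q^*+i\beta)^n e^{i\beta Q}$, and 
likewise $Q^m e^{i\beta^*Q^*} = e^{i\beta^*Q^*}(Q+i\beta^*)^m$. We find 

$$G_{mn} = \left(\varphi_0\left|\,(m!)^{-1/2}(n!)^{-1/2}\,e^{i\beta^*Q^*}(Q+i\beta^*)^m(Q^*+i\beta)^n e^{i\beta Q}\right|\varphi_0\right)G_{00}. \tag{37}$$

The exponentials may now be replaced by unity as 
previously discussed. The other factors expanded by 
the binomial theorem give 

$$G_{mn} = (m!)^{-1/2}(n!)^{-1/2}\sum_{r}\sum_{s}\binom{m}{r}\binom{n}{s}(i\beta^*)^{m-r}(i\beta)^{n-s}\left(\varphi_0\left|Q^rQ^{*s}\right|\varphi_0\right)G_{00}.$$

The next to last factor by Eq. (35) is 

$$\left((r!)^{1/2}\varphi_r\,\big|\,(s!)^{1/2}\varphi_s\right) = (r!)^{1/2}(s!)^{1/2}\,\delta_{rs},$$

so that finally 

$$G_{mn} = (m!\,n!)^{-1/2}\sum_{r}\frac{m!}{(m-r)!\,r!}\,\frac{n!}{(n-r)!\,r!}\;r!\,(i\beta^*)^{m-r}(i\beta)^{n-r}\,G_{00} \tag{38}$$

as in III, Eq. (57). 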
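
The sum in Eq. (38) is finite and easy to evaluate numerically. The sketch below is an illustration only: the function name `g_mn` and the sample values are assumptions, and it takes the case in which $\Gamma(t)$ is a real numerical function, so that $\beta^*$ is the complex conjugate of $\beta$ and $|G_{00}|^2 = \exp(-|\beta|^2)$. Under that assumption the transition probabilities $|G_{mn}|^2$ out of a fixed initial level $n$ should sum to one.

```python
import math


def g_mn(m, n, beta, beta_star, g00):
    """Evaluate the finite sum of Eq. (38) for oscillator levels m and n."""
    total = 0j
    for r in range(min(m, n) + 1):
        # binom(m, r) * binom(n, r) * r! times the two powers in Eq. (38)
        total += (math.comb(m, r) * math.comb(n, r) * math.factorial(r)
                  * (1j * beta_star) ** (m - r) * (1j * beta) ** (n - r))
    return total * g00 / math.sqrt(math.factorial(m) * math.factorial(n))


# Illustrative values: real forcing, so beta_star = conj(beta).
beta = 0.7 + 0.4j
g00 = math.exp(-abs(beta) ** 2 / 2)  # modulus of G_00; its phase does not affect probabilities
probabilities = [abs(g_mn(m, 2, beta, beta.conjugate(), g00)) ** 2 for m in range(60)]
print(sum(probabilities))
```

For $n=0$ only the $r=0$ term survives and the probabilities reduce to a Poisson distribution in $m$ with mean $|\beta|^2$.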

Having this form for the behavior of a system of 
particles interacting with a single oscillator, we could 
go on and discuss the quantum electromagnetic field as 
a set of such oscillators. It is evident that to do so would 
be simply to repeat the steps described in III, Sec. 4, 
using $\Gamma$ for $\gamma$ and reinterpreting the symbols as ordered 
operators rather than as amplitudes associated with a 
path. The result in general is Eq. (43) below [in agreement 
with III, Eq. (48)], and there is no need to go 
into the details again of summing the effects of all the 
oscillators to obtain this result. We will pass directly 
to a discussion of the complete electromagnetic field. 


6. QUANTUM ELECTRODYNAMICS 


There are available several equivalent formulations 
of quantum electrodynamics.'** We shall give a very 
brief outline of their interrelationships using the ordered 
operator notation. We can start with the usual formalism 
of Heisenberg, Pauli, and Dirac.⁹ The wave 
function of the system, consisting of the electron-positron 
field and of the electromagnetic field in interaction, 
satisfies a wave equation $i\,\partial\psi/\partial t = H\psi$, where the 
hamiltonian for the system may be written $H = H_m + H_f + H_i$, 
where $H_m$ is that of the electron-positron field free 
of potentials, $H_f$ is that of the electromagnetic field in 
empty space, and $H_i$ represents the interaction of the 
two fields. The problem is to obtain the wave function 
at time $t_2$ in terms of its value at a previous time $t_1$. It is 
therefore a study of the operator 

$$\exp\left\{-i\int_{t_1}^{t_2}\left[H_m(t)+H_f(t)+H_i(t)\right]dt\right\}.$$

We can simplify this by first disentangling the exponential 
factor 

$$\exp\left\{-i\int_{t_1}^{t_2}\left[H_m(t)+H_f(t)\right]dt\right\}.$$

That is, we go directly to the interaction representation, 
and find that we must analyze 

$$\exp\left[-i\int_{t_1}^{t_2} H_i'(t)\,dt\right].$$

We shall always use the interaction representation and 
shall omit the prime here for simplicity of notation. 
Furthermore, it will be sufficient for our purpose to 
consider only the case $t_1\to-\infty$ and $t_2\to+\infty$, so that 
quantum electrodynamics is a study of the operator¹⁰ 

$$S = \exp\left\{-i\int_{-\infty}^{\infty}\!\!\int j_\mu(\mathbf{x},t)\,A_\mu(\mathbf{x},t)\,d^3\mathbf{x}\,dt\right\}, \tag{39}$$

where the $A_\mu(1)$ is the operator potential of the electromagnetic 
field and $j_\mu(1)$ is the operator current of the 

9 See, for example, P. A. M. Dirac, The Principles of Quantum 
Mechanics (The Clarendon Press, Oxford, 1947), third edition, 
Chapter 12. 

10 For systems with particles of spin zero or one, $S$ may be written 
in this same form by use of the Kemmer-Duffin matrices $\beta_\mu$, as is 
shown by C. N. Yang and D. Feldman, Phys. Rev. 79, 972 (1950), 
for example, Eq. (33). Thus, all of these results given here for the 
Dirac field are equally correct for spin zero or one if $\gamma_\mu$ is replaced 
by $\beta_\mu$. See also M. Neuman and W. H. Furry, Phys. Rev. 76, 
1677 (1949), and R. Moorhouse, Phys. Rev. 76, 1691 (1949). 

Dirac electron-positron field. They, of course, commute 
with each other, since they refer to different systems. 
Further,¹¹ $A_\mu(1)$, $A_\nu(2)$ commute if 1 and 2 are separated 
by a spacelike interval, as do $j_\mu(1)$, $j_\nu(2)$. In the expression 
for $S$ the operators are ordered in accordance 
with the time $t$. 

We may thus define the problem of quantum electro- 
dynamics as a study of the operator S. Let us imagine 
for purposes of discussion that we disregard the deriva- 
tion of S given in the preceding paragraph. We imagine 
the problem is given directly as the analysis of the 
operator S defined in Eq. (39) (assuming the commuta- 
tion rules, reference 11). Let us see how the various 
formalisms are simply different ways of expressing or 
analyzing S. 

First, we might try to define S in some way which 
would not require the use of the ordering notation. 
Suppose we split the range of integration of $t$ into two 
regions $-\infty$ to $\tau$ and $\tau$ to $\infty$. Then the integral may 
be split into two parts. We can write the factors as 

$$S = \exp\left\{-i\int_{\tau}^{\infty}\!\!\int j_\mu(\mathbf{x},t)A_\mu(\mathbf{x},t)\,d^3\mathbf{x}\,dt\right\}\exp\left\{-i\int_{-\infty}^{\tau}\!\!\int j_\mu(\mathbf{x}',t')A_\mu(\mathbf{x}',t')\,d^3\mathbf{x}'\,dt'\right\}.$$

Now, since $t' < t$, all the operators on the last factor act 
before those of the first factor, so they are disentangled 
relative to the first factor. Hence, we are led to define 
an operator function of $\tau$, 

$$\Omega(\tau) = \exp\left\{-i\int_{-\infty}^{\tau}\!\!\int j_\mu(\mathbf{x},t)A_\mu(\mathbf{x},t)\,d^3\mathbf{x}\,dt\right\}.$$

If $\tau$ is changed to $\tau+\Delta\tau$, an additional factor appears 
operating in front of all other $t<\tau$, namely, 
$\exp[-i\,\Delta\tau\int j_\mu(\mathbf{x},\tau)A_\mu(\mathbf{x},\tau)\,d^3\mathbf{x}]$. Hence, $\Omega(\tau)$ satisfies 
the differential equation 

$$i\,d\Omega/d\tau = \left[\int j_\mu(\mathbf{x},\tau)A_\mu(\mathbf{x},\tau)\,d^3\mathbf{x}\right]\Omega(\tau), \tag{40}$$

the operators operating in positional order. 

Thus, we are led to a differential equation, the 
solution of which can be used to define $S$ (for $S$ is 
$\Omega(\tau)$ as $\tau\to+\infty$ when $\Omega(\tau)$ is that solution of Eq. (40) 
which $\to 1$ as $\tau\to-\infty$). If we define $\psi(-\infty)$ as an initial 
state wave function, clearly, $\psi(\tau)=\Omega(\tau)\psi(-\infty)$ satisfies 
the same equation as $\Omega$. This is the Schrödinger equation 
in the usual formulation if written in interaction representation. 
(We probably would not be led to go back 

11 See, for example, J. Schwinger, Phys. Rev. 74, 1439 (1948). 
In his notation (except that we put a factor $e$ in $A_\mu$ rather than 
$j_\mu$), the commutation relations are [his Eqs. (2.28) and (2.29)] 

$$[A_\mu(x), A_\nu(x')] = 4\pi e^2 i\,\delta_{\mu\nu}D(x-x')$$

and $j_\mu(x)=\bar\psi(x)\gamma_\mu\psi(x)$ with $\{\psi_\alpha(x), \bar\psi_\beta(x')\} = -iS_{\alpha\beta}(x-x')$ if no 
external potential is acting. Other combinations commute. 

to the ordinary representation as this is an unnecessary 
increase in complexity.) 

The apparent lack of covariance implied by using 
time to define the differential equation can be remedied 
by analyzing $S$ in a slightly different manner, suggested 
by Tomonaga and by Schwinger.¹ 

The variables $\mathbf{x}$, $t$ over which one integrates in Eq. 
(39) may be divided into two groups in another way; 
those previous to and those following an arbitrary spacelike 
surface $\sigma$: 

$$S = \exp\left\{-i\int_{b} j_\mu(\mathbf{x},t)A_\mu(\mathbf{x},t)\,d^3\mathbf{x}\,dt\right\}\exp\left\{-i\int_{a} j_\mu(\mathbf{x}',t')A_\mu(\mathbf{x}',t')\,d^3\mathbf{x}'\,dt'\right\},$$

where the region $a$ of integration of the second factor 
are those points of space-time previous to $\sigma$, while $b$ 
are those following $\sigma$. Now again the factors are disentangled. 
It might at first be argued that since there are 
some values of $t'$ greater than $t$, the corresponding 
operators in $a$ should follow, not precede, those in $b$. 
But for those $t'$ which exceed $t$, the points $\mathbf{x}$, $\mathbf{x}'$ are 
separated by a spacelike interval (as $\sigma$ is a spacelike 
surface); hence the order of the $A_\mu(\mathbf{x},t)$ and $A_\nu(\mathbf{x}',t')$ 
as well as of $j_\mu(\mathbf{x},t)$ and $j_\nu(\mathbf{x}',t')$ is irrelevant, as these 
commute. Hence, the operators are, in fact, disentangled; 
and we can define 

$$\Omega(\sigma) = \exp\left\{-i\int_{a} j_\mu(\mathbf{x},t)A_\mu(\mathbf{x},t)\,d^3\mathbf{x}\,dt\right\}$$

as an operator defined as a functional of the surface $\sigma$. 
A small change in surface at $\mathbf{x}$, $t$ changes the operator by 

$$\delta\Omega(\sigma)/\delta\sigma(\mathbf{x},t) = -i\,j_\mu(\mathbf{x},t)A_\mu(\mathbf{x},t)\,\Omega(\sigma), \tag{41}$$

the equation of Schwinger¹² for $\Omega(\sigma)$ (and also for $\psi(\sigma)$ 
defined by $\Omega(\sigma)\psi(-\infty)$). Again, $S$ is $\Omega(\sigma)$ as the surface 
$\sigma$ is removed to $+\infty$. 

These differential equations (40) or (41) are therefore 
needed to define the operator $S$ if one is limited to conventional 
notation. The form (41) has the advantage 
of putting the relativistic invariance more into evidence. 
However, the solution (39) is common to both and is 
more easily used. It is likewise evidently invariant if we 
write it 

$$S = \exp\left[-i\int j_\mu(1)A_\mu(1)\,d\tau_1\right] \tag{42}$$

(with the point 1 representing $\mathbf{x}_1$, $t_1$ and $d\tau_1 = d^3\mathbf{x}_1\,dt_1$) 
and assume the convention here that if two operators in 
Eq. (42) correspond to points separated by either a timelike 
or a zero interval, that operates first which corresponds 

12 J. Schwinger, Phys. Rev. 74, 1439 (1948). 

to the earlier time. If they are separated by a spacelike 
interval, no definition is necessary, for they commute. 

The other developments consist in methods of 
actually evaluating Eq. (42), given the commutation 
relations¹¹ of the $A_\mu(1)$. The method explained by 
Dyson¹ consists of making a power series expansion of 
$S$ and disentangling it term by term. For example, the 
second-order term is 

$$-\tfrac12\int j_\mu(1)A_\mu(1)\,d\tau_1\int j_\nu(2)A_\nu(2)\,d\tau_2.$$

This term may then be analyzed into the conventional 
notation by reordering the operators. In this example 
it is necessary merely to break the region of integration 
in $t_2$ up into two, $t_2<t_1$ and $t_2>t_1$. Actually, because of 
the symmetry they give equal contributions, so that 
the result is 

$$-\int_{-\infty}^{\infty}\!\!\int j_\mu(1)A_\mu(1)\,d^3\mathbf{x}_1\,dt_1\int_{-\infty}^{t_1}\!\!\int j_\nu(2)A_\nu(2)\,d^3\mathbf{x}_2\,dt_2,$$

the ordering now being conventional. From here the 
matrix elements are computed between given states by 
use of the commutation relations (46) below. For further 
details we refer to Dyson's papers.² The result is that 
given by the rules of II. 
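
The factor of two gained in passing from the time-ordered second-order term to conventional ordering can be written out explicitly. In the sketch below, $V(t)$ is a shorthand introduced only for this illustration, standing for $\int j_\mu(\mathbf{x},t)A_\mu(\mathbf{x},t)\,d^3\mathbf{x}$; the left side is read with the time-ordering convention, the right sides in positional order.

```latex
\begin{aligned}
-\tfrac12 \int\!\!\int V(t_1)\,V(t_2)\,dt_1\,dt_2
 &= -\tfrac12\left[\int_{-\infty}^{\infty}\! dt_1 \int_{-\infty}^{t_1}\! dt_2\; V(t_1)V(t_2)
   + \int_{-\infty}^{\infty}\! dt_2 \int_{-\infty}^{t_2}\! dt_1\; V(t_2)V(t_1)\right] \\
 &= -\int_{-\infty}^{\infty}\! dt_1 \int_{-\infty}^{t_1}\! dt_2\; V(t_1)\,V(t_2).
\end{aligned}
```

In each region the later operator has been written to the left, and the second region becomes identical to the first on relabeling $1 \leftrightarrow 2$.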

Another method is to notice that the entire dependence 
of $S$ on $A_\mu$ can be directly evaluated. As far 
as the states of the field are concerned, the evaluation 
of matrix elements of $S$ is exactly the same as though 
$j_\mu(1)$ were a numerical function (since it commutes with 
all $A_\mu(1)$). Hence, these may be worked out by first 
obtaining the result for a field interacting with a given 
unquantized current distribution $j_\mu(1)$. This can be 
done, for example, by using the lagrangian methods 
described in III. For example, the matrix taken between 
states in which the field is empty of photons initially 
and finally is 

$$S_{00} = \exp\left[-\tfrac12 ie^2\int\!\!\int j_\mu(1)\,j_\mu(2)\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2\right] \tag{43}$$

as is shown in III (for $j_\mu$ a numerical function). This 
may now be interpreted as follows: The matrix element 
of $S$ for a transition in which at $t=-\infty$ there are no 
real photons and the matter is in state $\chi_-$, to the state 
at $+\infty$ also empty of photons with the matter in state 
$\chi_+$, is the matrix element of $S_{00}$ between $\chi_-$ and $\chi_+$, 
where $S_{00}$, given in Eq. (43), operates now only on 
matter variables, the order of operators $j_\mu(1)$, $j_\mu(2)$ 
being determined just as in Eq. (42). This expression 
forms the basis for the author's treatment of virtual 
photon processes (II). 

If an additional unquantized potential $B_\mu(1)$ is 
present, the expression (42) for $S$ is altered just by the 
replacement of $A_\mu(1)$ by $A_\mu(1)+B_\mu(1)$. 

The matrix corresponding to Eq. (43) would be a 
functional of $B_\mu(1)$ and a function of $e^2$: 

$$S_{e^2}[B] = \exp\left[-\tfrac12 ie^2\int\!\!\int j_\mu(1)\,j_\mu(2)\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2\right]\exp\left[-i\int j_\mu(1)B_\mu(1)\,d\tau_1\right]. \tag{44}$$

It is evident by direct substitution that $S_{e^2}[B]$ satisfies 

$$\frac{dS_{e^2}}{d(e^2)} = \tfrac12 i\int\!\!\int\frac{\delta^2 S_{e^2}}{\delta B_\mu(1)\,\delta B_\mu(2)}\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2. \tag{45}$$

Since the equation is linear, any matrix element of $S_{e^2}$, 
say $T_{e^2}[B]$, between two states of the matter satisfies the 
same equation. This is Eq. (45) of III, which is shown 
in III to be a general statement of the rules given in II 
for solving electrodynamic problems. Evidently, the 
case of real photons in initial or final state can be carried 
through in parallel to the discussion in III, with $j_\mu(1)$ 
now as an operator. 
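
The direct substitution leading from Eq. (44) to Eq. (45) can be sketched as follows. Because all operators are ordered by the time convention, the currents may be manipulated here as if they were commuting symbols; this is a sketch under that convention, not an independent derivation.

```latex
\begin{aligned}
\frac{\delta S_{e^2}}{\delta B_\mu(1)} &= -i\,j_\mu(1)\,S_{e^2},
\qquad
\frac{\delta^2 S_{e^2}}{\delta B_\mu(1)\,\delta B_\mu(2)} = -\,j_\mu(1)\,j_\mu(2)\,S_{e^2}, \\
\tfrac12 i \int\!\!\int \frac{\delta^2 S_{e^2}}{\delta B_\mu(1)\,\delta B_\mu(2)}\,
   \delta_+(s_{12}^2)\,d\tau_1\,d\tau_2
 &= -\tfrac12 i \int\!\!\int j_\mu(1)\,j_\mu(2)\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2\; S_{e^2}
 = \frac{dS_{e^2}}{d(e^2)} .
\end{aligned}
```

The last equality is just the derivative of the first exponential in Eq. (44) with respect to $e^2$.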

This completes our discussion in a general way of the 
relations between the various representations of electro- 
dynamics. However, we wish to add a word concerning 
the derivation of Eq. (43). We have indicated how this 
may be done using the lagrangian method. However, 
we have seen from our example with the single forced 
oscillator that the same results may be obtained directly 
with the operator method, in just as simple a manner. 
Of course, by considering the field as a set of such oscil- 
lators we will arrive at Eq. (43), thus completely avoid- 
ing the lagrangian formulation. However, since the 
relation between Eqs. (42) and (43) is so fundamental, 
we should like to show how the operator method 
permits a simple direct passage from Eq. (42) to Eq. 
(43).¹³ (We are simply following the steps leading from 
Eq. (26') to Eq. (33) for the single harmonic oscillator, 
but are using $A_\mu(1)$ to replace $q$.) 

The field operator $A_\mu(1)$ can be split into two parts 
$A_\mu(1)=A_\mu^+(1)+A_\mu^-(1)$, where the first $A_\mu^+(1)$ annihilates 
photons, and the second $A_\mu^-(1)$ creates them.¹⁴ 
They satisfy the commutation relations (positional 


13 We omit the usual extra complications in all such demonstrations 
concerned with showing that disregard of the supplementary 
conditions on $\partial A_\mu/\partial x_\mu$ is legitimate. 

14 Ordinarily, the field operator $A_\mu(1)$ is expanded into modes 
$A_\mu(1) = \sum_i A_{\mu i}(\mathbf{x})\,(Q_i^*e^{+i\omega_i t} + Q_i e^{-i\omega_i t})$, where $A_{\mu i}$ is the numerical 
function [for example, cosines or sines, III Eq. (1)] describing the 
classical mode $i$ of frequency $\omega_i$ and $Q_i^*$, $Q_i$ are the creation and 
annihilation operators into which $q_i$, the coordinate of the oscillator 
of this mode, has been split (29). The factors $e^{\pm i\omega_i t}$ result 
from use of interaction representation (30). Then, we have 
$A_\mu^-(1) = \sum_i A_{\mu i}(\mathbf{x})\,Q_i^*e^{+i\omega_i t}$ and $A_\mu^+(1) = \sum_i A_{\mu i}(\mathbf{x})\,Q_i e^{-i\omega_i t}$. The 
commutation rule (46) then results from that of the $Q$ and $Q^*$ (28). 
Using the representation of III Eq. (1), the right-hand side 
of Eq. (46) comes from Eq. (28) directly in the form 


(29) 75S exp[ — 7k(2—4) ] cos(K- x1 — K- x2)d°K/k, 


which on integration for $t_2>t_1$ [see III Eq. (22)] is Eq. (46). The 
separation has been accomplished directly in coordinate space by 
J. Schwinger, Phys. Rev. 75, 651 (1949). 

ordering) 

$$A_\mu^-(1)A_\nu^+(2) - A_\nu^+(2)A_\mu^-(1) = -ie^2\delta_{\mu\nu}\,\delta_+(s_{12}^2) \tag{46}$$

for¹⁵ $t_1<t_2$. We have set $s_{12}^2 = (x_{1\mu}-x_{2\mu})(x_{1\mu}-x_{2\mu})$. 
On the basis of this commutation rule, we are to 
disentangle the operator 

$$S = \exp\left[-i\int j_\mu(1)A^-_{\mu t_1}(1)\,d\tau_1\right]\exp\left[-i\int j_\mu(2)A^+_{\mu t_2}(2)\,d\tau_2\right], \tag{42'}$$

where for definiteness we indicate the time of operation 
by the subscript. This is already of the exponential form 
of theorem (18), using $-i\int j_\mu(1)A^-_{\mu t_1}(1)\,d^3\mathbf{x}_1$ as $P(s)$, 
$s=t_1$. Hence, the result is 

$$S = \exp\left[-i\int j_\mu(1)A^-_{\mu,+\infty}(1)\,d\tau_1\right]\exp\left[-i\int j_\mu(2)A''^{+}_{\mu t_2}(2)\,d\tau_2\right],$$

where 

$$A''^{+}_{\mu t_2}(2) = \exp\left[+i\int^{t_2} j_\nu(1)A^-_\nu(1)\,d\tau_1\right]A^+_\mu(2)\,\exp\left[-i\int^{t_2} j_\nu(1)A^-_\nu(1)\,d\tau_1\right], \tag{47}$$

where in Eq. (47) we suppress the ordering rules for $A^+$, 
$A^-$ and use instead positional notation (but maintain 
the rules for $j_\mu$). 

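The commutation rule (46) permits Eq. (47) to be 
written¹⁶ 

$$A''^{+}_{\mu t_2}(2) = A^+_{\mu t_2}(2) + \int^{t_2} e^2\,\delta_+(s_{12}^2)\,j_\mu(1)\,d\tau_1. \tag{48}$$

Hence, we have 

$$S = \exp\left[-i\int j_\mu(1)A^-_{\mu,+\infty}(1)\,d\tau_1\right]\exp\left[-\tfrac12 ie^2\int\!\!\int j_\mu(1)\,j_\mu(2)\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2\right]\exp\left[-i\int j_\mu(2)A^+_{\mu,-\infty}(2)\,d\tau_2\right], \tag{49}$$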


15 This restriction at first sight looks unrelativistic. For $t_1>t_2$ we 
would have the complex conjugate of $-ie^2\delta_{\mu\nu}\,\delta_+(s_{12}^2)$, but $-i\delta_+(s_{12}^2)$ 
is real in spacelike regions (as $\delta_+(x) = \delta(x) - i\pi^{-1}x^{-1}$). 

16 For, if $A''^{+}_{\mu t_2}(2)$ of (47) is considered as a functional of $j_\mu$, its 
first variation with respect to $j_\nu(3)$ is ($t_3<t_2$) 

$$i\exp\left(+i\int^{t_2} j_\sigma A^-_\sigma\,d\tau\right)\left[A^-_\nu(3)A^+_\mu(2) - A^+_\mu(2)A^-_\nu(3)\right]\exp\left(-i\int^{t_2} j_\sigma A^-_\sigma\,d\tau\right) = e^2\delta_{\mu\nu}\,\delta_+(s_{32}^2)$$

by Eq. (46). The first variation of expression (48) gives the same 
result, so that Eq. (48) is correct for all $j_\mu$, since it obviously is 
correct for $j_\mu=0$. 

the $A$ operators being entirely disentangled (the $j$'s are 
still entangled). The ordering index $t_2$ on $A^+_{\mu t_2}$ has been 
changed to $-\infty$ in Eq. (49), since all the $A^+$ commute 
and act before $A^-_{\mu,+\infty}$, so that no ordering is necessary. 

Taken between states empty of photons the result is 
just $S_{00}$ of Eq. (43), for the annihilation operation $A^+$ 
on the state of zero photons is zero, and creation operation 
of $A^-$ has zero amplitude of leaving a state without 
photons. If there is one photon present initially and we 
ask that no photons remain, we shall have to annihilate 
it and create none, so that if the $A^-$ and $A^+$ exponentials 
are expanded in power series, we must take only the 
term linear in $A^+$ and independent of $A^-$. This is 
equivalent to a first-order action of the potential $B_\mu$ 
in Eq. (44) in perturbation. The corresponding rules 
for higher numbers of real photons are readily derived 
from Eq. (49). In this way we have completed an independent 
deduction of all the main formal results in 
quantum electrodynamics, by use of the operator 
notation. 


7. THE DIRAC EQUATION 


Up to now we have discussed the matter system 
using the description of second quantization. It was 
pointed out in I in the case of the electron-positron field 
where a small number of charges is involved, another 
simple interpretation is available. In this section we 
should like to discuss this from an operator point of 
view and to give in the following sections the formulas 
in this picture for electrons interacting through the 
agency of the electromagnetic field. 

We begin by discussing the behavior of a single 
charge (plus the virtual pairs produced from it) in an 
unquantized potential $\mathbf{B}=\gamma_\mu B_\mu$, omitting the contributions 
from closed loop diagrams. This section will 
therefore constitute a brief summary of I using operator 
notation. 

The behavior of a single charge is obtained by solving 
the Dirac equation 

$$(i\boldsymbol{\nabla} - \mathbf{B} - m)\psi = 0 \tag{50}$$

with suitable boundary conditions and interpreting the 
solution as described in I. For convenience we shall 
always solve, instead, 

$$(i\boldsymbol{\nabla} - \mathbf{B} - m)\psi = iF, \tag{51}$$

where $F$ is a source function, by writing 

$$\psi = (i\boldsymbol{\nabla} - \mathbf{B} - m)^{-1}\,iF \tag{52}$$

and interpreting the reciprocal operator in the definite 
sense implied by the limit of the operator when $m$ has 
a vanishingly small negative imaginary part. If, for 
example, we wish the ordinary solution for $t>t_0$ which 
at $t=t_0$ has the form $f(\mathbf{x})$ representing an electron (i.e., 
$f(\mathbf{x})$ has only positive energy components), that solution 
is the $\psi$ obtained from Eq. (52) by setting¹⁷ $F(1) = \gamma_t\,\delta(t_1-t_0)\,f(\mathbf{x}_1)$. 
If $f$ contains negative energy components, 
Eq. (52) gives the desired solution for these 
components for $t<t_0$. 

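From the definition of $K_+^{(B)}(2,1)$ in I, Eq. (15), we 
can write 

$$K_+^{(B)}(2,1) = (i\boldsymbol{\nabla} - \mathbf{B} - m)^{-1}\,i\delta(2,1), \tag{53}$$

so that $-iK_+^{(B)}(2,1)$ is the space-time representation 
of the operator $(i\boldsymbol{\nabla}-\mathbf{B}-m)^{-1}$ needed for Eq. (52). 
Thus, the solution (52) is 

$$\psi(2) = \int K_+^{(B)}(2,1)\,F(1)\,d\tau_1. \tag{54}$$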


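The perturbation theory, considering $\mathbf{B}$ as a perturbation 
on the free particle, arises from Eq. (53) from 
a power series expansion in $\mathbf{B}$. For any pair of operators 
$A$, $B$ we have 

$$(A+B)^{-1} = A^{-1} - A^{-1}BA^{-1} + A^{-1}BA^{-1}BA^{-1} - \cdots, \tag{55}$$

so that with $A=(i\boldsymbol{\nabla}-m)$, $B=-\mathbf{B}$ we have 

$$(i\boldsymbol{\nabla}-\mathbf{B}-m)^{-1} = (i\boldsymbol{\nabla}-m)^{-1} + (i\boldsymbol{\nabla}-m)^{-1}\mathbf{B}(i\boldsymbol{\nabla}-m)^{-1} + (i\boldsymbol{\nabla}-m)^{-1}\mathbf{B}(i\boldsymbol{\nabla}-m)^{-1}\mathbf{B}(i\boldsymbol{\nabla}-m)^{-1} + \cdots \tag{56}$$

or in space representation (putting 
$K_+(2,1) = i(i\boldsymbol{\nabla}-m)^{-1}\delta(2,1)$) 

$$K_+^{(B)}(2,1) = K_+(2,1) - i\int K_+(2,3)\mathbf{B}(3)K_+(3,1)\,d\tau_3 - \int\!\!\int K_+(2,4)\mathbf{B}(4)K_+(4,3)\mathbf{B}(3)K_+(3,1)\,d\tau_3\,d\tau_4 + \cdots, \tag{57}$$

as in I, Eqs. (13) and (14). The corresponding momentum 
representation is evident directly from Eq. 
(56), for $(i\boldsymbol{\nabla}-m)^{-1}$ is $(\mathbf{p}-m)^{-1}$. 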
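
The operator identity (55) holds for any non-commuting pair for which the series converges, and it can be checked numerically on small matrices. The sketch below is purely illustrative: the helper names `matmul`, `inv2`, and `series_inverse` and the sample 2x2 matrices are assumptions, chosen so that the perturbation is small compared with the unperturbed matrix.

```python
def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def series_inverse(a, b, terms):
    """Partial sum of Eq. (55): A^-1 - A^-1 B A^-1 + A^-1 B A^-1 B A^-1 - ..."""
    a_inv = inv2(a)
    step = matmul(a_inv, b)
    term = a_inv
    total = [row[:] for row in a_inv]
    for _ in range(terms):
        # each new term is -(A^-1 B) times the previous one
        term = [[-v for v in row] for row in matmul(step, term)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total


a = [[2.0, 0.3], [0.1, 3.0]]     # plays the role of A
b = [[0.2, -0.1], [0.05, 0.3]]   # small perturbation, does not commute with a
exact = inv2([[a[i][j] + b[i][j] for j in range(2)] for i in range(2)])
approx = series_inverse(a, b, 40)
print(max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2)))
```

The order of the factors in each term matters, since `a` and `b` do not commute; this is the matrix analog of keeping $\mathbf{B}$ sandwiched between free propagators in Eq. (56).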

If $F$ is to represent an initial state, it is also convenient 
to use the free particle solution $f(1)=(i\boldsymbol{\nabla}-m)^{-1}\,iF(1)$ 
to represent the state. We are often interested in the 
amplitude that the system is in a final state $g(\mathbf{x})$. In 
this case, we can define a sink function $G$ and a corresponding 
free particle solution $\bar g = i\bar G(i\boldsymbol{\nabla}-m)^{-1}$ (where 
we write the adjoint so that it will correspond to the 
solution $g(1) = -i(i\boldsymbol{\nabla}-m')^{-1}G(1)$ corresponding to $m'$ 
having the opposite sign of the imaginary part to $m$). 
The matrix element to go from $f$ to $g$ is then the space-time 
integral of 

$$i\bar G\,(i\boldsymbol{\nabla}-\mathbf{B}-m)^{-1}F. \tag{58}$$

The expansion (56) gives for this element 

$$-i\bar g\mathbf{B}f - i\bar g\mathbf{B}(i\boldsymbol{\nabla}-m)^{-1}\mathbf{B}f - i\bar g\mathbf{B}(i\boldsymbol{\nabla}-m)^{-1}\mathbf{B}(i\boldsymbol{\nabla}-m)^{-1}\mathbf{B}f - \cdots \tag{59}$$

17 For $t<t_0$, the $\psi$ from Eq. (52) would be zero and would not be 
the solution desired; it can be obtained only from $F$ with a different 
definition of the poles of the reciprocal operator. We assume we 
are only interested in the solutions in regions of space-time later 
than the time the "initial" electron wave functions are specified 
and earlier than the "initial" positron function is given. Since we 

(assuming $g$, $f$ are orthogonal states, $\int\bar g(1)F(1)\,d\tau_1=0$ 
so the leading term vanishes). In space representation 
the first two terms of this are I, Eq. (22), and I, Eq. 
(23); in momentum representation the second term is 
I, Eq. (35). 

If more than one real charge is present without 
interaction, there is an operator $(i\boldsymbol{\nabla}-\mathbf{B}-m)^{-1}$ for each 
charge, operating exclusively on the space and spinor 
coordinates of that charge. Operators corresponding to 
distinct charges commute. Matrix elements are taken 
in the antisymmetric way described in I, Sec. 4, for 
accord with the exclusion principle. 

The contribution from closed loops is a factor 
$C_v=\exp(-L)$, where $L$ is not very easily defined 
directly in operators. But the first-order change on 
changing the potential from $\mathbf{B}$ to $\mathbf{B}+\Delta\mathbf{B}$ is 

$$\Delta L = \mathrm{trace}\left[\{(i\boldsymbol{\nabla}-\mathbf{B}-m)^{-1} - (i\boldsymbol{\nabla}-m)^{-1}\}\,\Delta\mathbf{B}\right], \tag{60}$$

where the "trace" means the diagonal integral in coordinates 
and the "Sp" on the spinor indices, in space-time 
representation just I, Eq. (29). 

This completes our summary in terms of operators of 
the results given in paper I on the theory of positrons. 
The main point is that aside from the problems of 
closed loops, one is merely analyzing by various tech- 
niques the consequences of Eq. (52) and, therefore, in 
general, the properties of the operator 


$$(i\boldsymbol{\nabla}-\mathbf{B}-m)^{-1}. \tag{61}$$


We may now turn to the quantum electrodynamics 
of such a particle, or system of particles. For simplicity 
we may restrict ourselves to the case of all virtual photons. 
The real photon case can, of course, as always be 
obtained by considering also the effects of external potentials. 
For simplicity further assume, at first, zero external 
potential. Our central problem, then, is the calculation 
of the matrix element 

$$R = {}_0\langle (i\boldsymbol{\nabla}-\mathbf{A}-m)^{-1}\rangle_0 \tag{62}$$

of $(i\boldsymbol{\nabla}-\mathbf{A}-m)^{-1}$ between states of the field empty of 
photons initially and finally. Here $\mathbf{A}=\gamma_\mu A_\mu$ and $A_\mu(1)$ 
is the operator $A_\mu^+(1)+A_\mu^-(1)$ acting on the field coordinates 
and satisfying commutation rules (46). This 
problem is relatively hard to solve directly. We do have 
the matrix element of any exponential form in $A_\mu$ in 
(43); but with $A_\mu$ involved in a reciprocal, it is another 
matter. We shall represent the reciprocal as a superposition 
of exponentials in the next section. 

From Eq. (62) we can derive in a simple direct manner 
the perturbation series results of II. For we know¹⁸ that 

$${}_0\langle A_\mu(1)\rangle_0 = 0, \qquad {}_0\langle A_\mu(1)A_\nu(2)\rangle_0 = ie^2\delta_{\mu\nu}\,\delta_+(s_{12}^2), \quad\text{etc.} \tag{63}$$


will take matrix elements between two states, this represents no 
real limitation (see I). 

18 The order of the $A_\mu$ operators in Eq. (63) is according to the 
time convention. If we put $A=A^++A^-$ and use Eq. (46) to 
rearrange factors, they are evident, since $A_\mu^+$ on the initial (and 
$A_\mu^-$ into the final) photon-free state vanishes. They are identical 
to the lagrangian relations III, Eq. (52), and form the basis of 
Dyson's description. 

Hence, if we expand the reciprocal in power series as 
Eq. (56), in coordinate representation (57), with $\mathbf{B}$ 
replaced by $\mathbf{A}$ (or by $\mathbf{A}+\mathbf{B}$ if an unquantized potential 
$\mathbf{B}$ is present along with $\mathbf{A}$), we may readily write down 
the zero-zero matrix element of each term. For example, 
the third term of Eq. (57) gives the self-energy contribution 
in accord with II, Eq. (6). The other results of II are 
just further consequences of this series expansion as we 
have emphasized in III. 


8. USE OF A FIFTH PARAMETER IN 
DIRAC’S EQUATION 


In this section we discuss the representation of 
the reciprocal operator in exponential form. Since 
Jo” expiWx)dW =1i/x {or rather limeot/{x+7)), we 


we may write 


i) 


iGV~ B—m)-'= f exp[i(iV— B—m)W]dW, (64) 


0 


the definition of the singularity (as the limit with m 
having an infinitesimal negative imaginary part) being 
automatically represented. (See, however, the remarks 
at the end of this section.) We can also write this in the 
ordered operator form, 


$$ \int_0^\infty \exp\Big\{ i\int_0^W [\,i\boldsymbol{\nabla}(w)-\boldsymbol{B}(w)\,]\,dw \Big\}\exp(-imW)\,dW, \tag{65} $$

where we have written $\boldsymbol{B}(w)$ for $\gamma_\mu(w)B_\mu(x_\nu(w))$, where
$x_\nu(w)$ are the four ($\nu=1$ to 4) coordinate operators, of
which $B_\mu$ is a function, ordered by the ordering parameter
$w$, and $\gamma_\mu(w)$ are the four Dirac matrices similarly
ordered. Likewise, we have $i\boldsymbol{\nabla}(w)=\gamma_\mu(w)\,i\partial_\mu(w)$, where
$i\partial_\mu(w)$ are the four ordered momentum operators conjugate
to $x_\mu$.
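
The scalar identity that underlies Eq. (64) can be checked numerically. The sketch below is only an illustration under stated assumptions: the operator $i\boldsymbol{\nabla}-\boldsymbol{B}-m$ is replaced by an ordinary real number `x`, a small positive imaginary part `eps` stands in for the negative imaginary part of $m$, and the infinite range is cut off at `w_max`. The function name, the numerical values, and the trapezoidal rule are choices made for this example and are not part of the paper.

```python
import cmath

def truncated_integral(x, eps, w_max, steps):
    """Trapezoidal estimate of the integral of exp(i*W*(x + i*eps)) for W in [0, w_max]."""
    z = complex(x, eps)          # x + i*eps; eps > 0 damps the integrand at large W
    h = w_max / steps
    total = 0.5 * (1.0 + cmath.exp(1j * z * w_max))
    for k in range(1, steps):
        total += cmath.exp(1j * z * (k * h))
    return total * h

x, eps = 1.3, 0.05               # assumed illustrative values
numeric = truncated_integral(x, eps, w_max=400.0, steps=400_000)
exact = 1j / complex(x, eps)     # i / (x + i*eps)
print(abs(numeric - exact))      # small compared with |exact|
```

The damping factor exp(-eps*W) is what makes the truncated integral settle down; without it the integrand oscillates indefinitely, which is the scalar counterpart of the remark about the imaginary part of $m$.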

In this form B can be replaced by A and the expec- 
tation value for virtual transitions can be taken. This 
is done in the next section. We continue here with a 
more complete discussion of Eq. (65). 

The perturbation expansion in B of Eq. (65) should 
lead, of course, to Eq. (56). For example, the term first 
order in B in Eq. (65) is evidently 


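$$ -i\int_0^\infty \exp(-imW)\,\exp\Big[ i\int_0^W i\boldsymbol{\nabla}(w')\,dw' \Big]\int_0^W \boldsymbol{B}(w)\,dw\,dW. \tag{66} $$

Now the range of the $w'$ may be divided into two regions
and the quantities reordered (just as in Eqs. (5) and
(6)) to

$$ -i\int_0^\infty\!\!\int_0^W dW\,dw\,\exp[i(W-w)(i\boldsymbol{\nabla}-m)]\,\boldsymbol{B}\,\exp[iw(i\boldsymbol{\nabla}-m)], $$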




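so that changing the order of integration, one finds
immediately from

$$ \int_0^\infty \exp[iw(i\boldsymbol{\nabla}-m)]\,dw = i(i\boldsymbol{\nabla}-m)^{-1} $$

the result $i(i\boldsymbol{\nabla}-m)^{-1}\boldsymbol{B}(i\boldsymbol{\nabla}-m)^{-1}$ as required.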

The contribution L of a closed loop can be written 
directly in exponential form. It is easily shown from 
Eqs. (60) and (65) that 


$$ L = \int_0^\infty \mathrm{trace}\,\exp\Big\{ i\int_0^W [\,i\boldsymbol{\nabla}(w)-\boldsymbol{B}(w)\,]\,dw \Big\}\exp(-imW)\,dW/W. \tag{67} $$
(The second term in Eq. (60) actually has zero trace
and was added only to make convergence problems
appear less difficult. It has been omitted in writing Eq.
(67). Also, the value of Eq. (67) when $\boldsymbol{B}=0$ may be
subtracted away, if desired, for a constant addition on
$L$ changes only the normalization of all probabilities.)
Incidentally, the method of rendering this expression
convergent (see II, Sec. 7) for further calculations, is to
call its value for mass $m$, $L(m^2)$ and then to calculate

$$ L^P = \int_0^\infty [\,L(m^2) - L(m^2+\lambda^2)\,]\,G(\lambda)\,d\lambda, $$

where

$$ \int_0^\infty G(\lambda)\,d\lambda = 1 \quad\text{and}\quad \int_0^\infty \lambda^2 G(\lambda)\,d\lambda = 0, $$

and to assume that $L^P$ is to be used as the correct value
of $L$ in place of $L(m^2)$. This is equivalent to replacing
the factor $\exp(-imW)$ in the integrand of Eq. (67) by
another function $F(W)$, where

$$ F(W) = \int_0^\infty [\,\exp(-imW) - \exp\{-i(m^2+\lambda^2)^{1/2}W\}\,]\,G(\lambda)\,d\lambda. $$


For large $W$ this approaches $\exp(-imW)$, but for
small $W$ it falls off, the real part of it, at least, varying
as $W^4$. This renders Eq. (67) convergent. (The imaginary
part of $F(W)$ does not seem to lead to momentum space
integrals whose convergence would be in question.) This
suggests a general method of maintaining convergence:
keeping processes corresponding to small intervals
of $w$ from occurring with large amplitude. This is briefly
discussed in III in reference 22. What is said there
applies qualitatively as well to the Dirac case analyzed
here, with $u$ replaced by $w$.
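
The small-$W$ behavior of $F(W)$ can be illustrated numerically. The sketch below assumes, purely for the sake of example, that $G(\lambda)$ is a sum of two delta functions with weights `c1`, `c2` chosen so that the two conditions on $G$ hold; the paper does not specify $G$, and the masses `lam1`, `lam2` and the value `m = 1` are arbitrary.

```python
import cmath
import math

m = 1.0
lam1, lam2 = 2.0, 3.0                      # illustrative cutoff masses (assumed values)
c1 = lam2**2 / (lam2**2 - lam1**2)         # weights solve c1 + c2 = 1 and
c2 = -lam1**2 / (lam2**2 - lam1**2)        # c1*lam1**2 + c2*lam2**2 = 0

def F(W):
    """F(W) for G(lambda) = c1*delta(lambda - lam1) + c2*delta(lambda - lam2)."""
    total = 0j
    for c, lam in ((c1, lam1), (c2, lam2)):
        M = math.sqrt(m * m + lam * lam)
        total += c * (cmath.exp(-1j * m * W) - cmath.exp(-1j * M * W))
    return total

# If Re F(W) varies as W**4 for small W, doubling W multiplies Re F by about 16.
ratio = F(0.02).real / F(0.01).real
print(ratio)
```

With these weights the $W^2$ term of the real part cancels because of the condition $\int\lambda^2G\,d\lambda=0$, so the leading real term is of fourth order in $W$.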

If there are several charges in the system, we must
associate a separate $w$ for each, say $w_n$ for the $n$th.
Each must have its own set of matrices $\gamma_\mu^{(n)}$ and coordinates
$x_\mu^{(n)}$ ($\gamma$'s for different charges commute). If we
call

$$ \boldsymbol{B}^{(n)}(w_n) = \gamma_\mu^{(n)}(w_n)\,B_\mu(x^{(n)}(w_n)), $$

the total matrix for all $N$ particles is

$$ \prod_{n=1}^{N} \int_0^\infty \exp\Big\{ i\int_0^{W_n} [\,i\boldsymbol{\nabla}^{(n)}(w_n) - \boldsymbol{B}^{(n)}(w_n)\,]\,dw_n \Big\}\exp(-imW_n)\,dW_n. \tag{68} $$
If in addition there are present a number of closed 
loops, the corresponding number of factors L must be 
multiplied in. 

One might try to give a kind of physical or, rather, 
mathematical view by which the form of Eq. (65) can 
be appreciated, in the following manner: 

We may deal with the Dirac equation somewhat in 
analogy to the method used in the discussion of the 
Klein-Gordon equation in the Appendix A of III. 
Consider a fifth variable $w$ in addition to the four $x_\mu$,
and suppose that we have a wave function $\varphi(x, w)$, which is to
satisfy

$$ -i\,\partial\varphi/\partial w = (i\boldsymbol{\nabla}-\boldsymbol{B})\varphi. \tag{69} $$

Then since the potentials $B_\mu(x)$ are independent of $w$,
the equation is separable in $w$, so that $\varphi(x, w)=\exp(imw)\psi(x)$
is a solution of Eq. (69), if $\psi(x)$ is a
solution of the Dirac Eq. (50). Also, if we have any
special solution of Eq. (69), $\varphi(x, w)$, we may obtain a
solution of Eq. (50) by finding

$$ \psi(x) = \int \varphi(x, w)\exp(-imw)\,dw. \tag{70} $$

Hence, by studying Eq. (69), we are at the same time
studying the Dirac equation.

Given the wave function $\varphi(x, 0)$ for $w=0$, the wave
function at $w=W$ is given by

$$ \varphi(x, W)=\theta(W)\,\varphi(x, 0), \tag{71} $$

where the operator $\theta(W)$ is

$$ \theta(W) = \exp\Big\{ i\int_0^W [\,i\boldsymbol{\nabla}(w)-\boldsymbol{B}(w)\,]\,dw \Big\} \tag{72} $$

for $W>0$ and, for convenience, we take $\theta=0$ for
$W<0$. The important operator for the Dirac equation,
in view of Eqs. (70) and (71), is

$$ \int \theta(W)\exp(-imW)\,dW, $$

which is just Eq. (65).

This interpretation suffers from a difficulty, however.
For a free particle the operator $\theta(W)$ in momentum
space is $\theta(W)=\exp(iW\boldsymbol{p})=\cos(pW)+i(\boldsymbol{p}/p)\sin(pW)$,
where $p=(p^2)^{1/2}$. The integral of this times $\exp(-imW)$
is really not always defined, even if $m$ has a small
negative imaginary part, for in intermediate states $p^2$
may be negative and $p$ imaginary, so that $\theta$ contains
positive exponentials in $W$ and the integrand is oscillating
with ever increasing amplitude. We therefore
look at Eq. (64) as a formal definition of the value of
the integral in all cases. Although this is satisfactory in
a formal way for operators, it means that our interpretation
cannot be taken literally. For example, we
cannot obtain an unambiguous integral representation
of $\theta(W)$ in coordinate space, for the requisite integral
$\int \exp(-i\boldsymbol{p}W)\exp(-ip\cdot x)\,d^4p$ is undefined. This is
because it is probably not possible to obtain the wave
function (71) at any value of $W$ from that at $W=0$
from Eq. (69) without further definitions. At least, the
corresponding second-order equation $(\partial^2\varphi/\partial t^2)-\nabla^2\varphi-\partial^2\varphi/\partial W^2=0$
is apparently not of the kind for which
this type of Huygens principle applies.

An alternative method of parametrizing the equation 
which does not seem to suffer from this interpretational 
difficulty is given in Appendix D. It leads, however, to 
more complicated (although algebraically equivalent)
expressions for matrix elements than does Eq. (64). 


9. DIRAC ELECTRONS IN QUANTUM 
ELECTRODYNAMICS 


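Returning now to quantum electrodynamics, for a
single charge we want the expectation between photon-free
states of $R$ in Eq. (62). This by Eq. (65) is the
integral over all positive $W$ of $\exp(-imW)$ times

$$ {}_0\Big\langle \exp\Big\{ i\int_0^W [\,i\boldsymbol{\nabla}(w)-\boldsymbol{A}(w)\,]\,dw \Big\} \Big\rangle_0. $$

This is just $\exp[\,i\int_0^W i\boldsymbol{\nabla}(w)\,dw\,]$ times (the ${}_0\langle\ \rangle_0$ refers
to the photon states, that is, affects $A_\mu$ only)

$$ {}_0\Big\langle \exp\Big\{ -i\int_0^W \gamma_\mu(w)A_\mu(x(w))\,dw \Big\} \Big\rangle_0, $$

which is of the form

$$ {}_0\Big\langle \exp\Big\{ -i\int j_\mu(1)A_\mu(1)\,d\tau_1 \Big\} \Big\rangle_0 $$

of Eq. (42) with

$$ j_\mu(1) = \int_0^W \gamma_\mu(w)\,\delta^4(x_{\nu,1}-x_\nu(w))\,dw, \tag{73} $$

where $x_{\nu,1}$ is the field point at which $j_\mu$ is calculated and
$\delta^4(x_{\nu,2}-x_{\nu,1})$ means $\delta(2, 1)$. Thus, we may find the expectation
value with the relation (43). With this value
of $j_\mu$ substituted on the right side of Eq. (43), the $\delta^4$
functions are immediately integrable, and we find
finally

$$ R = \int_0^\infty \exp\Big[ i\int_0^W i\boldsymbol{\nabla}(w)\,dw \Big]\exp\Big[ -\tfrac12 ie^2\int_0^W\!\!\int_0^W \gamma_\mu(w')\gamma_\mu(w'')\,\delta_+(s^2_{w'w''})\,dw'\,dw'' \Big]\exp(-imW)\,dW, \tag{74} $$

in which we have written $s^2_{w'w''}$ for $[x_\mu(w')-x_\mu(w'')]\,[x_\mu(w')-x_\mu(w'')]$.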

This expression then contains a description of a Dirac
electron interacting with itself. If an extra factor
$\exp(-i\int_0^W \boldsymbol{B}(w)\,dw)$ is included, it describes such an
electron also in an external potential. The terms may
be expanded in powers of $\boldsymbol{B}$ and $e^2$, and each term may
then be simplified in the way we have described many
times before, for example, in connection with Eq. (66).

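When several charges are present, the result from
Eq. (68) is the integral$^{19}$

$$ \int\!\!\int\cdots\int \exp\Big\{ i\sum_n \int_0^{W_n} [\,i\boldsymbol{\nabla}^{(n)}(w_n)-m\,]\,dw_n \Big\}\exp\Big\{ -\tfrac12 ie^2\sum_n\sum_m \int_0^{W_n}\!\!\int_0^{W_m} \gamma_\mu^{(n)}(w_n)\gamma_\mu^{(m)}(w_m')\,\delta_+\big([x_\nu^{(n)}(w_n)-x_\nu^{(m)}(w_m')]^2\big)\,dw_n\,dw_m' \Big\}\prod_n dW_n. \tag{75} $$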


The contributions from closed loops may be obtained
from this by choosing some value of $n$, say, $n=i$, to
represent matrices applying to a loop, dividing under
the integral sign by $W_i$, and taking the trace with
respect to the variables $i$.

The various present-day meson theories of nuclear
interaction may be set up in quite analogous ways. For
example, a nucleon interacting with itself through the
agency of neutral pseudoscalar mesons with pseudoscalar
coupling is evidently described by Eq. (74), but
with $m$ replaced by the proton mass, and the interaction
term altered by the replacement of $e^2$ by $g^2$, $\gamma_\mu$ by $\gamma_5$,
and $\delta_+(s^2)$ by $-4\pi I_+(s^2)$, the appropriate propagation
function for mesons of mass $\mu$ ($I_+$ is defined in I, Eq.
(32), but with $m=\mu$). Charged mesons may be represented
by the use of isotopic spin operators also ordered by $w$.


10. SUMMARY OF NUMERICAL FACTORS FOR 
TRANSITION PROBABILITIES 


The exact values of the numerical factors appearing
in the rules of II for computing transition probabilities
are not clearly stated there, so we give a brief summary
here.

The probability of transition per second from an
initial state of energy $E$ to a final state of the same total
energy (assumed to be in a continuum) is given by
($\hbar=c=1$)

$$ \text{Prob. trans/sec} = 2\pi N^{-1}|\mathfrak{M}|^2\rho(E), $$

where $\rho(E)$ is the density of final states per unit energy
range at energy $E$ and $|\mathfrak{M}|^2$ is the square of the matrix
element taken between the initial and final state of the
transition matrix $\mathfrak{M}$ appropriate to the problem. $N$ is a
normalizing constant. For bound states conventionally
normalized it is 1. For free particle states it is a product
of a factor $N_i$ for each particle in the initial and for
each in the final energy state. $N_i$ depends on the
normalization of the wave functions of the particles
(photons are considered as particles) which is used in
computing the matrix element of $\mathfrak{M}$. The simplest rule
(which does not destroy the apparent covariance of
$\mathfrak{M}$) is$^{21}$ $N_i=2\epsilon_i$, where $\epsilon_i$ is the energy of the particle.
This corresponds to choosing in momentum space plane
waves for photons of unit vector potential, $e_\mu e_\mu=-1$.
For electrons it corresponds to using $(\bar{u}u)=2m$ (so that,
for example, if an electron is deviated from initial $p_1$ to
final $p_2$, the sum over all initial and final spin states of
$|\mathfrak{M}|^2$ is $\mathrm{Sp}[(\boldsymbol{p}_2+m)\mathfrak{M}(\boldsymbol{p}_1+m)\bar{\mathfrak{M}}]$). Choice of normalization
$(\bar{u}\gamma_t u)=1$ results in $N_i=1$ for electrons. The
matrix $\mathfrak{M}$ is evaluated by making the diagrams and
following the rules of II, but with the following definition
of numerical factors. (We give them here for the
special case that the initial, final, and intermediate
states consist of free particles. The momentum space
representation is then most convenient.)

19 This equation with its interpretation was proposed as a formulation
of the laws of quantum electrodynamics (for virtual photons)
by the author at the Pocono Conference of Theoretical
Physics (1948). The notation for ordering operators was explained
there. However, at this time, the author had no complete formal
derivation of Eq. (75) from the conventional electrodynamics, nor
did he know of a satisfactory method of dealing with the closed
loop divergences.

20 In I and II the unfortunate convention was made that $d^4k$
means $dk_1dk_2dk_3dk_4(2\pi)^{-2}$ for momentum space integrals. The
confusing factor $(2\pi)^{-2}$ here serves no useful purpose, so the convention
will be abandoned. In this section $d^4k$ has its usual meaning,
$dk_1dk_2dk_3dk_4$.

First, write down the matrix directly without
numerical factors. Thus, electron propagation factor
is $(\boldsymbol{p}-m)^{-1}$, virtual photon factor is $k^{-2}$ with couplings
$\gamma_\mu\cdots\gamma_\mu$. A real photon of polarization vector $e_\mu$ contributes
factor $\boldsymbol{e}$. A potential (times the electron charge,
$e$) $A_\mu(x)$ contributes momentum $q$ with amplitude $\boldsymbol{a}(q)$,
where $a_\mu(q)=\int A_\mu(1)\exp(iq\cdot x_1)\,d^4x_1$. (Note: On this
point we deviate from the definition of $a$ in I which is
there $(2\pi)^{-2}$ times as large.) A spur is taken on the
matrices of a closed loop. Because of the Pauli principle
the sign is altered on contributions corresponding to an
exchange of electron identity, and for each closed loop.
One multiplies by $(2\pi)^{-4}d^4p=(2\pi)^{-4}dp_1dp_2dp_3dp_4$ and
integrates over all values of any undetermined momentum
variable $p$. (Note: On this point we again
differ.$^{20}$)

The correct numerical value of $\mathfrak{M}$ is then obtained
by multiplication by the following factors. (1) A factor
$(4\pi)^{1/2}e$ for each coupling of an electron to a photon.
Thus, a virtual photon, having two such couplings,
contributes $4\pi e^2$. (In the units here, $e^2=1/137$ approximately
and $(4\pi)^{1/2}e$ is just the charge on an electron in
heaviside units.) (2) A further factor $-i$ for each virtual
photon.

21 In general, $N_i$ is the particle density. It is $N_i=(\bar{u}\gamma_t u)$ for
spin one-half fields and $i[(\varphi^*\partial\varphi/\partial t)-(\varphi\,\partial\varphi^*/\partial t)]$ for scalar fields.
The latter is $2\epsilon_i$ if the field amplitude $\varphi$ is taken as unity.

For meson theories the changes discussed in II,
Sec. 10 are made in writing $\mathfrak{M}$, then further factors are
(1) $(4\pi)^{1/2}g$ for each meson-nucleon coupling and (2) a
factor $-i$ for each virtual spin one meson, but $+i$ for
each virtual spin zero meson.

This suffices for transition probabilities, in which
only the absolute square of $\mathfrak{M}$ is required. To get $\mathfrak{M}$
to be the actual phase shift per unit volume and time,
additional factors of $i$ for each virtual electron propagation,
and $-i$ for each potential or photon interaction,
are necessary. Then, for energy perturbation problems
the energy shift is the expected value of $i\mathfrak{M}$ for the
unperturbed state in question divided by the normalization
constant $N_i$ belonging to each particle comprising
the unperturbed state.
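
As a bookkeeping aid, the two photon rules above can be collected in a few lines. This is only a sketch of the counting: the function name `photon_factor` and its arguments are hypothetical, the value $e^2=1/137$ is the approximate one quoted in the text, and the matrix written "without numerical factors" is assumed to be computed separately.

```python
import math

E_SQUARED = 1.0 / 137.0   # e**2 in the units of this section (approximate value quoted in the text)

def photon_factor(n_couplings, n_virtual_photons, e_squared=E_SQUARED):
    """Overall factor multiplying the bare matrix: (4*pi)**0.5 * e per electron-photon
    coupling and a further -i per virtual photon."""
    coupling = math.sqrt(4.0 * math.pi * e_squared)
    return (coupling ** n_couplings) * ((-1j) ** n_virtual_photons)

# One virtual photon exchanged (two couplings): the factor is -i * 4*pi*e**2.
one_photon = photon_factor(n_couplings=2, n_virtual_photons=1)
print(one_photon)
```

For a real photon only the coupling factor appears, so `photon_factor(1, 0)` is simply $(4\pi)^{1/2}e$.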

The author has profited from discussions with 
M. Peshkin and L. Brown. 


APPENDIX 


In this Appendix (A, B, C) an attempt will be made to discuss 
some of the properties of ordered operators and of functionals in 
a somewhat more general way. 

Almost certainly many of the equations will be incorrect in 
their general form. This is especially true of those involving fourier 
transforms in function space. However, it is expected that they 
are correct in the special cases in which the formulas have been 
applied in the main part of the paper. Therefore, at least at first, 
when new results using these methods are derived, care should be 
taken to check the final result in some independent way. It is 
analogous to using power series expansions, or fourier transforms, 
in a calculation in a situation in which the conditions for the 
validity of the power expansions or of the transform have not been 
checked, or are not known to be satisfied. The physicist is very 
familiar with such a situation and usually satisfied with it, 
especially since he is confident that he can tell if the answer is 
physically reasonable. But mathematicians may be completely 
repelled by the liberties taken here. The liberties are taken not 
because the mathematical problems are considered unimportant. 
On the contrary, this appendix is written to encourage the study 
of these forms from a mathematical standpoint. In the meantime, 
just as a poet often has license from the rules of grammar and 
pronunciation, we should like to ask for “physicists’ license” from
the rules of mathematics in order to express what we wish to say 
in as simple a manner as possible. (These remarks do not apply 
to Appendix D.) 


A. Relation to Theory of Functionals 


In this section we would like to suggest how a general theory of
ordered operators might be built up, and in particular, to point
out certain relations to the theory of functionals. For clarity of
exposition in this Sec. A only, we represent all operators by
bold-faced letters $\mathbf{M}$ and ordinary functions in regular type $M$.
We have mentioned that with every functional $F[M(s), N(s)\cdots]$
of the argument functions $M(s)$, $N(s)$ we wish to associate an
operator (by identifying $M(s)$ with an operator $\mathbf{M}(s)$, interpreting
$s$ as an ordering parameter, with the operators $\mathbf{M}(s)$, $\mathbf{N}(s)$ supposedly
known and with known commutation relations). The
general theory of these associations might instead have begun by
defining the meaning for the special case of the exponential functional
$\exp\int_0^1 M(s)\,ds$ (we assume throughout this section, for
convenience, that the range of $s$ is 0 to 1). The corresponding
operator

$$ \mathbf{R} = \exp\int_0^1 \mathbf{M}(s)\,ds \tag{1-a} $$

is defined as the value $\mathbf{G}(1)$ at $s=1$ of that solution of the operator
differential equation

$$ d\mathbf{G}(s)/ds = \mathbf{M}(s)\mathbf{G}(s), \tag{2-a} $$

which is the identity operator at $s=0$, i.e., $\mathbf{G}(0)=\mathbf{I}$. We have
thereby defined the operators corresponding to more complex
functionals such as $F=\exp\int_0^1[\mu(s)M(s)+\nu(s)N(s)+\cdots]\,ds$, where
$\mu(s)$, $\nu(s)\cdots$ are numerical functions and $\mathbf{M}$, $\mathbf{N}$ arbitrary operators
(which need not commute) as the $\mathbf{G}(1)$ from

$$ d\mathbf{G}(s)/ds = [\mu(s)\mathbf{M}(s)+\nu(s)\mathbf{N}(s)+\cdots]\,\mathbf{G}(s) \tag{3-a} $$

with $\mathbf{G}(0)=\mathbf{I}$. For clearly $\mu(s)\mathbf{M}(s)+\nu(s)\mathbf{N}(s)+\cdots$ can be considered
as a single operator function of $s$, the $\mathbf{M}(s)$ in Eqs. (1-a)
and (2-a).
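
The definition (1-a), (2-a) can be made concrete with small matrices. The sketch below is an illustration under assumptions, not the paper's procedure: the operator $\mathbf{M}(s)$ is taken to be the $2\times2$ matrix $\sigma_x+s\,\sigma_z$ (an arbitrary choice whose values at different $s$ do not commute), and Eq. (2-a) is integrated by multiplying short-step factors $\mathbf{I}+h\mathbf{M}(s)$, each later one standing to the left of the earlier ones.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def M(s):
    """Illustrative operator M(s) = sigma_x + s * sigma_z; M(s) and M(s') do not commute."""
    return [[s, 1.0], [1.0, -s]]

def ordered_exponential(steps):
    """Solve dG/ds = M(s) G with G(0) = I by left-multiplying short-step factors."""
    h = 1.0 / steps
    G = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(steps):
        Mk = M((k + 0.5) * h)                      # midpoint value of M on this step
        step = [[(1.0 if i == j else 0.0) + h * Mk[i][j] for j in range(2)]
                for i in range(2)]                 # I + h*M(s), first order in h
        G = matmul(step, G)                        # later s stands to the left
    return G

G_coarse = ordered_exponential(2000)
G_fine = ordered_exponential(4000)
print(G_fine)
print(G_fine[0][1] - G_fine[1][0])                 # nonzero: an effect of the ordering
```

Because this $\mathbf{M}(s)$ is a symmetric matrix for every $s$, the ordinary exponential of $\int_0^1\mathbf{M}(s)\,ds$ would be symmetric as well; the antisymmetric part of the computed $\mathbf{G}(1)$ therefore comes entirely from the ordering of the noncommuting factors.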

Next we make the general definition that the operator to be
associated with the sum of two or more functionals $F_1[M(s), N(s)\cdots]+F_2[M(s), N(s)\cdots]$
is the sum of the operators $\mathbf{F}_1[\mathbf{M}(s), \mathbf{N}(s)\cdots]+\mathbf{F}_2[\mathbf{M}(s), \mathbf{N}(s)\cdots]$
corresponding to each separately.

Considering a derivative as the limit of a difference, we can use
this idea of superposition to further extend the range of functionals
for which operators are defined. As an example, in virtue of the
fact that $\int_0^1 M(s)\,ds\int_0^1 N(s)\,ds$ is the first derivative with respect
to both $\mu$, $\nu$ of $\exp\int_0^1[\mu M(s)+\nu N(s)]\,ds$ evaluated at $\mu, \nu=0$, we
may define the operator corresponding to $\int_0^1 M(s)\,ds\int_0^1 N(s)\,ds$
as the corresponding derivative of the operator $\exp\int_0^1[\mu\mathbf{M}(s)+\nu\mathbf{N}(s)]\,ds$.
Then from a study of the properties of the solution of
Eq. (3-a) expanded in powers of $\mu$, $\nu$ we may readily verify that
$\int_0^1 \mathbf{M}(s)\,ds\int_0^1 \mathbf{N}(s)\,ds$ could also be evaluated directly by considering
$s$ as an ordering index on the operators.

Thus, the superposition rule permits a wide increase in the class 
of functionals for which we have defined operators. In fact, with 
some mathematical license, we have defined the operator for any 
functional. We wish to imagine that any functional can be repre- 
sented as a superposition of exponential ones in a manner analogous 
to the representation of an arbitrary function as a superposition 
of exponential functions. Thus, we expect to be able to write for
any functional $F[M(s)]$ (the true mathematical restrictions are
completely unknown to me)

$$ F[M(s)] = \int \exp\Big[ i\int_0^1 \mu(s)M(s)\,ds \Big]\,\mathcal{F}[\mu(s)]\,\mathcal{D}\mu(s), \tag{4-a} $$

where $\mathcal{F}[\mu(s)]$ is a new (complex) functional, the functional
transform of $F[M(s)]$, and $\int\cdots\mathcal{D}\mu(s)$ represents (some kind
of an) integration over the space of functions $\mu(s)$. For simplicity
we take the case of just one argument function $M(s)$. If $F[M(s)]$
is given, $\mathcal{F}$ can be determined perhaps from

$$ \mathcal{F}[\mu(s)] = \int \exp\Big[ -i\int_0^1 \mu(s)M(s)\,ds \Big]\,F[M(s)]\,\mathcal{D}M(s) \tag{5-a} $$

with suitable normalization. Then, if $\mathcal{F}$ is known, we define the
operator $\mathbf{F}[\mathbf{M}(s)]$ as

$$ \mathbf{F}[\mathbf{M}(s)] = \int \exp\Big[ i\int_0^1 \mu(s)\mathbf{M}(s)\,ds \Big]\,\mathcal{F}[\mu(s)]\,\mathcal{D}\mu(s), \tag{6-a} $$

where $\mu(s)$ is a numerical function. Since we have already defined
the operator $\exp[i\int_0^1 \mu(s)\mathbf{M}(s)\,ds]$ (by Eq. (3-a) with $\mu$ replaced
by $i\mu$), we now simply require superposition of such operators for
various $\mu(s)$. The extension to functionals of several variables is
evident.

With these definitions of operators in terms of exponential functionals,
the various theorems are easily proved. For example, the
theorem (18) of Sec. 3 is first readily demonstrated for the special
case that $F$ is an exponential (1-a). Thus, to calculate

$$ \mathbf{R} = \exp\int_0^1 \mathbf{P}(s)\,ds\;\exp\int_0^1 \mathbf{M}(s)\,ds $$

we must solve $d\mathbf{G}(s)/ds=[\mathbf{P}(s)+\mathbf{M}(s)]\mathbf{G}(s)$. We try a solution
$\mathbf{G}(s)=\mathbf{U}(s)\mathbf{X}(s)$, so that $d\mathbf{G}/ds=(d\mathbf{U}/ds)\mathbf{X}+\mathbf{U}\,d\mathbf{X}/ds=\mathbf{P}\mathbf{G}+\mathbf{U}\,d\mathbf{X}/ds$
in virtue of Eq. (15). Thus, we have a solution if
$d\mathbf{X}/ds=\mathbf{U}^{-1}\mathbf{M}\mathbf{G}=\mathbf{U}^{-1}\mathbf{M}\mathbf{U}\mathbf{X}=\mathbf{M}'\mathbf{X}$ with $\mathbf{M}'$ as in Eq. (16). Since
$\mathbf{G}(0)=1$, if $\mathbf{U}(0)=1$, we must have $\mathbf{X}(0)=1$, so the solution of the
$\mathbf{X}$ equation is $\exp\int_0^1 \mathbf{M}'(s)\,ds$ in accordance with the definition
(1-a), (2-a). (If $\mathbf{U}(0)\neq 1$, replace $\mathbf{X}$ by $\mathbf{X}\mathbf{U}^{-1}(0)$ throughout.)
Hence, Eq. (18) is established for exponential functionals. And
since the theorem involves $F$ linearly, it is therefore true for any
superposition of exponentials and hence for any functional which
can be defined by means of such superposition.


B. Momentum and Coordinate Operators 


In nonrelativistic quantum mechanics, without spin, all 
operators can be made up of coordinate operators and their con- 
jugate momentum operators. We show in this section how, at 
least in principle, all such operator functions can be disentangled. 

We can consider the case of one degree of freedom $Q$, and its
momentum $P$. (When more variables are present, they present no
new problem as variables corresponding to different independent
coordinates commute.) Thus, we are to disentangle the general
operator $F[P(s), Q(s)]$ subject to the condition

$$ PQ-QP=-i. \tag{7-a} $$

This is the problem solved in this section. We can satisfy the commutation
relation by putting $P=-i\,d/dQ$ (so that our solution
may have applications outside quantum mechanics, for the combination
of operators $X$, $d/dX$ are of frequent occurrence). Then
the operator $F$ can be defined by giving the function $g$ of $Q$ resulting
from

$$ g(Q)=F[P(s), Q(s)]\,f(Q) \tag{8-a} $$

for arbitrary functions $f(Q)$, where $Q(s)$ and $P(s)$ are interpreted
as multiplication by $Q$, and $-i$ times differentiation with respect
to $Q$ in the order defined by $F$.

To obtain the relation of $g$, $f$ suppose the $p$-dependence of $F$
can be expanded as a functional transform ($p(s)$, $q(s)$, $\nu(s)$ are
numerical functions)

$$ F[p(s), q(s)] = \int \exp\Big[ -i\int_0^1 p(s)\nu(s)\,ds \Big]\,\mathcal{F}[q(s), \nu(s)]\,\mathcal{D}\nu(s), \tag{9-a} $$

where $\mathcal{F}$ is a functional of $\nu(s)$ and of $q(s)$. Now to evaluate the
operator

$$ \exp\Big[ -i\int_0^1 P(s)\nu(s)\,ds \Big]\,\mathcal{F}[Q(s), \nu(s)] \tag{10-a} $$

we use our theorem (20) to disentangle the $P(s)$ operator. We use
$\alpha(s)=-i\nu(s)$ in Eq. (20), calling $y(s)=\int_0^s \nu(s')\,ds'$, so that Eq.
(10-a) is

$$ e^{-iy(1)P_1}\,\mathcal{F}[Q'(s), \nu(s)], \tag{11-a} $$

where $Q'(s)=e^{+iy(s)P}Q_s e^{-iy(s)P}$. As is well known from Taylor’s
theorem, the operator $e^{iaP}$ displaces $x$ by $a$ so that$^{22}$

$$ Q'(s) = Q_s + y(s). \tag{12-a} $$

Substitution into Eq. (11-a) finds all the $Q_s$ preceding the $P(1)$
so the operators are disentangled and $Q_s$ may be written simply
$Q_0$, whence we have

$$ F[P(s), Q(s)] = \int \exp\Big[ -iP_1\int_0^1 \nu(s)\,ds \Big]\,\mathcal{F}\Big[ Q_0+\int_0^s \nu(s')\,ds',\ \nu(s) \Big]\,\mathcal{D}\nu(s), \tag{13-a} $$

which in principle, at least, solves the problem.
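
The displacement property used in Eq. (12-a) can be checked on a polynomial. With $P=-i\,d/dx$ the operator $e^{iaP}$ is $\exp(a\,d/dx)$, and its Taylor series terminates on a polynomial. The sketch below is only an illustration: the coefficient-list representation, the function names, and the test polynomial are choices made for this example.

```python
import math

def derivative(coeffs):
    """Derivative of a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(c * x**k for k, c in enumerate(coeffs))

def shifted(coeffs, a):
    """Apply exp(a d/dx) = sum_n a**n/n! (d/dx)**n; the series ends for a polynomial."""
    result = [0.0] * len(coeffs)
    term = list(coeffs)
    n = 0
    while term:
        for k, c in enumerate(term):
            result[k] += c * a**n / math.factorial(n)
        term = derivative(term)
        n += 1
    return result

f = [1.0, -2.0, 0.5, 3.0]          # f(x) = 1 - 2x + 0.5x^2 + 3x^3 (arbitrary test function)
g = shifted(f, a=0.7)
print(evaluate(g, 1.2), evaluate(f, 1.9))   # the two numbers agree: g(x) = f(x + a)
```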
We can go a bit further and assume Eq. (9-a) can be inverted as

$$ \mathcal{F}[q(s), \nu(s)] = \int \exp\Big[ i\int_0^1 \nu(s)p(s)\,ds \Big]\,F[p(s), q(s)]\,\mathcal{D}p(s), \tag{14-a} $$

where $p(s)$ is a numerical function for transforming $F$. Also,
$y(s)=\int_0^s \nu(s')\,ds'$ is as good a function as $\nu(s)$ for purposes of
integration,$^{23}$ and we may write, substituting Eq. (14-a) into Eq.
(13-a),

$$ F[P(s), Q(s)] = \int\!\!\int \exp\Big[ -iP_1y(1)+i\int_0^1 p(s)\dot{y}(s)\,ds \Big]\,F[p(s), Q_0+y(s)]\,\mathcal{D}p(s)\,\mathcal{D}y(s), \tag{15-a} $$

the integral extending over all $p(s)$, and all $y(s)$ subject to $y(0)=0$.
Considering $P$ as $-i\,d/dQ$, the operator $F[P(s), Q(s)]$ may be
considered to operate on a function $f(Q)$ to produce another
function of $Q$. In particular, we are often interested in quantum
mechanics in the projection of this final function into a given
“final state” function $g$; that is, $F$ is often defined through its
matrix element

$$ (g^*|F|f) = \int g^*(Q)\,F[P(s), Q(s)]\,f(Q)\,dQ. $$

If we substitute into this expression (15-a), the $P_1$ can be considered
to act entirely on $g^*(Q)$, and since $\exp(+iyP)g(Q)=g(Q+y)$,
we find

$$ (g^*|F|f) = \int g^*(Q_0+y(1))\exp\Big[ i\int_0^1 p(s)\dot{y}(s)\,ds \Big]\,F[p(s), Q_0+y(s)]\,\mathcal{D}p(s)\,\mathcal{D}y(s)\,f(Q_0)\,dQ_0. $$

Define $q(s)$ as the numerical function $q(s)=Q_0+y(s)$ and write
finally ($q_0=Q_0$)

$$ (g^*|F|f) = \int\!\!\int g^*(q_1)\exp\Big[ i\int_0^1 p(s)\dot{q}(s)\,ds \Big]\,F[p(s), q(s)]\,\mathcal{D}p(s)\,\mathcal{D}q(s)\,f(q_0)\,dq_0\,dq_1, \tag{16-a} $$

where the integral $\mathcal{D}p(s)$ is over all $p(s)$ and the integral $\mathcal{D}q(s)$
is over all trajectories $q(s)$ which go between the initial position
$q_0$ and the final one $q_1$, the final integration on $dq_1$ being represented
explicitly. This represents a complete reduction of an
ordered operator $F[P(s), Q(s)]$ involving conjugate operators $P$,
$Q$ (7-a) to a property of the corresponding numerical functional
$F[p(s), q(s)]$, for in Eq. (16-a) $p(s)$, $q(s)$ are numerical functions
so that all the operators have been eliminated.

22 Or, differentiating $Q'(s)$ with respect to $s$, find $dQ'(s)/ds=ie^{iy(s)P}(PQ-QP)e^{-iy(s)P}\,dy/ds$.
If we use the commutation relation,
this is $dy/ds$, whence $Q'(s)$ differs from $y(s)$ by the constant
$Q$, the evident value of $Q'$ for $y(s)=0$, establishing Eq.
(12-a).

23 For, if $\mathcal{D}\nu(s)$ be considered as the limit as $\Delta\to0$ of an integration
over all the variables $\nu_i=\nu(s_i)$ with $s_{i+1}-s_i=\Delta$, then the
change is from the variables $\nu_i=(y_{i+1}-y_i)\Delta^{-1}$. Integration over
$y_i$ for all $i>0$ is equivalent to integration on all $\nu_i$. (Since
$d\nu_i=\Delta^{-1}dy_{i+1}$, the jacobian of the transformation is $\Delta^{-(1/\Delta)}$, which

This is obviously related to the lagrangian form of quantum
mechanics of C. In fact, for transitions, we are interested in the
operator $S=\exp[-i\int_0^T H(t)\,dt]$, where, for example, $H=(1/2m)P^2+V(Q, t)$.
The matrix elements of this, according to Eq. (16-a),
are (use $t$ for $s$ in range 0 to $T$)

$$ (g^*|S|f) = \int\!\!\int g^*(q_T)\exp\Big\{ i\int_0^T p(t)\dot{q}(t)\,dt - i\int_0^T (1/2m)p(t)^2\,dt - i\int_0^T V[q(t), t]\,dt \Big\}\,f(q_0)\,\mathcal{D}p(t)\,\mathcal{D}q(t)\,dq_0\,dq_T. $$

The integral on $p(t)$ is easily done. (See Appendix C for a more
general discussion of gaussian integrals.) Substitute $p(t)=m\dot{q}+p'(t)$,
so that

$$ \int_0^T [(1/2m)p(t)^2-p\dot{q}]\,dt = (1/2m)\int_0^T p'(t)^2\,dt - \tfrac12 m\int_0^T \dot{q}^2\,dt. $$

Also, $\mathcal{D}p(t)=\mathcal{D}p'(t)$, since $p$ and $p'$ differ by a constant at each $t$
(keeping $q(t)$ integration until later). The $\mathcal{D}p'(t)$ integral then
separates out and integrates to some constant. Hence, within
such a normalizing constant, the matrix element is

$$ (g^*|S|f) = \int g^*(q_T)\exp\Big\{ +i\int_0^T [\tfrac12 m\dot{q}(t)^2-V(q(t), t)]\,dt \Big\}\,f(q_0)\,\mathcal{D}q(t)\,dq_0\,dq_T. \tag{17-a} $$

That is, the transition amplitude from point $q_0$ at $t=0$ to $q_T$ at
$t=T$ is the integral over all trajectories connecting these points
of $\exp i\int_0^T L[\dot{q}(t), q(t)]\,dt$, $L$ being the lagrangian for this problem.
This is the fundamental theorem on which the interpretation of
C is based.

The fact that the nonrelativistic quantum mechanical operators
(other than spin) can be expressed in terms of an integral over


is only a change of normalization, and we are disregarding normalization factors.)

trajectories is based on the fact that the operators involved satisfy
Eq. (7-a). If other operators are involved, such as Pauli’s spin
operators $\sigma$, or Dirac matrices $\gamma_\mu$, which satisfy different commutation
rules, a complete reduction eliminating all the operators is
not nearly so easily effected. It is possible to eliminate the $p$
operators in the Dirac or Pauli equation and get forms like Eq.
(17-a), but the amplitude for a single trajectory is then a hypercomplex
quantity in the algebra of the $\gamma_\mu$ or $\sigma$. We give an example
of this.

Without disentangling the $\gamma_\mu$ operators we shall disentangle
the momentum operators $p_\mu=i\partial/\partial x_\mu$ from the operator $\theta(W')$
of Eq. (72), which is a key operator in the analyses of the Dirac
equation.

If we write

$$ \theta(W') = \exp\Big[ i\int_0^{W'} \gamma_\mu(w)p_\mu(w)\,dw \Big]\exp\Big[ -i\int_0^{W'} \boldsymbol{B}(x(w), w)\,dw \Big], $$

the $p_\mu$ operators are already in exponential form and no fourier
transforms are necessary. We may disentangle the $p_\mu$ in the first
integral by a direct use of the theorem (20) with $P(s)=p_\mu(w)$
($p_\mu$ for each value $\mu$ is disentangled separately) and $\alpha(s)=i\gamma_\mu(w)$.
The resulting $x_\mu'(w)$ operator is $x_\mu(w)+\int_0^w \gamma_\mu(w')\,dw'$ just as in
Eq. (12-a) so that we obtain

$$ \theta(W') = \exp\Big[ ip_{\mu,W'}\int_0^{W'} \gamma_\mu(w)\,dw \Big]\exp\Big\{ -i\int_0^{W'} \boldsymbol{B}\Big[ x(0)+\int_0^w \gamma(w')\,dw',\ w \Big]\,dw \Big\}. \tag{18-a} $$

Here the $x_\mu$ and $p_\mu$ are completely separated, but the $\gamma_\mu$ are
thoroughly entangled. The $w$ in $\boldsymbol{B}$ keeps track of the fact that
the $\gamma_\mu$ in its definition acts at the order $w$; thus, it is
$\gamma_\mu(w)B_\mu[x_\nu(0)+\int_0^w \gamma_\nu(w')\,dw']$. A similar separation may be
made in the operator for self-action (74) which now is

$$ R = \int_0^\infty \exp[ip_{\mu,W}C_\mu(W)]\exp\Big\{ -\tfrac12 ie^2\int_0^W\!\!\int_0^W \dot{C}_\mu(w')\dot{C}_\mu(w'')\,\delta_+\big(\{C_\nu(w')-C_\nu(w'')\}^2\big)\,dw'\,dw'' \Big\}\exp(-imW)\,dW, \tag{19-a} $$

where one must substitute $C_\mu(w)=\int_0^w \gamma_\mu(w')\,dw'$, $\dot{C}_\mu(w)=\gamma_\mu(w)$
($p_{\mu,W}$ refers to the momentum operator operating on the final
state). All reference to space coordinates has disappeared. The
problem of self-energy of an electron is reduced to the algebraic
one of disentangling a combination of $\gamma_\mu$ in an expression in
which, however, they are almost hopelessly tangled up. Not much
has been done with this expression. (It is suggestive that perhaps
coordinates and the space-time they represent may in some future
theory be replaced completely by an analysis of ordered quantities
in some hypercomplex algebra.)

Since the spin operators are so simple and fundamental to
quantum mechanics, they present some interesting unsolved
problems. For example, if $F[x(s), y(s), z(s)]$ is a known functional
of a three-space trajectory $x(s)$, $y(s)$, $z(s)$, evaluate in terms of
this functional, the operator $F[\sigma_x(s), \sigma_y(s), \sigma_z(s)]$, where the $\sigma_x$,
$\sigma_y$, $\sigma_z$ are the anticommuting Pauli operators of unit square satisfying
$\sigma_x\sigma_y\sigma_z=i$. The corresponding problem with Dirac operators
is a kind of four-dimensional generalization of this. Alternatively,
since the Dirac operators can be represented as the outer product
of two commuting sets of Pauli operators, the solution of the
problem with Pauli operators could be directly extended to the
Dirac case. The Pauli matrices (times $i$) are the basis for the
algebra of quaternions so that the solution of such problems
might open up the possibility of a true infinitesimal calculus of
quantities in the field of hypercomplex numbers.
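
The algebraic statements in this paragraph are easy to verify with explicit matrices. The sketch below assumes the standard $2\times2$ representation of the Pauli operators (one conventional choice, not specified in the text) and checks that $\sigma_x\sigma_y\sigma_z=i$ and that the matrices $i\sigma_x$, $i\sigma_y$, $i\sigma_z$ each square to $-1$ and anticommute, which are the defining properties of quaternion units.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

identity = [[1, 0], [0, 1]]
sx = [[0, 1], [1, 0]]            # standard representation (assumed for this example)
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]

triple = matmul(matmul(sx, sy), sz)                 # sigma_x sigma_y sigma_z
qx, qy, qz = scale(1j, sx), scale(1j, sy), scale(1j, sz)
squares_ok = all(matmul(q, q) == scale(-1, identity) for q in (qx, qy, qz))
xy, yx = matmul(qx, qy), matmul(qy, qx)
anticommute_ok = all(xy[i][j] + yx[i][j] == 0 for i in range(2) for j in range(2))
print(triple == scale(1j, identity), squares_ok, anticommute_ok)
```

In this representation the product of two of the units is minus the third, `matmul(qx, qy) == scale(-1, qz)`, so the usual quaternion multiplication table is obtained after reversing the sign of one unit.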


C. Gaussian Functionals 


In a large number of problems the operators appear in exponentials
only up to the second degree. For this reason it is handy
to have available a formula for the integration of gaussian functionals.
We can define a gaussian functional $G[y(s)]$, of one
function $y(s)$, as one of the form $G[y(s)]=\exp iE[y(s)]$ with
$E[y(s)]$ quadratic. Thus, we have

$$ E[y(s)] = \tfrac12\int_0^1\!\!\int_0^1 A(t, s)y(t)y(s)\,dt\,ds + \int_0^1 B(s)y(s)\,ds, \tag{20-a} $$

where $A(t, s)$, $B(s)$ are functions independent of $y$ (that is, $G$ is
gaussian if the second functional derivative of $\ln G$ is independent
of $y$). Gaussian functionals of several variables are of frequent
occurrence. All the quantum field theory hamiltonians and
lagrangians are of this form in the field variables. A formula for
the integral of $G[y(s)]$ over all paths $y(s)$ has been found useful.
It will be developed here. Consider the integral (we suppose $A(t, s)$
real, or at least has a positive definite imaginary part)


$$ I[A, B] = \int \exp\{iE[y(s)]\}\,\mathcal{D}y(s). \tag{21-a} $$

It is a functional of $A(t, s)$ and $B(s)$. First, the dependence on $B$
may be determined, as follows.

Let $\bar{y}(s)$ be that trajectory which makes the exponent $E[y(s)]$
an extremum. That is, $\bar{y}$ is a solution of (assuming $A$ symmetric)

$$ \int_0^1 A(t, s)\bar{y}(s)\,ds = -B(t). \tag{22-a} $$

Or, if $N$ be the reciprocal kernel to $A$ (which can often most easily
be found merely by solving Eq. (22-a)),

$$ \bar{y}(s) = -\int_0^1 N(t, s)B(t)\,dt. \tag{23-a} $$

Then put $y(s)=\bar{y}(s)+x(s)$. (Note, $\mathcal{D}y(s)=\mathcal{D}x(s)$.) In virtue of
Eq. (22-a), one finds $E[y]=E[\bar{y}]+\tfrac12\int_0^1\!\int_0^1 A(t, s)x(t)x(s)\,dt\,ds$.
Here, $E[\bar{y}(s)]$ can also be written explicitly as $-\tfrac12\int_0^1\!\int_0^1 B(s)N(t, s)B(t)\,dt\,ds$,
using Eq. (23-a). Substituting this into Eq.
(21-a), we see a factor $\exp\{iE[\bar{y}(s)]\}=G[\bar{y}(s)]$ may be taken
outside the integral, as it is independent of $x(s)$. Hence, we have

$$ I[A, B] = G[\bar{y}(s)]\,J[A], \tag{24-a} $$

where

$$ J[A] = \int \exp\Big[ \tfrac12 i\int_0^1\!\!\int_0^1 A(t, s)x(t)x(s)\,dt\,ds \Big]\,\mathcal{D}x(s) = I[A, 0] \tag{25-a} $$

does not depend on $B$, and

$$ G[\bar{y}(s)] = \exp\{iE[\bar{y}(s)]\} = \exp\Big[ -\tfrac12 i\int_0^1\!\!\int_0^1 B(s)N(t, s)B(t)\,dt\,ds \Big]. \tag{26-a} $$

Often this is as far as it is necessary to go, as the dependence of
$I$ on $B$ may have been all that is necessary to know, $J[A]$ being
a kind of normalizing factor that is not of importance or that can
be obtained in some other manner.
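
The algebra leading to Eqs. (24-a) and (26-a) has an exact finite-dimensional counterpart, which the following sketch checks for two variables. It is an illustration only: the kernel $A(t,s)$ becomes a symmetric $2\times2$ matrix, $B(s)$ a two-component list, the reciprocal kernel $N$ the matrix inverse of $A$, and all numerical values are arbitrary.

```python
def quad(A, B, y):
    """E[y] = (1/2) sum_ts A[t][s] y[t] y[s] + sum_s B[s] y[s] for 2-component y."""
    q = sum(A[t][s] * y[t] * y[s] for t in range(2) for s in range(2))
    return 0.5 * q + sum(B[s] * y[s] for s in range(2))

A = [[2.0, 0.5], [0.5, 1.0]]                 # symmetric kernel (arbitrary example values)
B = [0.3, -1.2]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
N = [[A[1][1] / det, -A[0][1] / det],        # reciprocal kernel: N is the inverse of A
     [-A[1][0] / det, A[0][0] / det]]

y_bar = [-sum(N[t][s] * B[t] for t in range(2)) for s in range(2)]   # Eq. (23-a)
E_bar = -0.5 * sum(B[s] * N[t][s] * B[t] for t in range(2) for s in range(2))

x = [0.4, -0.7]                              # arbitrary displacement from the extremum
y = [y_bar[s] + x[s] for s in range(2)]
lhs = quad(A, B, y)
rhs = E_bar + 0.5 * sum(A[t][s] * x[t] * x[s] for t in range(2) for s in range(2))
print(lhs, rhs)                              # equal: E[y] = E[y_bar] + (1/2) x A x
```

Since the part depending on `x` does not involve `B`, the factor $\exp\{iE[\bar{y}]\}$ comes outside the integral over `x`, which is the content of Eq. (24-a).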

Having this form for $I$, we may obtain other integrals. For
example, since

$$ \delta I[A, B]/\delta B(t) = i\int G[y(s)]\,y(t)\,\mathcal{D}y(s), $$

this integral can be immediately evaluated by differentiation of
the expression (24-a) for $I$ with respect to $B(t)$. Since

$$ \delta G[\bar{y}(s)]/\delta B(t) = iG[\bar{y}(s)]\,\delta E[\bar{y}(s)]/\delta B(t) = -i\int_0^1 B(s)N(t, s)\,ds\cdot G[\bar{y}(s)], $$

we find

$$ \int G[y(s)]\,y(t)\,\mathcal{D}y(s) = \bar{y}(t)\,I[A, B]. \tag{27-a} $$

Differentiating a second time, since $\delta\bar{y}(t)/\delta B(t')=-N(t, t')$, one
finds

$$ \int G[y(s)]\,y(t)y(t')\,\mathcal{D}y(s) = (\bar{y}(t)\bar{y}(t')+iN(t, t'))\,I[A, B], \tag{28-a} $$

etc., for higher powers of $y$.

Incidentally, this permits us to obtain the properties of $J$. For
the left-hand side of (28-a) is also $-2i\,\delta I[A, B]/\delta A(t, t')$. In the
special case $B=0$, we have $\bar{y}=0$ from (23-a), and since $J[A]=I[A, 0]$,
we find

$$ \delta J/\delta A(t, t') = -\tfrac12 N(t, t')\,J. \tag{29-a} $$




This property of J determines it to within a numerical factor 
independent of A. 

We have used these theorems, or something like them, on 
various occasions. One example was the passage from Eq. (16-a) 
to Eq. (17-a). In more generality, put 


$$F=\exp\Big[-i\int_0^T H(p(s), q(s))\,ds\Big],$$

where $H$ is quadratic in $p$. The integrations on $p(s)$ in Eq. (16-a)
now represent an example of our theorem with $y=p$ and
$E[p(s)]=\int[p\dot q-H(p, q)]\,ds$. The extremum requires $\partial H/\partial p=\dot q$. If the
solution of this is $\bar p$, considered as a function of $\dot q$, $q$, then the
integral on $p$ produces within unimportant factors an exponential
of $i\int[\bar p\dot q-H(\bar p, q)]\,ds$, that is, $i\int_0^T L\,ds$, where $L$ is the lagrangian.
This example shows that in our discussion, we have not been
sufficiently rigorous mathematically, for the important problem
of the order of noncommuting operators $p$, $q$, in the original
definition of $H$ does not seem to have arisen.

A second example is the integration of $\exp(i\int L\,dt)$ when $L$ is
quadratic in $\dot q$, $q$. For the forced harmonic oscillator where
$L=\tfrac12(\dot q^2-\omega^2q^2)+\gamma(t)q(t)$, the integral was carried out in III,
footnote 7. The operator $A(t, s)$ is $-[(d^2/dt^2)+\omega^2]\,\delta(t-s)$, the
inverse $N$ of which involves sines and cosines but is not unique.
However, in this case boundary conditions exist at the end points
$q(0)$, $q(T)$, and these boundary conditions determine $N$ and also
restrict the range of $y$ integration. The footnote serves as a model of
what to do under such circumstances and will not be discussed further
here.

The problem of integrating 


exp[-if in(t)Ay(an| -exp| 3 (24,/ax,¥dr| (xe) 


over all distributions of field $A_\mu(1)$ required in III, Sec. VIII,
serves as a further example. Here $y(s)$ is replaced by $A_\mu(1)$, and
$B(s)$ by $j_\mu(1)$. The operator $A(t, s)$ becomes $\Box_2^2\,\delta(2, 1)$, the inverse
of which is again not unique. The inverse $N(t, s)$ required in III
is $\delta_+(s_{12}^2)$. (The boundary conditions required to define this particular
inverse are probably related to the condition that no
photons are supplied in the past and none are wanted in the
future, so that the inverse must have no positive frequencies for
$t\to-\infty$ and no negative ones for $t\to+\infty$.) Thus, $E[\bar y(s)]$ becomes
the important quantity $-\tfrac12\iint j_\mu(1)\,j_\mu(2)\,\delta_+(s_{12}^2)\,d\tau_1\,d\tau_2$, so that
Eq. (24-a) gives Eq. (43) or III, Eq. (48), which we had taken
such pains in III to derive in a more rigorous manner. In none of
these examples do we require $J$.

A more complicated example is that of the analysis of the
operators corresponding to the electron-positron field given in I,
Appendix. If electrons obeyed Bose statistics, the commutation
rules would have been altered, the net effect being just to change
a few signs in the final expressions. Analyzed as an Einstein-Bose
field, however, the operators $\Psi$ can be considered as ordinary
functions, and the lagrangian technique may be used. The problem
then requires gaussian integrals (actually, integrals of exponentials
of bilinear expressions, but these are as easy to work out). The $y$
corresponds to $\Psi$ (or $\Psi^*$), the $A(t, s)$ is related to the Dirac
hamiltonian, and its inverse $K_+^{(A)}(2, 1)$ replaces $N$. The problem
of determining $C_v$ corresponds to that of finding $J[A]$. The
problem is complicated somewhat by the necessity of keeping the
order of the $\Psi$-operators correctly.

The relation of problems with operators obeying Fermi-Dirac
statistics and those with the same operators obeying Einstein-Bose
commutation rules is very close. The results of the former in
practical cases,²⁴ at least, may be obtained from the latter by
simply altering some signs. The Einstein-Bose case is very easily
analyzed by ordered operator algebra (as in Sec. 7) or by the
lagrangian integral methods. The anticommuting operators seem
at first sight more complicated; but this they cannot be, as the
results are just as simple. It would seem worth while to develop
the analysis of anticommuting operators in much more detail than
has been given here. Presumably, good use can be made of the
similarity to the Einstein-Bose case. The theorems developed in
analysis of this problem may conceivably have application in the
problem of disentangling Pauli spin operators.

24 The only known practical case, of course, is the electron-positron
field. Here the problem has been completely worked out.
I seem to be affected by the disease so prevalent today in theoretical
physics, to delight in seeing a very general method of
solving a problem, when actually in physics only one example of
the type of problem exists and this has already been worked out.


D. Fock’s Parameterization of the Dirac Equation 


We wish to call attention to an interesting alternative method of
parameterizing the Dirac equation, suggested by Fock.⁴ It, like
that of Sec. 8, would also have permitted us to pass directly to
the formulation of electrodynamic problems. It is more readily
interpreted than that of Sec. 8.²⁵

As a consequence of the Dirac equation $(i\boldsymbol{\nabla}-\boldsymbol{B})\psi=m\psi$, $\psi$ also
satisfies $(i\boldsymbol{\nabla}-\boldsymbol{B})(i\boldsymbol{\nabla}-\boldsymbol{B})\psi=m^2\psi$. Expanding the operator, this is
equivalent to

$$\big[(i\partial/\partial x_\mu-B_\mu)^2-\tfrac12\sigma_{\mu\nu}F_{\mu\nu}\big]\psi=m^2\psi, \tag{30-a}$$

with $\sigma_{\mu\nu}=\tfrac12 i(\gamma_\mu\gamma_\nu-\gamma_\nu\gamma_\mu)$. This differs from the Klein-Gordon
equation only through the addition of the term $-\tfrac12\sigma_{\mu\nu}F_{\mu\nu}$, where
$F_{\mu\nu}=(\partial B_\nu/\partial x_\mu)-\partial B_\mu/\partial x_\nu$ is the field tensor.

Just as in the Klein-Gordon case, III, Appendix A, this can be
converted by the aid of a fifth parameter to Fock's equation,

$$i\,\partial\varphi/\partial u=\tfrac12\big[(i\partial/\partial x_\mu-B_\mu)^2-\tfrac12\sigma_{\mu\nu}F_{\mu\nu}\big]\varphi, \tag{31-a}$$

for which the special solution $\varphi=\exp(-\tfrac12 im^2u)\,\psi$ leads back to Eq.
(30-a). It can then be analyzed by the lagrangian method. The
final result is that the amplitude to go from one point to another
[see III, Eq. (5-a)] is the sum over all trajectories $x_\mu(u)$ of the
hypercomplex amplitude

$$\exp\Big\{-i\int_0^{u_0}\big[\tfrac12(dx_\mu/du)^2+(dx_\mu/du)\,B_\mu(x)-\tfrac14\sigma_{\mu\nu}(u)\,F_{\mu\nu}(x)\big]\,du\Big\}, \tag{32-a}$$

the order of operation of the $\sigma_{\mu\nu}$ being determined by the parameter
$u$. This, in fact, is the lagrangian formulation of the Dirac equation
suggested in C, Sec. XIV.

A rotation (and Lorentz transformation) by angle $\omega_{\mu\nu}$ in the
$\mu\nu$-plane, is represented in Dirac theory by the operator
$\exp(\tfrac14 i\omega_{\mu\nu}\sigma_{\mu\nu})$. (The summation on both $\mu$ and $\nu$ accounts for a
factor 2.) Hence, we can say that Eq. (32-a) means that the amplitude
for arrival is $\exp(iS)$, where $S$ is the classical action
$-\int\big(\tfrac12(dx_\mu/du)^2+B_\mu\,dx_\mu/du\big)\,du$, but the orientation represented
by the hypercomplex amplitude has rotated at each point in its
path at an angular velocity (per $du$) equal to the field strength at
that point. (Angular velocity in four dimensions is an antisymmetric
tensor of second rank, as is the field strength.)

Since the potentials appear in exponential form, this may be 
directly connected to the form representing the action of virtual 
photons. The result is a set of rules like that for the Klein-Gordon
case, Sec. IX, but with an additional coupling $F_{\mu\nu}\gamma_\mu\gamma_\nu$. They may
be shown to be algebraically equivalent to the rules usually given
for the Dirac equation, but are somewhat more complicated and 
not very interesting. There are some properties of the Dirac 
electron, however, which are more obvious in this formulation 
than in the usual one, and these we will discuss. 

25 Y. Nambu, Prog. Theor. Phys. (Japan) 5, 82 (1950).

It is apparent from Eq. (32-a) that in the classical limit the
trajectory is that of minimum $S$ and therefore satisfies

$$d^2x_\mu/du^2=(dx_\nu/du)\,F_{\mu\nu}. \tag{33-a}$$

(Hence, $(dx_\mu/du)^2=(ds/du)^2$ is a constant of the motion where $s$
is the proper time, and the minimum action $S$ is $-\tfrac12(ds/du)^2u$ plus
a term independent of $u$. Since this is to vary as $-\tfrac12m^2u$, we find
$ds=m\,du$.) As $\hbar\to0$, the magnetic moment approaches zero and
does not affect the trajectory. But since the intrinsic spin angular
momentum also goes to zero, the rate of precession of spin has a
classical limit. For completeness we should also give the equation
of motion of the spin axis (just as the spot on a billiard cue ball
has a motion, although it does not affect the trajectory of the
ball). From Eq. (32-a) we see the spin axis precesses at angular
velocity $F_{\mu\nu}$ (per $du$). (These are well-known results of the WKB
approximation method when applied to the Dirac equation.)
Since Eq. (33-a) says only that $dx_\mu/du$ precesses at the same angular
velocity, we can summarize the classical equations of
motion, and of spin precession for a Dirac electron as: The velocity
vector and spin plane are fixed in a four-dimensional coordinate
system turning at each instant at an angular velocity per unit proper
time equal to $e/m$ times the field strength acting on the electron at that
instant. (For example, for a slowly moving electron in a magnetic
field $B$ the velocity vector revolves about the magnetic field as an
axis at angular velocity $\omega=(e/m)B$, the cyclotron frequency. The
spin does likewise, precessing therefore at the same frequency,
which is twice the Larmor frequency.)
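To put numbers on the parenthetical example, the short sketch below evaluates the cyclotron frequency $(e/m)B$ and the Larmor frequency $(e/2m)B$ in SI units for a slowly moving electron. The field value of 1 tesla is an arbitrary illustrative choice, and the function names are hypothetical helpers introduced only for this illustration.

```python
# Cyclotron versus Larmor angular frequency for a slow electron (SI units).
E_CHARGE = 1.602176634e-19   # elementary charge in coulombs (exact SI value)
E_MASS = 9.1093837015e-31    # electron mass in kilograms (CODATA 2018)

def cyclotron_frequency(b_tesla):
    """Angular velocity (e/m) B of the velocity vector, in rad/s."""
    return E_CHARGE * b_tesla / E_MASS

def larmor_frequency(b_tesla):
    """Larmor angular frequency (e/2m) B, in rad/s."""
    return E_CHARGE * b_tesla / (2.0 * E_MASS)

omega_c = cyclotron_frequency(1.0)   # illustrative field of 1 T
omega_l = larmor_frequency(1.0)
ratio = omega_c / omega_l            # spin precession rate over Larmor rate
```

The ratio is 2 for any field strength, which is the statement in the text that the spin of a Dirac electron precesses at twice the Larmor frequency.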

I have expended considerable effort to obtain an equally simple
word description of the quantum mechanics of the Dirac equation.
Very many modes of description have been found, but none are
thoroughly satisfactory. For example, that of Eq. (32-a) is incomplete,
even aside from the geometrical mysteries involved in
the superposition of hypercomplex numbers. For in (32-a) the
field enters in two apparently unrelated ways, once into defining $S$
and again in the rotation rate. In the classical limit both effects
of the field can be neatly stated in one principle. What makes
things particularly simple in quantum mechanics if, for a diffusing
wave, a rotation at rate $F_{\mu\nu}$ is accompanied by a phase shift equal
to the line integral of $A_\mu$?²⁶


In the case that the fields $F_{\mu\nu}$ are constant in space and time,
the operator factor $\exp(\tfrac14 iu_0\sigma_{\mu\nu}F_{\mu\nu})$ is independent of the trajectory
and factors out of Eq. (32-a). The remaining path integral is
gaussian and can be carried out exactly (Appendix C), giving the
results of Fock⁴ and Nambu.²⁵

If the operator on the right-hand side of Eq. (31-a) is considered
as a type of hamiltonian, the rate of change with $u$ of all the relevant
physical quantities (given by the commutator with this
operator) are very easily interpreted by classical analogy.

There are, of course, twice as many solutions of Eq. (30-a) as
solutions of the Dirac equation (50). (The others correspond to
Eq. (50) with negative $m$.) If $\chi$ is a solution of Eq. (30-a), the
projected part $\psi=(2m)^{-1}(i\boldsymbol{\nabla}-\boldsymbol{B}+m)\chi$ solves the Dirac equation
(50). Projection operators must still be used, therefore, in calculating
matrix elements if Eq. (30-a) in perturbation is used
instead of the Dirac equation.²⁷

26 If the $\tfrac12\sigma_{\mu\nu}F_{\mu\nu}$ term is considered to have a coefficient $a$
analogous to a kind of anomalous magnetic moment, difficulties
arise in the resulting theory unless $a=1$ or $a=0$. Thus, the real
part of $L$, in the amplitude for a vacuum to remain a vacuum,
$C_v=\exp(-L)$, should always be positive if the theory is to be
easily interpreted (see I, Sec. V). For general $a$, it seems that the
real part of $L$ is positive for some processes (or potentials), negative
for others. It is always positive only if $a=1$. But for $a=0$ it is
always negative, so we can reinterpret the theory in this case as
referring to Bose particles, in which case $C_v$ should be $\exp(+L)$
(I, Sec. V). For $a=0$, Eq. (30-a) becomes the Klein-Gordon
equation, of course.

27 A convenient way to make the correspondence of solutions $\chi$
of Eq. (30-a) and $\psi$ of the Dirac equation unique is to assume $\chi$
is also an eigenfunction of $\gamma_5$, that is, $\gamma_5\chi=i\chi$. This is possible, as
$\gamma_5$ commutes with the operators of Eq. (30-a). Then, for each $\psi$,
the corresponding $\chi$ is $\chi=(1-i\gamma_5)\psi$.

A variational principle is developed for the lowest energy of a system described by a path integral. It is
applied to the problem of the interaction of an electron with a polarizable lattice, as idealized by Fröhlich.
The motion of the electron, after the phonons of the lattice field are eliminated, is described as a path
integral. The variational method applied to this gives an energy for all values of the coupling constant.
It is at least as accurate as previously known results. The effective mass of the electron is also calculated,
but the accuracy here is difficult to judge.


An electron in an ionic crystal polarizes the lattice
in its neighborhood. This interaction changes the 
energy of the electron. Furthermore, when the electron 
moves the polarization state must move with it. An 
electron moving with its accompanying distortion of 
the lattice has sometimes been called a polaron. It has 
an effective mass higher than that of the electron. We 
wish to compute the energy and effective mass of such 
an electron. A summary giving the present state of 
this problem has been given by Fröhlich.¹ He makes
simplifying assumptions, such that the crystal lattice
acts much like a dielectric medium, and that all the
important phonon waves have the same frequency. We
will not discuss the validity of these assumptions here,
but will consider the problem described by Fröhlich
as simply a mathematical problem. Aside from its 
intrinsic interest, the problem is a much simplified 
analog of those which occur in the conventional meson 
theory when perturbation theory is inadequate. The 
method we shall use to solve the polaron problem is 
new, but the pseudoscalar symmetric meson field 
problems involve so many further complications that 
it cannot be directly applied there without further 
development. 
We shall show how the variational technique which 
is so successful in ordinary quantum mechanics can be 
extended to integrals over trajectories. 


STATEMENT OF THE PROBLEM 


With Fröhlich's assumptions, the problem is reduced
to that of finding the properties of the following
Hamiltonian:

$$H=\tfrac12P^2+\sum_K a_K^+a_K+i(\sqrt2\,\pi\alpha/V)^{1/2}\sum_K\frac1K\big[a_K^+\exp(-iK\cdot X)-a_K\exp(iK\cdot X)\big]. \tag{1}$$

Here $X$ is the vector position of the electron, $P$ its
conjugate momentum, $a_K^+$, $a_K$ the creation and annihilation
operators of a phonon (of momentum $K$). The
frequency of a phonon is taken to be independent of $K$.
Our units are such that $\hbar$, this frequency, and the
electron mass are unity. The quantity $\alpha$ acts as a
coupling constant, which may be large or small. In
conventional units it is given by

$$\alpha=\frac12\Big(\frac1{\epsilon_\infty}-\frac1\epsilon\Big)\frac{e^2}{\hbar\omega}\Big(\frac{2m\omega}{\hbar}\Big)^{1/2},$$

where $\epsilon$, $\epsilon_\infty$ are the static and high frequency dielectric
constant, respectively. In a typical case, such as NaCl,
$\alpha$ may be about 5. The wave function of the system
satisfies ($\hbar=1$)

$$i\,\partial\psi/\partial t=H\psi, \tag{2}$$

so that if $\varphi_n$ and $E_n$ are the eigenfunctions and eigenvalues
of $H$,

$$H\varphi_n=E_n\varphi_n, \tag{3}$$

then any solution of (2) is of the form

$$\psi=\sum_n C_n\varphi_n e^{-iE_nt}.$$

1 H. Fröhlich, Advances in Physics 3, 325 (1954). References to
other work are given here.


Now we can cast (1) and (2) into the Lagrangian form
of quantum mechanics and then eliminate the field
oscillators (specializing to the case that all phonons are
virtual). Doing this in exact analogy to quantum
electrodynamics,² we find that we must study the sum
over all trajectories $X(t)$ of $\exp(iS')$, where

$$S'=\frac12\int\Big(\frac{dX}{dt}\Big)^2dt+2^{-3/2}\alpha i\iint|X_t-X_s|^{-1}e^{-i|t-s|}\,dt\,ds. \tag{4}$$

This sum will depend on the initial and final conditions
and on the time interval $T$. Since it is a solution of the
Schrödinger Eq. (2), considered as a function of $T$ it
will contain frequencies $E_n$, the lowest of which we seek.
It is difficult to isolate the lowest frequency, however.

2 R. P. Feynman, Phys. Rev. 80, 440 (1950).

For that reason, consider the mathematical problem
of solving

$$\partial\psi/\partial t=-H\psi, \tag{5}$$

without question as to the meaning of $t$. This has the
same eigenvalues and eigenfunctions as (3), but a
solution will have the form

$$\psi=\sum_n C_ne^{-E_nt}\varphi_n.$$

For large $t$ any solution therefore asymptotically dies
out exponentially, the last exponent surviving being
that of the lowest $E$, say $E_0$.
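The way Eq. (5) isolates the lowest eigenvalue can be illustrated with a toy spectrum. The sketch below assumes a made-up set of three energies and positive weights (hypothetical values, not from the text), forms the sum $\sum_n c_n e^{-E_nT}$, and estimates the decay rate from the logarithmic change over one unit of $T$.

```python
import math

def kernel(T, energies, weights):
    """Sum over n of c_n * exp(-E_n * T), the form taken by a solution of Eq. (5)."""
    return sum(c * math.exp(-E * T) for E, c in zip(energies, weights))

def decay_rate(T, energies, weights, dT=1.0):
    """Estimate of the dominant exponent: -(d/dT) ln K, by a finite difference."""
    return -(math.log(kernel(T + dT, energies, weights))
             - math.log(kernel(T, energies, weights))) / dT

# Hypothetical spectrum and overlaps, chosen only for illustration.
energies = [-1.3, 0.4, 2.0]
weights = [0.2, 0.5, 0.3]

rate_short = decay_rate(0.5, energies, weights)   # still contaminated by excited states
rate_long = decay_rate(30.0, energies, weights)   # dominated by the lowest energy
```

For short intervals the estimate lies well above the lowest energy, while for long intervals it settles on $E_0=-1.3$, which is why the path integral must be estimated for large $T$.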

An equation such as (5) can be converted to a path
integral just as easily as (2) is, and the integral over
the oscillator coordinates can again be done in an
analogous way. The Lagrangian form corresponding to
(5) turns out to be

$$K=\int\exp(S)\,\mathcal{D}X(t), \tag{6}$$

$$S=-\frac12\int\Big(\frac{dX}{dt}\Big)^2dt+2^{-3/2}\alpha\iint|X_t-X_s|^{-1}e^{-|t-s|}\,dt\,ds. \tag{7}$$

This is just as one might expect from replacing $t$ in (4)
by $-it$. Now, since $K$ is a solution of (5), its asymptotic
form for a large $t$ interval, 0 to $T$, is

$$K\sim e^{-E_0T} \tag{8}$$

as $T\to\infty$. Therefore, we must estimate the path
integral (6) for large $T$.


VARIATIONAL PRINCIPLE 


The method we shall use is a type of variational
method. Choose any $S_1$ which is simple and purports
to be some sort of approximation to $S$. Then write

$$\int\exp(S)\,\mathcal{D}X(t)=\int\exp(S-S_1)\exp(S_1)\,\mathcal{D}X(t). \tag{9}$$

Now this last expression can be looked upon as the
average of $\exp(S-S_1)$, the average being taken with
positive weight $\exp S_1$. But for any set of real quantities
$f$ the average of $\exp f$ exceeds the exponential of the
average,

$$\langle\exp f\rangle\ge\exp\langle f\rangle. \tag{10}$$

Hence if in (9) we replace $S-S_1$ by its average,

$$\langle S-S_1\rangle=\int(S-S_1)\exp(S_1)\,\mathcal{D}X(t)\Big/\int\exp(S_1)\,\mathcal{D}X(t), \tag{11}$$

we will underestimate the value of (9). Therefore, if
$E$ is computed from

$$\int\exp\langle S-S_1\rangle\exp(S_1)\,\mathcal{D}X(t)\sim\exp(-ET), \tag{12}$$

then we know that $E$ exceeds the true $E_0$,

$$E\ge E_0. \tag{13}$$


If there are any free parameters in $S_1$ we can choose
as the "best" values those which minimize $E$.

Since $\langle S-S_1\rangle$ defined in (11) is proportional to $T$,
let us write

$$\langle S-S_1\rangle=sT. \tag{14}$$

Furthermore, the factor $\exp\langle S-S_1\rangle$ in (12) is constant,
of course, and may be taken outside the integral.
Finally, suppose the lowest energy $E_1$ for the action $S_1$
is known,

$$\int\exp(S_1)\,\mathcal{D}X(t)\sim\exp(-E_1T), \tag{15}$$

then we have

$$E=E_1-s \tag{16}$$

from (12), with $s$ given by (11) and (14). (In the case
that $S$ and $S_1$ are both simple actions [of the form of
(18) below] this can readily be shown to be equivalent
to the usual variational principle.)
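The inequality (10) and the resulting bound can be tried out on a discrete stand-in for the path integral, in which the "paths" are a handful of labelled configurations and the functional integral becomes a finite sum. This is only a sketch under that assumption; the action values below are arbitrary illustrative numbers.

```python
import math

def variational_bound(S, S1):
    """Return (Z, bound), where Z = sum_i exp(S_i) and
    bound = [sum_i exp(S1_i)] * exp(<S - S1>), the average being taken
    with weight exp(S1_i) as in Eq. (11)."""
    Z = sum(math.exp(s) for s in S)
    weights = [math.exp(s1) for s1 in S1]
    Z1 = sum(weights)
    mean_diff = sum((s - s1) * wgt for s, s1, wgt in zip(S, S1, weights)) / Z1
    return Z, Z1 * math.exp(mean_diff)

# Hypothetical "true" and trial actions on four configurations.
S = [-0.5, -1.2, 0.3, -2.0]
S1 = [-0.4, -1.0, 0.0, -2.5]
Z, bound = variational_bound(S, S1)

# A trial action differing from S by a constant saturates the bound.
Z_same, bound_same = variational_bound(S, [s - 0.7 for s in S])
```

Because `bound` never exceeds `Z`, writing both as decaying exponentials in $T$ gives $E\ge E_0$, which is the content of (13).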


POSSIBLE TRIAL ACTIONS 


Some of the methods which have been applied to
this problem, so far, correspond to various choices
for $S_1$. The perturbation method corresponds to
$S_1=-\tfrac12\int(dX/dt)^2\,dt$ and gives

$$E=-\alpha. \tag{17}$$

We see immediately that the perturbation result is an
upper limit to $E_0$, a result proven only with much
greater effort by more usual methods, by Gurari³ and
Lee and Pines.⁴ Another suggestion is

$$S_1=-\frac12\int\Big(\frac{dX}{dt}\Big)^2dt+\int V(X_t)\,dt, \tag{18}$$

where $V$ is a potential to be chosen. If a Coulomb
potential is chosen, $V(R)=Z/R$, and the parameter $Z$
varied, one finds

$$E=-(25/256)\alpha^2=-0.098\alpha^2$$

asymptotically for the case that $\alpha$ is very large. For
large $\alpha$ this corresponds to Landau's method¹ with a
trial function of the form $e^{-\kappa r}$. If a harmonic potential
$V(R)=kR^2$ is used (corresponding to a Gaussian trial
function in Landau's method) the value is somewhat
improved:

$$E=-(1/3\pi)\alpha^2=-0.106\alpha^2. \tag{19}$$

If $\alpha$ is not so large, the form (18) can still be used in
(16). The evaluation of $s$ requires knowledge of the
eigenfunctions and eigenvalues for the potential $V$.
The result is somewhat difficult to evaluate for the
Coulomb potential, but fairly simple for the harmonic
case [see (34) below]. However, it is readily shown
that for any $\alpha$ less than about 6 no choice of $V$ can
improve the result (17) for $V=0$. Fröhlich has asked
for a method which works uniformly over the entire
range of $\alpha$. He points out that the artificial binding to
a special origin, which (18) implies, is a disadvantage.
It is this which presumably makes any potential $V$
give a poorer result than $V=0$ for small $\alpha$.

3 M. Gurari, Phil. Mag. 44, 329 (1953).
4 T. Lee and D. Pines, Phys. Rev. 88, 960 (1952). Lee, Low,
and Pines, Phys. Rev. 90, 297 (1953).

To remedy this, I thought a good idea would be to
use for $S_1$ the action for a particle bound by a potential
$V(X-Y)$ to another particle of coordinate $Y$. This
latter could have finite mass, so no permanent origin
would be assumed. Of course the action for such a
system would contain both $X(t)$ and $Y(t)$. But the
variables $Y(t)$ could be integrated out, at least in
principle, leaving an effective $S_1$ depending only on $X$.
At first I tried a Coulomb interaction for $V(X-Y)$
but it was rather complicated. The technique may be
useful in more difficult problems. But here we have
already seen that a harmonic binding should be as
good, if not better. Further, an extra particle bound
harmonically has its variables $Y(t)$ appearing quadratically
in the action. It may therefore be easily
eliminated explicitly. The result we know from studies
of similar problems in electrodynamics. We are, in this
way, led to consider the choice

$$S_1=-\frac12\int\Big(\frac{dX}{dt}\Big)^2dt-\frac12C\iint[X_t-X_s]^2\exp(-w|t-s|)\,dt\,ds, \tag{20}$$

where $C$ and $w$ are parameters, to be chosen later to
minimize $E$.


EVALUATION OF THE ENERGY 


Since $S_1$ contains $X$ only quadratically, all the
necessary path integrals are easily done. Because the
method may not be familiar we outline it briefly here.
Define the symbol $\langle\ \rangle$ as

$$\langle F\rangle=\int F\exp(S_1)\,\mathcal{D}X(t)\Big/\int\exp(S_1)\,\mathcal{D}X(t).$$

Then comparison of $S_1$ and $S$ shows that

$$s=\frac1T\langle S-S_1\rangle=2^{-3/2}\alpha\int\langle|X_t-X_s|^{-1}\rangle e^{-|t-s|}\,ds
+\tfrac12C\int\langle(X_t-X_s)^2\rangle e^{-w|t-s|}\,ds=A+B. \tag{21}$$

5 R. P. Feynman, Phys. Rev. 84, 108 (1951), Appendix C.

We concentrate first on the first term $A$ of (21). In
it we may express $|X_t-X_s|^{-1}$ by a Fourier transform,

$$|X_t-X_s|^{-1}=\int d^3K\,\exp[iK\cdot(X_t-X_s)]\,(2\pi^2K^2)^{-1}. \tag{22}$$

For this reason we need to study

$$\langle\exp[iK\cdot(X_\tau-X_\sigma)]\rangle
=\int\exp(S_1)\exp[iK\cdot(X_\tau-X_\sigma)]\,\mathcal{D}X(t)\Big/\int\exp(S_1)\,\mathcal{D}X(t). \tag{23}$$

The integral in the numerator is of the form

$$I=\int\exp\Big[-\frac12\int\Big(\frac{dX}{dt}\Big)^2dt-\frac12C\iint[X_t-X_s]^2e^{-w|t-s|}\,dt\,ds+\int f(t)\cdot X(t)\,dt\Big]\mathcal{D}X(t), \tag{24}$$

where specifically

$$f(t)=iK\,\delta(t-\tau)-iK\,\delta(t-\sigma). \tag{25}$$

Now we shall find (24) insofar as it depends on $f$ or $K$
aside from a normalization factor which drops out in
(23). Incidentally let us notice that the three rectangular
components separate in (24) and we need
only consider a scalar case. The method of integration
is to substitute $X(t)=X'(t)+Y(t)$, where $X'(t)$ is that
special function for which the exponent is maximum.
The variable of integration is now $Y(t)$. Since the
exponent is quadratic in $X(t)$ and $X'$ renders it an
extremum, it can contain $Y(t)$ only quadratically.
Evidently $Y$ then separates off as a factor not containing
$f$, which may be integrated to give an unimportant
constant (depends on $T$ only). Therefore within such
a constant

$$I=\exp\Big[-\frac12\int\dot X'^2\,dt-\frac12C\iint(X'_t-X'_s)^2e^{-w|t-s|}\,dt\,ds+\int f(t)X'(t)\,dt\Big], \tag{26}$$

where $X'$ is that function which minimizes the expression
[subject, for convenience, to $X'(0)=X'(T)=0$ if
the time interval is 0 to $T$]. The variation problem
gives the integral equation

$$d^2X'(t)/dt^2=2C\int[X'_t-X'_s]\,e^{-w|t-s|}\,ds-f(t). \tag{27}$$

Using (27), (26) can be simplified to


$$I=\exp\Big[\frac12\int f(t)X'(t)\,dt\Big]. \tag{28}$$

We need merely solve (27) and substitute into (28).
To do this we define

$$Z(t)=\frac w2\int e^{-w|t-s|}X'_s\,ds,$$

so that

$$d^2Z(t)/dt^2=w^2[Z(t)-X'(t)],$$

while (27) is

$$d^2X'(t)/dt^2=\frac{4C}{w}[X'(t)-Z(t)]-f(t).$$

The equations are readily separated and solved. The
solution for $X'(t)$ substituted into (28) gives, for the
case (25),

$$I=\langle\exp[iK\cdot(X_\tau-X_\sigma)]\rangle
=\exp\Big[-\frac{2CK^2}{v^3w}\big(1-e^{-v|\tau-\sigma|}\big)-\frac{w^2}{2v^2}K^2|\tau-\sigma|\Big], \tag{29}$$

where we have made the substitution

$$v^2=w^2+(4C/w). \tag{30}$$

The result is correctly normalized since it is valid for
$K=0$. The integral on $K$ in (22) is a simple Gaussian,
so that substitution into $A$ gives

$$A=\pi^{-1/2}\alpha v\int_0^\infty\Big[w^2\tau+\frac{v^2-w^2}{v}\big(1-e^{-v\tau}\big)\Big]^{-1/2}e^{-\tau}\,d\tau. \tag{31}$$

To find $B$ we need $\langle(X_t-X_s)^2\rangle$. This can be obtained
by expanding both sides of (29) with respect to $K$ up
to order $K^2$. Therefore

$$\tfrac13\langle(X_\tau-X_\sigma)^2\rangle=\frac{4C}{v^3w}\big(1-e^{-v|\tau-\sigma|}\big)+\frac{w^2}{v^2}|\tau-\sigma|.$$

The integral in $B$ is now easily performed and the
expression simplifies to

$$B=3C/vw. \tag{32}$$
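For readers who want the intermediate step behind (32), the integral can be written out as follows. This is a reconstruction from (21), (29) and (30), assuming the interval is long enough that the $s$ integration may be extended to infinity in both directions, and using $F(\tau)=w^2\tau+\frac{v^2-w^2}{v}(1-e^{-v\tau})$ as a shorthand for the bracket in (31):

$$B=\tfrac12C\int_{-\infty}^{\infty}\langle(X_t-X_s)^2\rangle e^{-w|t-s|}\,ds
=\frac{3C}{v^2}\int_0^\infty F(\tau)\,e^{-w\tau}\,d\tau
=\frac{3C}{v^2}\Big[1+\frac{v^2-w^2}{v}\Big(\frac1w-\frac1{v+w}\Big)\Big]
=\frac{3C}{v^2}\cdot\frac vw=\frac{3C}{vw}.$$

The bracket simplifies because $\frac{v^2-w^2}{v}\cdot\frac{v}{w(v+w)}=\frac{v-w}{w}$.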


Finally we need $E_1$, the energy belonging to our action
$S_1$. This is most easily obtained by differentiating both
sides of (15) with respect to $C$. One finds immediately

$$C\,dE_1/dC=B,$$

so that, in view of (32) and (30), integration gives

$$E_1=\tfrac32(v-w),$$

since $E_1=0$ for $C=0$. Since $E_1-B=(3/4v)(v-w)^2$ we
obtain finally for our energy expression:

$$E=\frac{3}{4v}(v-w)^2-A, \tag{33}$$

with $A$ given in (31). The quantities $v$, $w$ can be considered
as two parameters which may be varied separately
to obtain a minimum.
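Since the integral in (31) has to be done numerically, a minimal sketch of how (33) might be evaluated and minimized is given below. It is not the procedure used in the paper: the substitution $\tau=x^2$ (to remove the integrable $\tau^{-1/2}$ behaviour at the origin), the Simpson rule, the cutoff `x_max`, the coarse grid search and all function names are assumptions made for this illustration.

```python
import math

def trial_F(tau, v, w):
    """Bracket of Eq. (31): w^2 tau + (v^2 - w^2)/v * (1 - exp(-v tau))."""
    return w * w * tau + (v * v - w * w) / v * (1.0 - math.exp(-v * tau))

def simpson(g, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0

def polaron_A(alpha, v, w, n=240, x_max=6.0):
    """Term A of Eq. (31), integrated in the variable x = sqrt(tau)."""
    def g(x):
        if x == 0.0:
            return 2.0 / v                      # limit of the integrand as x -> 0
        tau = x * x
        return 2.0 * x * math.exp(-tau) / math.sqrt(trial_F(tau, v, w))
    return alpha * v / math.sqrt(math.pi) * simpson(g, 0.0, x_max, n)

def polaron_energy(alpha, v, w):
    """Variational energy of Eq. (33)."""
    return 3.0 * (v - w) ** 2 / (4.0 * v) - polaron_A(alpha, v, w)

def minimize_energy(alpha, grid):
    """Coarse grid search over v >= w; returns (energy, v, w)."""
    return min((polaron_energy(alpha, v, w), v, w)
               for v in grid for w in grid if v >= w)

grid = [0.5 + 0.25 * i for i in range(27)]      # 0.5, 0.75, ..., 7.0
best_energy, best_v, best_w = minimize_energy(5.0, grid)
```

Setting $v=w$ makes the trial action free, and `polaron_energy(alpha, w, w)` then returns $-\alpha$, reproducing (17). Any point on the grid gives an upper bound, so a finer grid or a proper minimizer can only lower `best_energy`.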
The integral in $A$ unfortunately cannot be performed
in closed form, so that a complete determination of $E$
requires numerical integration. It is, however, possible
to obtain approximate expressions in various limiting
cases. The case of large $\alpha$ corresponds to large $v$. The
choice $w=0$ leads to an integral

$$A=\pi^{-1/2}\alpha v^{1/2}\int_0^\infty e^{-\tau}\,d\tau\,[1-e^{-v\tau}]^{-1/2}
=\frac{\alpha\,\Gamma(1/v)}{v^{1/2}\,\Gamma(\tfrac12+1/v)}, \tag{34}$$

and $E_1=3v/4$. It corresponds to the use of a fixed
harmonic binding potential in (18). For large $v$, $e^{-v\tau}$
can be neglected, so that $A=\pi^{-1/2}\alpha v^{1/2}$. This corresponds
to using a Gaussian trial function in Landau's method.
For $\alpha$ less than 5.8 and $w=0$, (33) does not give a minimum
unless $v=0$, so that the $w=0$ case does not give a
single expression for all ranges of $\alpha$. In spite of this
disadvantage the result with (34) is relatively simple
and fairly accurate. For $\alpha>6$, only fairly large $v$ are
important, and the asymptotic formula (good to
1 percent for $v>4$),

$$A=\alpha(v/\pi)^{1/2}[1+(2\ln2)/v],$$

is convenient. Fröhlich, however, considers the discontinuity
at $\alpha=6$ as a serious disadvantage, which it is the
purpose of this paper to avoid. This we do by choosing
$w$ different from zero.

Let us study (33), just for small $\alpha$, in case $w$ is not
zero. The minimum will occur for $v$ near $w$. Therefore
write $v=(1+\epsilon)w$, consider $\epsilon$ small, and expand the
root in (31). This gives

$$A=\alpha(v/w)\Big[1-\epsilon\int_0^\infty\tau^{-3/2}e^{-\tau}\big(1-e^{-w\tau}\big)\,d\tau\Big/w\pi^{1/2}+\cdots\Big].$$


The integral is

$$2w^{-1}[(1+w)^{1/2}-1]=P. \tag{35}$$

The problem (33) then corresponds, in this order, to
minimizing

$$E=\tfrac34w\epsilon^2-\alpha-\alpha\epsilon(1-P).$$

That is,

$$\epsilon=2\alpha(1-P)/3w,$$

which is valid for small $\alpha$ only, as $\epsilon$ was assumed small.
The resulting energy is

$$E=-\alpha-\alpha^2(1-P)^2/3w.$$

Our method therefore gives a correction even for small
$\alpha$. It is least for $w=3$, in which case it gives

$$E=-\alpha-\alpha^2/81=-\alpha-1.23(\alpha/10)^2. \tag{36}$$


It is not sensitive to the choice of $w$. For example,
for $w=1$ the 1.23 falls only to 0.98. The method of
Lee and Pines⁶ gives exactly this result (36) to this
order. The perturbation expansion has been carried to
second order by Haga⁷ who shows that the exact
coefficient of the $(\alpha/10)^2$ term should be 1.26, so that
our variational method is remarkably accurate for
small $\alpha$.

6 T. Lee and D. Pines, Phys. Rev. 92, 883 (1953).
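The quoted small-coupling coefficients follow directly from (35) and the energy $E=-\alpha-\alpha^2(1-P)^2/3w$. The sketch below simply evaluates that expression; the function names are hypothetical and the factor of 100 converts the coefficient of $\alpha^2$ into the coefficient of $(\alpha/10)^2$ used in the text.

```python
import math

def P(w):
    """Eq. (35): P = (2/w) * (sqrt(1 + w) - 1)."""
    return 2.0 * (math.sqrt(1.0 + w) - 1.0) / w

def second_order_coefficient(w):
    """Coefficient c in E = -alpha - c * (alpha/10)^2 for small alpha."""
    return 100.0 * (1.0 - P(w)) ** 2 / (3.0 * w)

coefficient_w3 = second_order_coefficient(3.0)   # should equal 100/81
coefficient_w1 = second_order_coefficient(1.0)
```

Evaluating the function on either side of $w=3$ gives slightly smaller coefficients, consistent with the statement that the energy is lowest there and only weakly dependent on $w$.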

The opposite extreme of large $\alpha$ corresponds to large
$v$, and, as we shall see, $w$ near 1. Since $v\gg w$ the integral
(31) reduces in the first approximation to (34), which
we can use in its asymptotic form. The next approximation
in $w$ can be obtained by expanding the radical
in (31), considering $w/v\ll1$. Furthermore, $e^{-v\tau}$ is
negligible. In this way we get

$$E=\frac{3}{4v}(v-w)^2-\alpha(v/\pi)^{1/2}\Big(1+\frac{2\ln2}{v}-\frac{w^2}{2v}\Big). \tag{37}$$

This is minimum, within our approximation of large $v$,
when $w=1$, and $v=(4\alpha^2/9\pi)-(4\ln2-1)$:

$$E=-\alpha^2/3\pi-3\ln2-\tfrac34=-0.106\alpha^2-2.83. \tag{38}$$

The approximations do not keep $E$ as an upper limit as,
unfortunately, the further terms, of order $1/\alpha^2$, are
probably positive.

For further numerical work it is probably sufficiently
accurate to take $w=1$ for all $\alpha$, rather than do the
extra work needed to minimize this extra variable.
This value of $w$ means that the trial $S_1$ has the same
time exponential in the interaction term as does $S$.
For small $\alpha$, that is, $v$ near 1, the integral can be expanded
in a power series in $(v-1)$. The resulting
energy is ($w=1$):

$$E=-\alpha-0.98(\alpha/10)^2-0.60(\alpha/10)^3-0.14(\alpha/10)^4\cdots. \tag{39}$$

The two expressions (38), (39) fit fairly well near $\alpha=5$.
For practical purposes it may suffice to use (39) below
$\alpha=5$ and (38) above. If more accuracy than 3 percent
is needed near $\alpha=5$ numerical integration of $A$ must
be performed. The value of $v$ which gives (39) is

$$v=1+1.14(\alpha/10)+1.35(\alpha/10)^2+1.88(\alpha/10)^3.$$

This may help to choose an appropriate $v$. For $w=3$
the results are

$$E=-\alpha-1.23(\alpha/10)^2-0.64(\alpha/10)^3\cdots,$$
$$v=3+2.22(\alpha/10)+1.97(\alpha/10)^2\cdots.$$
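How well the two limiting forms join near $\alpha=5$ can be seen by evaluating (38) and (39) side by side. The snippet below is only a convenience for that comparison; the function names are hypothetical, and the series (39) is truncated at the last term quoted in the text.

```python
import math

def energy_small_alpha(alpha):
    """Truncated series of Eq. (39), valid for w = 1 and small alpha."""
    x = alpha / 10.0
    return -alpha - 0.98 * x ** 2 - 0.60 * x ** 3 - 0.14 * x ** 4

def energy_large_alpha(alpha):
    """Asymptotic form of Eq. (38): -alpha^2/(3 pi) - 3 ln 2 - 3/4."""
    return -alpha ** 2 / (3.0 * math.pi) - 3.0 * math.log(2.0) - 0.75

small_at_5 = energy_small_alpha(5.0)
large_at_5 = energy_large_alpha(5.0)
mismatch_at_5 = abs(small_at_5 - large_at_5) / abs(large_at_5)
```

The relative mismatch at the crossover is just under 3 percent, in line with the accuracy quoted above; closing that gap is what the numerical integration of $A$ is for.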


EFFECTIVE MASS 


Another quantity of interest is the effective mass.
If the particle moves with a mean group velocity $V$,
its energy should be greater. For small $V$ the energy
goes as $V^2$, and writing it as $mV^2/2$ we call $m$ the
effective mass. Since there is an operator analogous to
the momentum which commutes with the Hamiltonian,
it would be expected that there is a variational principle
which minimizes the energy for each momentum. That
is, we ought to be able to extend our method to yield
an upper limit to the energy for each value of $V$, or
better, of momentum $Q$. We have not found the
expected extension.

7 E. Haga, Progr. Theoret. Phys. (Japan) 11, 449 (1954).

If we limit ourselves just to finding the effective
mass for low velocities, however, we may proceed in
this manner: For a free particle of mass $m$ whose initial
coordinate is 0 and final coordinate is $X_T$ the sum on
trajectories is

$$\exp(-mX_T^2/2T). \tag{40}$$

Hence we can study the effective mass for our system
by studying the asymptotic form of (6) in the case
$X_T\ne0$. The asymptotic form should vary for small
$X_T$ as $\exp(-E_0T-mX_T^2/2T)$, its dependence on $X_T$
determining $m$. This only requires that (27) be solved
for the boundary conditions $X'=0$ at $t=0$ and $X'=X_T$
at $t=T$. There are some confusing complications at the
end points so it is easier to proceed as follows. We
will put $X_T=UT$ so that the propagation (40) is
$\exp(-\tfrac12mU^2T)$. [Note that $U$ is not a physical velocity
because $t$ is an artificial parameter in Eq. (5), and is
not the time.] That is, we seek the total energy and
equate it to $E_0+\tfrac12mU^2$. But if we substitute $X'=X''+Ut$
into (27), we see that it is a solution if $X''$ is.
This $X''$ goes from 0 at $t=0$ to 0 at $t=T$, and is therefore
our previous solution. Such a substitution into
(26) means that the term involving $f$ adds a term
$\exp(\int tU\cdot f\,dt)$ so that this is the factor by which $I$ is
multiplied, aside from normalization. For the $f$ given
in (25) this is $\exp[iK\cdot U(\tau-\sigma)]$ so that we now have

$$\langle\exp[iK\cdot(X_\tau-X_\sigma)]\rangle
=\exp\Big[-\frac{K^2}{2v^2}F(|\tau-\sigma|)+iK\cdot U(\tau-\sigma)\Big], \tag{41}$$

where

$$F(\tau)=w^2\tau+\frac{v^2-w^2}{v}\big(1-e^{-v\tau}\big). \tag{42}$$

Substitution into (22) and (21) gives for $A$ the value

$$A(U)=2^{-1/2}\alpha\int_0^\infty\!\!\int e^{-\tau}\exp\Big[-\frac{K^2}{2v^2}F(\tau)+iK\cdot U\tau\Big]\frac{d^3K\,d\tau}{2\pi^2K^2}. \tag{43}$$

Second differentiation of (41) with respect to $K$ shows
that

$$\langle(X_t-X_s)^2\rangle=3F(|t-s|)\,v^{-2}+U^2(t-s)^2,$$

so that one obtains for $B$ the value

$$B=3C/vw+2CU^2w^{-3}.$$

We again find $E_1$ from $dE_1/dC=B/C$ and $E_1=\tfrac12U^2$ for
$C=0$. Thus

$$E_1=\tfrac32(v-w)+\tfrac12U^2(1+4Cw^{-3}),$$

and our final expression is

$$E=\tfrac12U^2+(3/4v)(v-w)^2-A(U). \tag{44}$$

We next expand $A(U)$ to order $U^2$ and write the
kinetic energy as $mU^2/2$ to find, finally,

$$m=1+\tfrac13\pi^{-1/2}\alpha v^3\int_0^\infty[F(\tau)]^{-3/2}e^{-\tau}\tau^2\,d\tau. \tag{45}$$

The values of the parameters to use in (45) are those
which were previously found to minimize $E$ when $U=0$.
For small $\alpha$ this gives

$$m=1+\tfrac16\alpha+0.025\alpha^2+\cdots \tag{46}$$

for $w=3$, while for $w=1$ the 0.025 becomes 0.023. For
large $\alpha$ it becomes

$$m=16\alpha^4/81\pi^2=202(\alpha/10)^4. \tag{47}$$

Our energy values, coming from a minimum principle,
are much more accurate than the mass values, whose
precision, especially for large $\alpha$, is hard to judge. Since
(46) and (47) do not match well, intermediate values
of $\alpha$ require numerical integration of (45).
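A minimal sketch of the numerical integration of (45) is given below, under the same assumptions as any quadrature of this kind: the substitution $\tau=x^2$, the Simpson rule, the cutoff `x_max` and the function names are choices made for this illustration, not part of the paper. The parameters $v$ and $w$ must be supplied from a separate minimization of the energy at $U=0$.

```python
import math

def trial_F(tau, v, w):
    """Eq. (42): F(tau) = w^2 tau + (v^2 - w^2)/v * (1 - exp(-v tau))."""
    return w * w * tau + (v * v - w * w) / v * (1.0 - math.exp(-v * tau))

def simpson(g, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0

def effective_mass(alpha, v, w, n=240, x_max=6.0):
    """Eq. (45), integrated in the variable x = sqrt(tau)."""
    def g(x):
        if x == 0.0:
            return 0.0                          # integrand vanishes like x^2
        tau = x * x
        return 2.0 * x * tau * tau * math.exp(-tau) / trial_F(tau, v, w) ** 1.5
    integral = simpson(g, 0.0, x_max, n)
    return 1.0 + alpha * v ** 3 * integral / (3.0 * math.sqrt(math.pi))

# Free trial action (v = w): only the leading term 1 + alpha/6 of (46) survives.
mass_free_trial = effective_mass(1.0, 3.0, 3.0)

# w = 3 with v taken from the small-alpha expansion v = 3 + 2.22 (alpha/10).
mass_w3 = effective_mass(1.0, 3.0 + 0.222, 3.0)
```

With $v=w$ the result is $1+\alpha/6$ exactly, and moving $v$ to its small-coupling optimum adds a term of order $0.025\alpha^2$, in agreement with (46).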

Lee and Pines⁶ have worked with a different type of
variational principle. It seems to be nearly as good as
ours for $\alpha$ less than about 5, but is poor for larger $\alpha$
(for example, at $\alpha=15$, Lee and Pines find $E_0<-17.6$,
while we find $E_0<-26.8$). This appears to contradict
their statement that their method is exact for large $\alpha$.
They are referring to a different problem, however,
in which the upper momenta are cut off. This means
that in $S$ in (7) the function $|X_t-X_s|^{-1}$ is replaced by
some other function $V(|X_t-X_s|)$ which differs for
small $|X_t-X_s|$. It is evident, for large $\alpha$, that the best
trajectory will be the one that wanders only slightly and
the energy will be $2^{-1/2}\alpha V(0)$ in the limit. Their method
gives this result in the limit, as ours would also. For
the case where $V$ is singular, so $V(0)$ does not exist,
their method is not exact, and it is inaccurate
for intermediate values of $\alpha$ even if $V(0)$ exists, if $V$
has steep walls.

The method is readily extended to cases in which the
phonon frequencies are not constant, and the coupling
is not just proportional to $K^{-1}$. The same trial action
$S_1$ can be used, but the integral for $A$ becomes more
complicated. For the Hamiltonian

$$H = \tfrac12 P^2 + \sum_K \omega_K a_K^\dagger a_K + V^{-1/2}\sum_K [C_K^* a_K^\dagger \exp(-iK\cdot X) + C_K a_K \exp(+iK\cdot X)],$$

Eq. (33) still holds; the only change is that the integral
for $A$ becomes

$$A = \int_0^\infty\!\!\int |C_K|^2 \exp\left[-\omega_K\tau - \frac{K^2}{2v^2}F(\tau)\right] d\tau\,\frac{d^3K}{(2\pi)^3},$$

where $F(\tau)$ is given in (42).

An attempt has been made to apply this method to 
meson problems. The case of scalar nucleons interacting
by scalar mesons seems tractable, but the greater 
complexity of the more realistic problems shows the 
need for further development. 

We are limited in our choice of $S_1$ to quadratic
functionals, for those are the only ones we can evaluate 
directly as path integrals. It would be desirable to find 
out how this method may be expressed in conventional 
notation, for a wider class of trial functionals might 
thereby become available. 

I am indebted to H. Fröhlich for bringing the problem
to my attention, and for his comments on it, and to 
G. Speisman for emphasizing the importance of the 
general inequality (10). 

Note added in proof. Professor Fröhlich and Professor
Pines have kindly informed me that S. I. Pekar [Zhur.
Eksptl. i Teort. Fiz. 19, 796 (1949)] has calculated the
limiting values of energy and mass for large $\alpha$, by an
adiabatic approximation. The energy is $-0.1088\alpha^2$ and
the mass is $232(\alpha/10)^4$. Therefore our variational
method gives an error of only 3 percent in the energy
and 15 percent in the mass for large $\alpha$, and presumably
smaller errors for smaller $\alpha$.
We have obtained an approximate expression for the impedance
function at all frequencies, temperatures, and coupling strengths
of an electron coupled to a polar lattice (a system commonly called
a polaron). The starting point for the calculation is the quantum
mechanical expression for the expected current. The phonon
coordinates are eliminated from this expression by well-known
field-theory techniques. The resulting exact “influence functional” is
then approximated by a corresponding quadratic “influence
functional” which, it is hoped, imitates the real polaron. Correction
terms are computed to account for the difference between the
approximate impedance and the exact polaron impedance in a
manner closely analogous to Feynman’s treatment of the polaron
self-energy. In fact, the analytic evaluation of the expression for
the impedance obtained here is carried out using the approximate
“influence functional” that was successfully employed in minimizing
the binding (and free) energy of the polaron in earlier calculations.
However, the accuracy obtained using this approximation,
for the impedance calculation, is less satisfactory and its
limitations are discussed. Nevertheless, beginning at intermediate
coupling strengths, the approximate impedance produces a level
structure of increasing complexity and narrowing resonances as the
coupling strengthens. This suggests that further refinements may
be fruitful. Methods for finding a better quadratic influence
functional for use in our impedance expression as well as ways of
improving the expression further are suggested. A comparison of
our results with those of the Boltzmann equation points up
interesting differences which arise from reversing the order of
taking limits of zero frequency and coupling.


I. INTRODUCTION


An electron in a polar crystal interacts with the
surrounding crystal. The effect of this interaction
is to surround the electron with a distorted lattice: a
cloud of phonons. The nature of this system, “the
polaron,” has been extensively studied.1-4 It is interesting
as a phenomenon in solids, but it has an extended
interest since it is one of the simplest examples of the
interaction of a particle and a field. It is in many ways
analogous to the problem of a nucleon interacting with
a meson field. (The extra complications of spin and isotopic
spin do not, however, permit direct use of the
methods to be described here, without some extension
of their power.) In cases of practical interest, the
coupling between the electron and the longitudinal
optical modes of vibration of the crystal is sufficiently
strong that simple perturbation methods do not apply.
It is the strong-coupling aspect of the problem which
has aroused so much interest. For this reason, the
“polaron problem” has generally been studied in a
considerably idealized form.

It is assumed that in the undistorted lattice the electron
would move as a free particle (with possibly an
altered mass), that only the optical modes interact with
the electron, that they do so in a very simple way, and
that they all have the same frequency. These are quite
drastic simplifications; however, sufficient data are not
available to improve these assumptions so as to represent
any actual crystal. The methods given here do not
require these simplifications (except perhaps that the
electron’s kinetic energy is a quadratic function of its
momentum); the same techniques can be readily applied
to include variations of frequency and coupling of the
optical modes with wave number, influences of other
modes, etc., although some of the integrals done analytically
here might have to be done numerically.

1 R. P. Feynman, Phys. Rev. 97, 660 (1955), hereafter to be
called I.

2 H. Fröhlich, Advances in Physics, edited by N. F. Mott
(Taylor and Francis, Ltd., London, 1954), Vol. 3, p. 325.

3 S. I. Pekar, Zhur. Eksp. i Teoret. Fiz. 19, 796 (1949).

4 T. D. Schultz, Phys. Rev. 116, 526 (1960).

In discussing losses and mobility such idealization 
may alter completely the true behavior, because some 
essential loss mechanism such as lattice defects or inter- 
action with acoustic phonons has been idealized away. 
It is important to appreciate, therefore, that in all the 
remaining analysis and discussion we shall be talking 
only about a strictly idealized problem. 

One of us1 has shown that the ground-state energy
and effective mass of the polaron could be calculated
with considerable accuracy from a variational principle
obeyed by path integrals. Of more interest, experimentally,
is the mobility of the polaron and, more
generally, its response to weak, spatially uniform,
time-varying electric fields. This is a more complicated
problem involving the rate at which a drifting electron
loses momentum by phonon interactions, through emissions
of phonons or collisions with phonons already
present. In the practical situation at temperatures not
too near the melting point of the crystal the density of
optical phonons is quite low as a result of the high energy
required to excite them. In our idealized model losses
can occur only through collisions with optical phonons,
so that these collisions could be analyzed by first finding
the collision cross section and then using the Boltzmann
equation (or equivalently the usual formulas for transport
cross section) to get the mobility.5 This is the
technique generally employed in transport problems.
Yet there exists a class of transport problems in which
this cannot be done. If many phonons are colliding
simultaneously with an electron most of the time, and
if there are possibly quantum interferences among these
collisions (such that the cross section for scattering from
one phonon depends on the presence and behavior of
others), the collisions cannot be separated in time as
required for the validity of the Boltzmann transport
equation. What we need to calculate (the average position
of an electron at time $t$, if at $t=0$ a pulsed electric
field was applied) can be easily written formally, but
little has been done with such a form unless the coupling
is weak or the collisions are well separated.6,7

A secondary interest which we had in this problem 
was to see if we could compute transport problems in 
cases when not only the perturbation theory, but also 
the Boltzmann equation is inadequate. Therefore, in 
spite of its lack of reality, we have analyzed the problem 
of the impedance of a polaron of arbitrary coupling 
strength in an oscillating electric field, for arbitrary 
temperatures (temperatures so high perhaps that the 
Boltzmann factor $e^{-\hbar\omega/kT}$ for the energy $\hbar\omega$ of the
optical modes is not necessarily small). 

In any specific range of conditions, such as low temperature,
high temperature, or high frequency of external
electric field, etc., special approximations might
be made to obtain a better answer than is given by 
our general formula. However, it was of interest to see 
how well one could do in a general way for arbitrary 
values of the parameters. 


II. FORMULATION OF THE MOBILITY PROBLEM 
IN TERMS OF THE ELECTRON 
COORDINATES ALONE 


If a weak alternating electric field $E = E_0e^{i\nu t}$ is applied
to the crystal in the x direction, the current induced
(by motion of the electron) may be written as

$$j(\nu) = [z(\nu)]^{-1}E_0e^{i\nu t}. \tag{1}$$

This defines the impedance function $z(\nu)$ which we wish
to calculate. We will assume that the crystal is isotropic
so that $j = \langle\dot x\rangle$, where $\langle x\rangle$ is the expectation of the electron
displacement in the x direction (taking the electric
charge as unity). The displacement $\langle x\rangle$ is $E/i\nu z(\nu)$.

5 J. Howarth and E. H. Sondheimer, Proc. Roy. Soc. (London) 
A219, 53 (1953). 


6 R. Kubo, J. Phys. Soc. Japan 12, 570, 1203 (1957).
7 M. Lax, Phys. Rev. 109, 1921 (1958).
Transformed to time variables, this implies that

$$\langle x(\tau)\rangle = -\int_{-\infty}^{\tau} iG(\tau-\sigma)E(\sigma)\,d\sigma, \tag{2}$$

where $G(\tau)$, $i$ times the electron displacement at time $\tau$
induced by a pulsed electric field at time zero, has the
inverse transform

$$\int_{-\infty}^{\infty} G(\tau)e^{-i\nu\tau}\,d\tau = G(\nu) = [\nu z(\nu)]^{-1}. \tag{3}$$

We take $G(\tau)=0$ for $\tau<0$.

The effect of a perturbing field $E(t)$ in the x direction
is to add to the complete Hamiltonian of the system $H$,
the term $-xE(t) = -E\cdot X$ (where x is the component
of the vector position of the electron $X$ in the direction
of the field). If at some time (a), long before the field is
turned on [i.e., $E(t)=0$ for $t<a$] the state of the system
is represented by the density matrix $\rho_a$, then the density
matrix at time $\tau$ is $U(\tau,a)\rho_aU'^{-1}(\tau,a)$. Thus, the expected
position at time $\tau$ is

$$\langle x(\tau)\rangle = \mathrm{Tr}[xU(\tau,a)\rho_aU'^{-1}(\tau,a)], \tag{4}$$

where

$$U(\tau,a) = \exp\left\{-i\int_a^\tau [H_s - X_s\cdot E(s)]\,ds\right\} \tag{5}$$

is the unitary operator for the development of a state
in time with the complete Hamiltonian $H - X\cdot E$.

We use a time-ordered operator notation; all unprimed
operators are placed to the left, latest times
farthest to the left, then the matrix $\rho$ at the right and
finally all primed operators on the right of $\rho$, with latest
times farthest to the right.9 Thus, primed operators
are ordered oppositely to unprimed. We can, therefore,
write

$$U'^{-1}(\tau,a) = \exp\left\{+i\int_a^\tau [H_s' - X_s'\cdot E'(s)]\,ds\right\}. \tag{6}$$

The quantity $E$ is not an operator but simply a function
of $s$ so that in (4), $E'(s) = E(s)$. However, as we shall
see in a moment, it is convenient to handle a more
general case where $E$ and $E'$ are different arbitrary
functions of $s$.

For weak fields we expand (4) to first order in $E$ and
find an expression for $x(\tau)$ of the form (2). Evidently,
$-iG(\tau-\sigma)$ is the response to a $\delta$ function $E$, so we may
set $E(s) = \epsilon\delta(s-\sigma) = E'(s)$, substitute into (4), and expand
the exponential to first order in $\epsilon$. However, we
note that (4) itself may be considered to be $-i$ times the
first functional derivative with respect to $E(\tau) - E'(\tau)$ of

$$g = \mathrm{Tr}[U(b,a)\rho_aU'^{-1}(b,a)], \tag{7}$$


8 In our idealized model X and E will be in the same direction, 
although in general (in the presence of magnetic field or aniso- 
tropic crystalline fields), G and z will be tensors. 

9R. P. Feynman, Phys. Rev. 84, 108 (1951). 
as $b \to +\infty$ and $a \to -\infty$. That is to say, we calculate
$g$ from (5), (6), and (7) with

$$E(s) = \epsilon\delta(s-\sigma) + \eta\delta(s-\tau) \tag{8a}$$

and

$$E'(s) = \epsilon\delta(s-\sigma) - \eta\delta(s-\tau). \tag{8b}$$

The quantity we require is

$$G(\tau-\sigma) = \tfrac12(\partial^2g/\partial\eta\,\partial\epsilon)_{\eta=\epsilon=0}. \tag{9}$$

If the initial state is one of a definite temperature
$T$, then

$$\rho_a = \exp(-\beta H)/Q, \tag{10}$$

where $\beta = 1/kT$ and $Q$ is a normalizing constant, which
we eliminate by calculating $(1/2g)(\partial^2g/\partial\eta\,\partial\epsilon)$ evaluated
at $\epsilon=\eta=0$.

The Hamiltonian representing an electron in inter- 
action with the vibrational modes of a crystal is 


$$H = P^2/2m + \sum_K \omega_K a_K^\dagger a_K + V^{-1/2}\sum_K [C_K^* a_K^\dagger \exp(-iK\cdot X) + C_K a_K \exp(iK\cdot X)]. \tag{11}$$

In this expression, $a_K$, $a_K^\dagger$ are the annihilation and
creation operators of phonons of momentum $K$, frequency
$\omega_K$, coupled to the electron via the coupling
coefficient $C_K$; $P$ is the momentum of the electron;
$X$ is its coordinate; $m$ is its effective mass calculated in
a fixed lattice; $V$ is the crystal volume. We take
$\hbar=1$, $m=1$.

As a specific example we shall take the simplified
model of Fröhlich2 in which $\omega_K=1$ independent of $K$,
and $C_K = i2^{3/4}\pi^{1/2}\alpha^{1/2}/|K|$, where $\alpha$ is a constant related
to the dielectric constant; intermediate coupling corresponds
to $\alpha\sim6$.

The quantity $\rho_a$, the initial distribution, should be
$e^{-\beta H}$ for the full Hamiltonian $H$. If the time (a) is
sufficiently far in the past we can just as well take
$\rho_a = \mathrm{const}\times\exp(-\beta\sum_K\omega_Ka_K^\dagger a_K)$. That is, we may
assume that in the past only the oscillators were in
thermal equilibrium at temperature $\beta^{-1}$. As a result of
the coupling, the entire system will come very quickly 
to thermal equilibrium at the same temperature. The 
energy of the single electron and its coupling are in- 
finitesimal (of the order 1/V) relative to the heat bath 
of the system of phonon oscillators, so that the exchange 
of energy between the electron and the lattice will bring 
everything to thermal equilibrium at the original lattice 
temperature. 

With this choice of $\rho_a$ the dependence of $U$, $U'$, and
$\rho_a$ in (7) on the phonon oscillator coordinates is sufficiently
simple so that the oscillator coordinates may be
eliminated and the entire expression reduced to a double
path integral involving the electron's coordinates only.
This reduction, explained in Appendix A, is carried out
by methods analogous to those used before by one of
the authors on problems in electrodynamics.9

The result is (taking $a \to -\infty$, $b \to +\infty$)

$$g = \int\!\!\int e^{i\Phi}\,\mathcal{D}X(t)\,\mathcal{D}X'(t), \tag{12}$$


where 


E’()- X’(O ]dt 


7 ad +20 
= i aot [. {exp[iK- X(4) ]—exp[zK: X’()]} X y(@x, ¢—5) 


X {exp(—iK- X(s)]}-exp[—iK- X’(s)}+-ta(wx, t—s){exp[—7K- X(s)]—exp[—iK- X’(s) J} Jdsdt. 


(13) 


The functions $y(\omega,\tau)$ and $a(\omega,\tau)$ are given in Appendix A. In the special case of Fröhlich’s Hamiltonian the integral
on $K$ can be performed to give

$$\Phi = \int_{-\infty}^{\infty}\left[\frac12\left(\frac{dX(t)}{dt}\right)^2 - \frac12\left(\frac{dX'(t)}{dt}\right)^2 + E(t)\cdot X(t) - E'(t)\cdot X'(t)\right]dt + \frac{i\alpha}{2^{3/2}}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\left\{\frac{e^{-i|t-s|} + 2P(\beta)\cos(t-s)}{|X(t)-X(s)|} + \frac{e^{+i|t-s|} + 2P(\beta)\cos(t-s)}{|X'(t)-X'(s)|} - \frac{2[e^{-i(t-s)} + 2P(\beta)\cos(t-s)]}{|X'(t)-X(s)|}\right\}dt\,ds, \tag{14}$$

where $P(\beta) = [e^{\beta}-1]^{-1}$.

The double integral $\mathcal{D}X(t)\,\mathcal{D}X'(t)$ is only over those paths which satisfy the boundary condition $X(t) - X'(t) = 0$
at times $t$ approaching $\pm\infty$. The boundary conditions on the paths at large positive or negative times reflect
the arbitrariness of the initial electron state.
Thus, we have reduced the problem of finding $G$ via (7),
(8), and (9) to that of finding the dependence of a path
integral (12) on the forcing functions $E$ and $E'$. This
expression is exact [for the Hamiltonian (11)] but quite
complicated. In the next section we discuss approximate
methods of evaluation.


III. A METHOD OF APPROXIMATION


In I, a path integral, similar to (14), had to be
evaluated. It was argued there that in some rough
approximation the “interaction of the charge with
itself” represented there by a term in the action function
$S$, $2^{-3/2}\alpha e^{-|t-s|}|X(t)-X(s)|^{-1}$, might be imitated by a
function $S_0$ in which this term is replaced by
$\tfrac12 Ce^{-w|t-s|}[X(t)-X(s)]^2$. One may think of the interaction
term in $S$ as indicating that at time $t$ the particle
acts as though it were in a potential
$2^{-1/2}\alpha\int_{-\infty}^{t}e^{-(t-s)}|X(t)-X(s)|^{-1}\,ds$ resulting from the electrostatic
interaction of the electron with its mean charge density
of its previous positions [the weight for different times
being $e^{-(t-s)}$]. The assumption then is that such a
potential may be roughly replaced by a parabolic
potential centered at the mean position of the electron
in the past [the weight for different times being
$e^{-w(t-s)}$]. In fact, the extra parameter $w$ can be adjusted
to compensate partly for the error of using a parabolic
potential in place of the true potential form. This
argument strongly suggests that the dynamical behavior
of the electron (its motion under an applied electric
field) might be described approximately if we replace
$\Phi$ by a $\Phi_0$, where

$$\Phi_0 = \int_{-\infty}^{\infty}\left[\frac12\left(\frac{dX(t)}{dt}\right)^2 - \frac12\left(\frac{dX'(t)}{dt}\right)^2 + E(t)\cdot X(t) - E'(t)\cdot X'(t)\right]dt - \frac{iC}{2}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\Big\{[X(t)-X(s)]^2[e^{-iw|t-s|} + 2P(\beta w)\cos w(t-s)] + [X'(t)-X'(s)]^2[e^{+iw|t-s|} + 2P(\beta w)\cos w(t-s)] - 2[X'(t)-X(s)]^2[e^{-iw(t-s)} + 2P(\beta w)\cos w(t-s)]\Big\}\,dt\,ds. \tag{15}$$


The parameters $C$ and $w$ are to be determined so as
to approximate $\Phi$ as closely as possible. At zero temperature
($P=0$), we shall fix $C$ and $w$ at the values
given in I. The assumption that $\Phi_0$ is a good approximation
to $\Phi$ for computing the mobility at low temperatures
is based on the supposition that the comparison
Lagrangian, which gives a good fit to the ground-state
energy at zero temperature, will also give the
dynamical behavior of the system. In finding the
ground-state energy, the parameters can be chosen by
a variational principle but we know of no such principle
for the mobility. At finite temperatures the parameters
$C$ and $w$ can be determined from a variational principle
for the free energy which is a direct extension of the 
method used in I for the ground-state energy, and re- 
duces to it in the zero-temperature limit. Others!®.! 
have derived in detail the expressions from which the 
best C and w may be determined for finite 8. Thus, C 
and w can be considered as known functions of a and 8 
even though, unfortunately, no closed analytic form 
exists, and in any specific calculation they would have 
to be evaluated numerically.

Actually, we shall not be satisfied merely to replace
$\Phi$ by $\Phi_0$, but we shall obtain a first correction to $z(\nu)$
by studying, in the next section, the first term in an


*Y. Osaka, Progr. Theoret. Phys. (Kyoto) 22, 437 (1959). 


® M. A. Krivologz and S. I. Pekar, Bull. Acad. Sci. U.S.S.R. 
21, 1, 13, 29 (1957), 


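expansion of $\exp[i(\Phi-\Phi_0)]$:

$$g = \int\!\!\int e^{i\Phi}\,\mathcal{D}X\,\mathcal{D}X' = \int\!\!\int e^{i\Phi_0}\,\mathcal{D}X\,\mathcal{D}X' + \int\!\!\int e^{i\Phi_0}\,i(\Phi-\Phi_0)\,\mathcal{D}X\,\mathcal{D}X' + \cdots. \tag{16}$$

In this section, however, we will consider only the
first term,

$$g_0 = \int\!\!\int e^{i\Phi_0}\,\mathcal{D}X\,\mathcal{D}X'. \tag{17}$$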


We can expect to evaluate the integral (17) exactly, 
because the expression for $\Phi_0$ is a quadratic form in
$X(t)$, $X'(t)$ and all such “Gaussian” path integrals can
be evaluated exactly.® There are several ways to perform 
the integration in (17). One way is to observe that the 
expression (15) is obtained by eliminating the variable 
Y from a system in which an electron interacts with a 
single particle described by the Lagrangian 


$$L_0 = \tfrac12(dX/dt)^2 + \tfrac12M(dY/dt)^2 - \tfrac12k(X-Y)^2 + E\cdot X. \tag{18}$$

If we calculate the $g$ for such a system by integrating
over all the $Y$ variables first, then (17) results, provided
we choose $k = (v^2-w^2)$ and $M = (v^2-w^2)/w^2$, where
$v^2 = w^2 + 4C/w$. However, $L_0$ can be re-analyzed as the
sum of two normal modes,

$$L_0 = \tfrac12(M+1)(dX_0/dt)^2 + E\cdot X_0 + \frac{M}{2(M+1)}[(dX_1/dt)^2 - v^2X_1^2] - \frac{M}{M+1}E\cdot X_1, \tag{19}$$

so that $g_0$ can be written as the product of two factors,
one for each harmonic oscillator. For a single oscillator
of mass $m$ and frequency $\omega$, coupled as $\Gamma(t)\cdot X$, the
value of $g$ is given in Appendix A. Here we have two
oscillators, one of mass $m = M+1 = v^2/w^2$ and frequency
$\omega_0 = 0$ coupled with a $\Gamma(t) = E(t)$, and a second oscillator
with $m = M/(M+1)$, $\omega_1 = v$, and $\Gamma(t) = -[M/(M+1)]E(t)
= -[(v^2-w^2)/v^2]E(t)$. If the contributions of the two
normal modes are combined, $g_0$ takes the form


1 
o= exp(— 
4 


T 


+00 
Ch») f'(—) KOP)+ £0) Vo) 


-—* 


+i€/0)~ fo) dalo)idr) (20) 
where!! 
$$Y_0(\nu) = -(\nu^2-w^2)/\{(\nu-i\epsilon)^2[(\nu-i\epsilon)^2-v^2]\} \tag{21}$$


and 


w( 2w? 
lov) = “| 
2 v%Be? 


(2—w? 


x ‘tovt+a0—9i}. (22) 


v 


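We have expressed $E(t)$ and $E'(t)$ by their Fourier
transforms

$$f(\nu) = \int_{-\infty}^{+\infty} E(t)e^{-i\nu t}\,dt. \tag{23}$$

To obtain $G_0(\tau-\sigma)$ we must evaluate $g_0$ for $E$ and $E'$
given in (8), that is to say, we must substitute

$$f(\nu) = \epsilon e^{-i\nu\sigma} + \eta e^{-i\nu\tau}, \qquad f'(\nu) = \epsilon e^{-i\nu\sigma} - \eta e^{-i\nu\tau} \tag{24}$$

into (20), and find the term of order $\epsilon\eta$. Evidently

$$G_0(\tau-\sigma) = \frac{+i}{2\pi}\int_{-\infty}^{+\infty} Y_0(\nu)e^{i\nu(\tau-\sigma)}\,d\nu, \tag{25}$$

where $G_0(\tau-\sigma)$ is the zeroth order approximation to
$G(\tau-\sigma)$ [Eq. (2)]. Therefore,

$$G_0(\nu) = +iY_0(\nu). \tag{26}$$

Since $Y_0$ is the classical response function for the
comparison system $L_0$, the result has the immediate


11 $\epsilon$ is a small positive quantity and the limit $\epsilon \to 0$ is to be
taken.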
interpretation that $G_0(\nu)$ is the response we would have
predicted for the system $L_0$ had we treated it classically.
In addition, there is no temperature dependence in $G_0$
(except through the variation of the parameters $v$, $w$ with
temperature). Both of these well-known results follow
from the linearity of $L_0$.12,13

For a particle of mass $m$, for low frequencies,
$G = -i/m\nu^2$ so that a comparison of this expression with
(21) and (26) gives an effective electron mass $m = v^2/w^2$.
This value for $m$ is not the same as the more accurate
value given in I, but as Schultz has shown, it is numerically
not very different over a wide range of $\alpha$. Thus,
the reactive part of $G_0$ may be satisfactory, but the
dissipative (real) part, appearing as it does all at the
single frequency $v$, must be only a very crude
approximation.
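
The statement that the two normal modes of (19) combine into the response function (21), and that its low-frequency limit corresponds to a mass $v^2/w^2$, can be checked numerically. The sketch below is an illustration only: the function names are invented for this example, the frequency is taken real and away from the poles (so the $i\epsilon$ is dropped), and "response" here means displacement per unit force for a driving term $E\cdot X$ in the Lagrangian.

```python
def Y0(nu, v, w):
    """Response function of Eq. (21) for real nu away from nu = 0 and nu = v."""
    return -(nu ** 2 - w ** 2) / (nu ** 2 * (nu ** 2 - v ** 2))


def Y0_from_modes(nu, v, w):
    """Same response assembled from the two normal modes of Eq. (19)."""
    M = (v ** 2 - w ** 2) / w ** 2          # mass of the fictitious particle
    free = -1.0 / ((M + 1.0) * nu ** 2)     # centre of mass: free particle, mass M + 1
    mu = M / (M + 1.0)                      # mass of the relative (oscillator) mode
    coupling = M / (M + 1.0)                # that mode is driven by -(M/(M+1)) E
    bound = coupling ** 2 / (mu * (v ** 2 - nu ** 2))
    return free + bound


def low_frequency_mass(v, w, nu=1e-4):
    """Effective mass read off from Y0 -> -1/(m nu^2) as nu -> 0."""
    return -1.0 / (nu ** 2 * Y0(nu, v, w))
```

For any admissible $v > w$ the two functions agree, and `low_frequency_mass(v, w)` approaches $v^2/w^2$.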

In the next section we shall compute the corrections
implied by the additional expansion terms in (16). We
shall find that the mass is now exactly that given in I,
and that the dissipation has a much more realistic
behavior.


IV. FIRST CORRECTION TERM 


To evaluate the second term on the right-hand side
of (16) we shall have to integrate $e^{i\Phi_0}(\Phi-\Phi_0)$. In order
to see what is involved, consider only one of the terms
arising from $\Phi e^{i\Phi_0}$:


aK 
r fol 
(2r)* 
7 


XDLy(ox, t—s)+ie(wx, Islan] 


i exp {5K-[X()— X(s)]) 


XDX()DX'(t). (27) 


Other terms from $\Phi$ are similar to (27) with some replacements
of $X$ by $X'$, while the terms from $\Phi_0$ we will
consider later. Evaluation of (27) requires a knowledge
of the path integral

$$R(K,t,s) = \int\!\!\int e^{i\Phi_0}\exp\{iK\cdot[X(t)-X(s)]\}\,\mathcal{D}X\,\mathcal{D}X'. \tag{28}$$

Once $R(K,t,s)$ has been evaluated, (27) becomes an
ordinary multiple integral:


a he 
(29) 


Cx! a [ R(K 3) 

(2m)* = 
XDy(ox, (—s)+2e(wx, i—s) ]dsdt. 

12 F. L. Vernon, Jr., Ph.D. thesis, California Institute of
Technology, 1959 (unpublished).

13 R. W. Hellwarth, Hughes Research Laboratories, Fourth
Quarterly Progress Report, September 15, 1958 (unpublished).
The function R is easily evaluated. It is clearly given by
our general formula (20) with 


fo) = (ene EE Kee) (30) 


and'4 


f(v) = (ee~”7—~ ne~*)f. 


(31) 


Similarly, a term of the form (27) with X(¢) replaced 
by X’(¢) can be expressed in terms of our general path 
integral (20) by using the proper f’s, 


f(v) = (ee~ #74 ne—#)t— Kee (32) 


f'(v) = (ee — ne-r)t— Kem, (33) 


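In this way, $\int e^{i\Phi_0}\Phi\,\mathcal{D}X\,\mathcal{D}X'$ can be evaluated. Similarly
the term $\int e^{i\Phi_0}\Phi_0\,\mathcal{D}X\,\mathcal{D}X'$ may be obtained. To get
$\int e^{i\Phi_0}[X(t)-X(s)]^2\,\mathcal{D}X\,\mathcal{D}X'$, one can differentiate $R$
with respect to $K$ twice and evaluate it at $K=0$.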
(Details are given in Appendix B.) 

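The final result for the first-order change in $G$ is

$$G_1(\nu) = -iY_0^2(\nu)[\chi(\nu) - (4C/w)\nu^2/(\nu^2-w^2)], \tag{34}$$

where

$$\chi(\nu) = \int_0^\infty [1-e^{i\nu u}]\,\mathrm{Im}\,S(u)\,du, \tag{35a}$$

and

$$S(u) = \int\frac{d^3K}{(2\pi)^3}|C_K|^2\frac{2K^2}{3}\exp[-\tfrac12K^2D(u)]\,[\exp(i\omega_Ku) + 2P(\beta\omega_K)\cos(\omega_Ku)]. \tag{35b}$$

“Im” means the imaginary part and the function $D(u)$
is defined as

$$D(u) = \frac{w^2}{v^2}\left\{\frac{v^2-w^2}{w^2v}\,[1 - e^{ivu} + 4P(\beta v)\sin^2(vu/2)] - iu + \frac{u^2}{\beta}\right\}. \tag{35c}$$

For Fröhlich’s Hamiltonian, the integration over $K$ may
be done to give

$$S(u) = \frac{2\alpha}{3\sqrt\pi}[D(u)]^{-3/2}[e^{iu} + 2P(\beta)\cos u]. \tag{36}$$

We have found an approximate form for $G(\nu)$:

$$G(\nu) = G_0(\nu) + G_1(\nu). \tag{37}$$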


From it we may find the impedance to first order in
$G_1(\nu)$:

$$\nu z(\nu) = 1/G(\nu) \approx 1/G_0(\nu) - [1/G_0(\nu)]^2G_1(\nu). \tag{38}$$

The question arises as to whether it is more accurate to
expand in this way or to leave the formula as $\nu z(\nu)
= (G_0+G_1)^{-1}$. Of course, if $G_1$ were truly small it would
not matter. However, there are excellent reasons to
believe that the expanded form is far more accurate.
This is best explained by considering a simple example
of a free particle to which we add a harmonic binding
as a perturbation. The resulting $G$'s are [$\Phi_0$ arising
from $\tfrac12m(dX/dt)^2$ only],

$$G_0(\nu) = +iY_0(\nu) = -i/m\nu^2 \tag{39a}$$

and

$$G_1(\nu) = -i\omega_0^2/m\nu^4, \tag{39b}$$

where $\omega_0$ is the natural frequency of the oscillator. In
this case the expanded form of $z(\nu)$ is

$$i\nu z(\nu) = m(\omega_0^2-\nu^2). \tag{40}$$

The true $G(\nu)$ shows a structure (resonance at $\nu=\omega_0$)
which is not reflected in an expanded form of $G$, but
which is precisely duplicated (for this linear system) if
one expands $z(\nu)$. Therefore, we substitute (34) into (38)
to obtain the simple result,

$$-i\nu z(\nu) = \nu^2 - \chi(\nu). \tag{41}$$

With $\chi(\nu)$ given by (35), this is our final expression for
the impedance of the polaron. It is Eqs. (41) and (35)
which we will evaluate in various limits and discuss in
the following sections.

The first term on the right-hand side of (41) is a pure
free-particle term, while $\chi(\nu)$ contains all of the corrections
due to the interaction with phonons. The entire
dependence of our results (41) on the trial action $\Phi_0$ is
in $D(u)$, Eq. (35c). $D(u)$ in turn appears only in the
exponential term in Eq. (35b). This exponential is an
effect due to recoil as can be seen by expanding the
$\exp(iK\cdot X)$ term in the Hamiltonian as $1+iK\cdot X$ (which
we may call a dipole, or linear coupling approximation).
If the expansion is made, then the exponential term
$\exp[-\tfrac12K^2D(u)]$ in (35b) will not appear. In other words, if
we had any problem in which the field oscillators were
coupled linearly to the electron’s coordinate $X$, then
our method would give us the exact formula for the
impedance irrespective of the choice made for the trial
functional $\Phi_0$.15 This is in fact the best argument for
treating the perturbation expansion (16) as an expansion
for $z(\nu)$ in the manner of Eq. (38).

Therefore, insofar as the system of phonons behaves
as though they were linearly coupled, so there were no
recoil effects, (41) is exact. However, recoil effects are
included in (41); it is only that they are not included
precisely. They are approximated by finding their effect
for the imitative functional $\Phi_0$ rather than the true
functional $\Phi$. For this reason we expect (41) to be an
excellent approximation to the true impedance of the
polaron.


14 Equation (20) is a one-dimensional formula. For the case of
vector forces the product of two f’s is to be interpreted as a dot
product.

15 Assuming that $\Phi_0$ also implies a linear coupling and is a
quadratic functional of $X(t)$, $X'(t)$.
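
The free-particle-plus-harmonic-binding example of Eqs. (38)-(40) is easy to verify with complex arithmetic. The sketch below is an illustration only: the function names are invented here, and it assumes the sign convention $G_0 = -i/m\nu^2$, $G_1 = -i\omega_0^2/m\nu^4$ (the first two terms of the exact $G = -i/m(\nu^2-\omega_0^2)$ expanded in $\omega_0^2$). It compares the expanded form (38) with the unexpanded $(G_0+G_1)^{-1}$.

```python
def expanded_nu_z(nu, m, omega0):
    """nu*z(nu) from the expansion of Eq. (38): 1/G0 - G1/G0^2."""
    G0 = -1j / (m * nu ** 2)
    G1 = -1j * omega0 ** 2 / (m * nu ** 4)
    return 1.0 / G0 - G1 / G0 ** 2


def summed_nu_z(nu, m, omega0):
    """nu*z(nu) from the unexpanded form 1/(G0 + G1)."""
    G0 = -1j / (m * nu ** 2)
    G1 = -1j * omega0 ** 2 / (m * nu ** 4)
    return 1.0 / (G0 + G1)


def exact_i_nu_z(nu, m, omega0):
    """Right-hand side of Eq. (40) for a harmonically bound particle."""
    return m * (omega0 ** 2 - nu ** 2)
```

Multiplying `expanded_nu_z` by `1j` reproduces `exact_i_nu_z` at every frequency, including the zero at $\nu=\omega_0$, whereas `summed_nu_z` has no such resonance.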
V. BEHAVIOR OF THE IMPEDANCE


For purposes of further analysis in which we shall
change the contour of integration on the variable $u$, we
list some properties of $S(u)$. For real $u$, $S^*(u) = S(-u)$
and $S(iu)$ is purely real. In addition $S(u) = S(i\beta-u)$
for complex $u$. The real part of $D(u)$ must be positive
in order for the $K$ integral to converge. Therefore, the
region of $u$ where this happens, namely the strip parallel
to the real axis between the lines $u$ = real and
$u$ = real + $i\beta$, is the region free of singularities for $S(u)$.
In the limit of zero temperature ($\beta \to \infty$) this strip,
over which $S(u)$ is analytic, widens to include the entire
upper half-plane.


1. Zero Temperature, $\nu<1$; Effective Mass


For Fröhlich’s case we consider first the case $\nu<1$,
$\beta=\infty$. Then the path of integration [in the integral for
$\chi(\nu)$] along the real axis may be rotated to the path along
the positive or negative imaginary axis $u=0$ to $i\infty$
(depending on the sign of $e^{\pm i\nu u}$). The resulting expression is

$$-i\nu z(\nu) = \nu^2 - \frac{2\alpha}{3\sqrt\pi}\left(\frac{v}{w}\right)^3\int_0^\infty e^{-\tau}(1-\cosh\nu\tau)\left[\frac{v^2-w^2}{w^2v}(1-e^{-v\tau}) + \tau\right]^{-3/2}d\tau. \tag{42}$$


Therefore, $z$ is purely imaginary for $\nu<1$ and there is
no dissipation at the absolute zero of temperature (a dc
field will continue to accelerate the electrons indefinitely).
The reason for this behavior is simply that there
are no existing phonons for the electrons to scatter off
and none can be created by the electrons until the
frequency $\nu$ of the applied field is high enough to excite
the electrons to a state of energy $\hbar\nu$ higher than the
energy $\hbar\omega$ needed to create a phonon. If there is a range
of frequencies $\omega$ down to zero (as for acoustic modes)
then a resistance exists at any frequency of the applied
field and at zero temperature. In the Fröhlich model it
begins at $\nu=1$. Of course, a dc field will eventually
speed the electrons up until they can radiate phonons
and dissipate energy. However, this is a nonlinear effect
in the applied field strength and is not described by a
theory of the impedance. For extremely low frequencies
$\nu$ we can put $(1-\cosh\nu\tau) \approx -\nu^2\tau^2/2$. The result is that

$$-i\nu z(\nu) = \nu^2\left\{1 + \frac{\alpha}{3\sqrt\pi}\left(\frac{v}{w}\right)^3\int_0^\infty \tau^2e^{-\tau}\left[\frac{v^2-w^2}{w^2v}(1-e^{-v\tau}) + \tau\right]^{-3/2}d\tau\right\}. \tag{43}$$


The polaron behaves like a free particle with an effective 
mass. This mass is the same as the one derived in I, by 
a modification of the variational ground-state energy 
calculation.

2. General Expression for Dissipation


The analytic properties of $S(u)$ outlined at the beginning
of this section allow one to rewrite the expression
for $\mathrm{Im}\,\chi(\nu)$ in a form more convenient for computation.
We may write (35a) as

$$\mathrm{Im}\,\chi(\nu) = \mathrm{Im}\int_0^\infty \sin(\nu u)S(u)\,du. \tag{44}$$

Using the fact that $S(u)$ is analytic between $u$ = real and
$u$ = real + $i\beta$, we may change the contour of integration
in (44) from along the real axis to one which goes first
from 0 to $i\beta/2$ up the imaginary axis and then from $i\beta/2$
to $i\beta/2+\infty$ parallel to the real axis. (The closing piece
of the contour required at infinity does not contribute.)
Because $S(iu)$ and $\sin(i\nu u)\,d(iu)$ are real, the leg of the
contour up the imaginary axis contributes nothing to
(44), which requires the imaginary part. The contribution
from the remaining part of the contour (from
$i\beta/2$ to $i\beta/2+\infty$) gives


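$$\mathrm{Im}\,\chi(\nu) = \sinh(\beta\nu/2)\int_0^\infty \cos(\nu u)\,\Sigma(u)\,du, \tag{45a}$$

where Σ(u) = S(u + iβ/2) is given by

$$\Sigma(u) = \frac{2}{3}\int\frac{d^3K}{(2\pi)^3}\,|C_K|^2K^2\,\frac{\cos(\omega_K u)}{\sinh(\beta\omega_K/2)}\,\exp[-\tfrac12 K^2\Delta(u)] \tag{45b}$$

and Δ(u) = D(u + iβ/2):

$$\Delta(u) = \frac{w^2}{v^2}\left\{\frac{v^2-w^2}{w^2v}\,\frac{\cosh(\beta v/2)-\cos(vu)}{\sinh(\beta v/2)} + \frac{u^2}{\beta} + \frac{\beta}{4}\right\}. \tag{45c}$$

The dc mobility μ for the polaron is given by μ⁻¹ = lim_{ν→0} Im χ(ν)/ν. Our result (45a), therefore, gives

$$\mu^{-1} = \frac{\beta}{2}\int_0^\infty \Sigma(u)\,du. \tag{46}$$


For the case of Fröhlich's Hamiltonian, we find that

$$\mathrm{Im}\,\chi(\nu) = \frac{2\alpha}{3\sqrt{\pi}}\,\frac{\beta^{3/2}\sinh(\beta\nu/2)}{\sinh(\beta/2)}\left(\frac{v}{w}\right)^3\int_0^\infty\frac{\cos(\nu u)\cos(u)\,du}{[u^2+a^2-b\cos(vu)]^{3/2}}, \tag{47a}$$

where

$$a^2 = \beta^2/4 + R\beta\coth(\beta v/2), \qquad b = R\beta/\sinh(\beta v/2), \tag{47b}$$

and

$$R = (v^2-w^2)/(w^2v). \tag{47c}$$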
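The step from (45c) to the a², b form used in (47a) can be checked numerically. The following is a minimal Python sketch, assuming the readings Δ(u) = (w²/v²){R[cosh(βv/2) − cos(vu)]/sinh(βv/2) + u²/β + β/4}, a² = β²/4 + Rβ coth(βv/2), b = Rβ/sinh(βv/2) and R = (v² − w²)/(w²v). The function names are illustrative and do not come from the paper.

```python
import math

def delta_direct(u, beta, v, w):
    """Delta(u) written as in (45c) (assumed reading of the garbled equation)."""
    R = (v**2 - w**2) / (w**2 * v)
    osc = (math.cosh(beta * v / 2) - math.cos(v * u)) / math.sinh(beta * v / 2)
    return (w / v) ** 2 * (R * osc + u**2 / beta + beta / 4)

def delta_ab(u, beta, v, w):
    """The same quantity written as (w^2 / (beta v^2)) [u^2 + a^2 - b cos(v u)]."""
    R = (v**2 - w**2) / (w**2 * v)
    a2 = beta**2 / 4 + R * beta / math.tanh(beta * v / 2)
    b = R * beta / math.sinh(beta * v / 2)
    return (w / v) ** 2 / beta * (u**2 + a2 - b * math.cos(v * u))
```

If the two functions agree for arbitrary u, then exp[−½K²Δ(u)] integrated over K with |C_K|² proportional to 1/K² yields the factor [u² + a² − b cos(vu)]^(−3/2) that appears in (47a).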


3. Dissipation at Low Temperatures 


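For low temperatures e^{−β} and, therefore, e^{−βv} are very much less than one, so that (47a) may be expanded as a power series in b.

$$\mathrm{Im}\,\chi(\nu) = \frac{2\alpha}{3\sqrt{\pi}}\,\beta^{3/2}\,\frac{\sinh(\beta\nu/2)}{\sinh(\beta/2)}\left(\frac{v}{w}\right)^3\int_0^\infty\left[\frac{\cos(\nu u)\cos(u)}{(u^2+a^2)^{3/2}} + \frac{3\beta Re^{-\beta v/2}\cos(\nu u)\cos(u)\cos(vu)}{(u^2+a^2)^{5/2}} + \frac{15}{4}\,\frac{\beta^2R^2e^{-\beta v}\cos(\nu u)\cos(u)}{(u^2+a^2)^{7/2}}\,[1+\cos(2vu)] + \cdots\right]du. \tag{48}$$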


Now an integral like ∫cos(λu)du/(u²+a²)^{3/2} falls off exponentially like (2π|λ|)^{1/2}a^{−3/2}e^{−|λ|a} as |λ| increases. Thus, the smallest values of λ count, and these count with the smallest power of e^{−β} in front. This permits us to select the important terms for each ν. For example, the last term in brackets contributes when ν ≈ 2v+1 for there is a contribution from cos(νu) cos(u) cos(2vu). The e^{−βv} is compensated for by the e^{β(ν−1)/2} in front.
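The quoted fall-off can be tested numerically. The sketch below assumes the integral runs over the whole real line and that the asymptotic form is (2π|λ|)^{1/2}a^{−3/2}e^{−|λ|a}. It uses a plain trapezoidal rule with a cutoff `u_max`, both of which are choices made here for illustration rather than anything taken from the paper.

```python
import math

def whole_line_integral(lam, a, u_max=400.0, n=400_000):
    """Trapezoidal estimate of the integral of cos(lam*u)/(u^2+a^2)^(3/2) over all real u."""
    h = u_max / n
    # Endpoint contributions of the trapezoidal rule on [0, u_max].
    total = 0.5 * (1.0 / a**3 + math.cos(lam * u_max) / (u_max**2 + a**2) ** 1.5)
    for k in range(1, n):
        u = k * h
        total += math.cos(lam * u) / (u * u + a * a) ** 1.5
    return 2.0 * h * total  # the integrand is even in u

def asymptotic_form(lam, a):
    """Large-|lam|*a estimate quoted in the text (assumed reading)."""
    return math.sqrt(2.0 * math.pi * abs(lam)) * a ** -1.5 * math.exp(-abs(lam) * a)
```

The ratio of the two functions should approach one as |λ|a grows, which is the sense in which only the smallest |λ| matter at low temperature, where a is close to β/2.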

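For ν < 1+v, and |1−ν| ≫ β⁻¹, only the first term in the expansion contributes and we obtain

$$\mathrm{Im}\,\chi(\nu) = \tfrac{2}{3}\alpha\left(\frac{v}{w}\right)^3(1-e^{-\beta\nu})\,e^{\beta(\nu-1)/2}\left[\,|\nu-1|^{1/2}e^{-\beta|\nu-1|/2}e^{-R|\nu-1|} + (\nu+1)^{1/2}e^{-\beta(1+\nu)/2}e^{-R(1+\nu)}\right]. \tag{49}$$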


As β → ∞ this shows a threshold at ν = 1. Below ν = 1 the result is nearly zero; above, it is (2/3)α(v/w)³(ν−1)^{1/2} exp[−R(ν−1)]. This is the threshold to create one optical phonon from the energy quantum ℏν supplied by the external field. If we have an excess energy ν−1, the final electron has momentum proportional to (ν−1)^{1/2} and this appears as a factor because of the phase space available. The cross section depends in some way on the frequency above threshold; the factor exp[−R(ν−1)] is a rough approximation to this, generated by our model Φ₀.

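To study the dependence on β in a little more detail, in case ν < 1, (49) can be rewritten as

$$\mathrm{Im}\,\chi(\nu) = \tfrac{2}{3}\alpha\left(\frac{v}{w}\right)^3\left[(1-\nu)^{1/2}e^{-\beta(1-\nu)}e^{-R(1-\nu)} + (1+\nu)^{1/2}e^{-\beta}e^{-R(1+\nu)} - (1-\nu)^{1/2}e^{-\beta}e^{-R(1-\nu)} - (1+\nu)^{1/2}e^{-R(1+\nu)}e^{-\beta(1+\nu)}\right]. \tag{50}$$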


The terms are easily understood. In the first, an electron absorbs a quantum of energy ν to emit a phonon with energy one. The chance that the electron has enough energy, 1−ν, to do this is e^{−β(1−ν)}. In the second, the electron absorbs a quantum ν and also absorbs a phonon (they are present in number e^{−β}) so the outgoing electron momentum is (1+ν)^{1/2}. The third corresponds to the electron absorbing a phonon and emitting a quantum ν to the electric field. This emission contributes negatively to the energy loss of the electric field. The fourth term results from a particularly energetic electron of energy exceeding 1+ν radiating a phonon and emitting a quantum to the field.
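The four-process decomposition can be compared with the compact low-temperature form by direct evaluation. The sketch below assumes the readings of (49) and (50) implied by the term-by-term description above, with R = (v² − w²)/(w²v). The function names and the sample parameter values are illustrative only.

```python
import math

def coupling_R(v, w):
    """R = (v^2 - w^2) / (w^2 v), the assumed reading of (47c)."""
    return (v**2 - w**2) / (w**2 * v)

def im_chi_eq49(nu, alpha, beta, v, w):
    """Compact low-temperature form, first term of the expansion only."""
    R = coupling_R(v, w)
    prefactor = (2.0 / 3.0) * alpha * (v / w) ** 3
    front = (1.0 - math.exp(-beta * nu)) * math.exp(beta * (nu - 1.0) / 2.0)
    d = abs(nu - 1.0)
    near = math.sqrt(d) * math.exp(-beta * d / 2.0 - R * d)
    far = math.sqrt(1.0 + nu) * math.exp(-beta * (1.0 + nu) / 2.0 - R * (1.0 + nu))
    return prefactor * front * (near + far)

def im_chi_eq50_terms(nu, alpha, beta, v, w):
    """The four physical processes for nu < 1, in the order discussed in the text."""
    R = coupling_R(v, w)
    prefactor = (2.0 / 3.0) * alpha * (v / w) ** 3
    lo = math.sqrt(1.0 - nu) * math.exp(-R * (1.0 - nu))
    hi = math.sqrt(1.0 + nu) * math.exp(-R * (1.0 + nu))
    return [
        prefactor * lo * math.exp(-beta * (1.0 - nu)),   # absorb quantum, emit phonon
        prefactor * hi * math.exp(-beta),                # absorb quantum and phonon
        -prefactor * lo * math.exp(-beta),               # absorb phonon, emit quantum
        -prefactor * hi * math.exp(-beta * (1.0 + nu)),  # emit phonon and quantum
    ]
```

Summing the list returned by `im_chi_eq50_terms` should reproduce `im_chi_eq49` for any 0 < ν < 1, and the two emission processes enter with negative sign, as the text describes.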

Since the inverse dc mobility is given by Im χ(ν)/ν as ν → 0, (50) gives an expression at low temperatures for the dc mobility of the Fröhlich model. Using our trial functional Φ₀, we find that

$$\mu = \left(\frac{w}{v}\right)^3\frac{3}{4\alpha\beta}\,e^{\beta}\exp\!\left[\frac{v^2-w^2}{w^2v}\right]. \tag{51}$$
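The limit ν → 0 can be taken explicitly. The following worked step is a sketch that assumes the four-term low-temperature expression for Im χ(ν) with R = (v² − w²)/(w²v); pairing the terms with equal square-root factors gives

```latex
\begin{align*}
(1-\nu)^{1/2}e^{-R(1-\nu)}\left[e^{-\beta(1-\nu)}-e^{-\beta}\right]
  &= (1-\nu)^{1/2}e^{-R(1-\nu)}e^{-\beta}\left(e^{\beta\nu}-1\right)
  \;\longrightarrow\; \beta\nu\,e^{-\beta}e^{-R},\\
(1+\nu)^{1/2}e^{-R(1+\nu)}\left[e^{-\beta}-e^{-\beta(1+\nu)}\right]
  &= (1+\nu)^{1/2}e^{-R(1+\nu)}e^{-\beta}\left(1-e^{-\beta\nu}\right)
  \;\longrightarrow\; \beta\nu\,e^{-\beta}e^{-R},\\
\mu^{-1} = \lim_{\nu\to0}\frac{\mathrm{Im}\,\chi(\nu)}{\nu}
  &= \tfrac{2}{3}\alpha\left(\frac{v}{w}\right)^{3}\,2\beta\,e^{-\beta}e^{-R}
  = \tfrac{4}{3}\alpha\beta\left(\frac{v}{w}\right)^{3}e^{-\beta}e^{-R},\\
\mu &= \left(\frac{w}{v}\right)^{3}\frac{3}{4\alpha\beta}\,e^{\beta}\,e^{R}.
\end{align*}
```

Each pair contributes equally, which is why the prefactor 2/3 becomes 4/3 in the inverse mobility.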


The dependence of (51) on the coupling strength α is as α⁻¹ for small α (and high β) because the “best” v → w for small α (see I). This dependence on α is of course the same as is derived by perturbation theory. As α becomes large (α ≫ 1) the best parameters satisfy the relation v/w ~ α². Therefore, the mobility μ becomes proportional to α⁻⁷e^{v} at high coupling strengths. The result (51) cannot be compared directly with the results of previous calculations (Refs. 16, 17) because its temperature dependence is e^{β}/β rather than the e^{β} dependence found in the other approaches. At high β (where these previous calculations are valid), the different dependence on temperature would be experimentally unobservable. However, the origin and significance of this (incorrect) temperature dependence is interesting and will be discussed at some length in a later section.

Returning to our general expression for Im χ(ν) (48), we see that there are other thresholds at higher frequencies coming from higher terms in the sum (48). For β = ∞ the next threshold is at ν = v+1, the contribution above threshold being (2α/3)(v/w)³(ν−v−1)^{3/2} exp[−R(ν−v−1)]. These higher thresholds correspond to exciting the electron to an excited state of energy v and emitting a phonon. The position of this excited state (at v) and the higher ones along with the selection rule that says these cannot be excited without the emission of a p-state phonon is a fiction supplied by our imitating action Φ₀.

Of course, for strong couplings, there will be such 
complicated excited states, with partial selection rules 
leading to a complex curve for Imx(v). For strong 
electron-phonon coupling, the electron is in effect bound 
in a potential which it makes by distorting the lattice in 
its neighborhood. If the lattice were held fixed in this 
distorted state, we would expect the electron to have 
various excited states in this potential. In fact, the 
lattice moves so that “states” are unstable, but for large α the excitation energies are of order α² larger than the lattice frequency so it cannot follow quickly enough.


16 F. Low and D. Pines, Phys. Rev. 98, 414 (1955).
17 Y. Osaka, Progr. Theoret. Phys. (Kyoto) 25, 4, 517 (1961).

FIG. 1. Plot of Im χ(ν) for α = 3, w = 2.5, v = 3.4, and β = 100. ν is plotted in units of the optical mode frequency ω and Im χ(ν) is plotted in units of (m/ω).


The Im χ(ν) should have a maximum when ν is equal to a frequency which can be absorbed in going to such an excited “state.” The widths of these maxima reflect the lifetimes of these “states” for phonon emission. Naturally, we cannot expect our approximate formula (35a) to give such detailed results correctly.¹⁸ It is reassuring, however, that our method gives such a realistic looking behavior, and strongly suggests that it represents a long step forward toward the correct Im χ(ν). In the last section we outline some ways of improving the Im χ(ν). When the coupling is not too strong (α ~ 3), these thresholds are weak and hard to see, and the curves will have a “washed out” appearance. In Figs. 1-3 are given curves of Im χ(ν) vs ν for α = 3, α = 5, α = 7 and for low temperatures (β = 100). These figures show the resonance effects very nicely. As a function of frequency each curve consists of maxima of increasing width. As a function of α the curves show more maxima of decreasing width as α increases. All of the curves were computed numerically on an IBM 7090 computer, using an infinite power series expansion of (47a) in terms of K functions (Bessel functions of imaginary argument). The values of the parameters used were w = 2.5, v = 3.4 for α = 3; w = 2.1, v = 4.0 for α = 5; and w = 1.6, v = 5.8 for α = 7.


FIG. 2. Plot of Im χ(ν) for α = 5, w = 2.1, v = 4.0, and β = 100. ν is plotted in units of the optical mode frequency ω and Im χ(ν) is plotted in units of (m/ω).


18 Pekar, in the strong-coupling limit, using Gaussian trial wave functions for the electron, finds an excited state at precisely v.

FIG. 3. Plot of Im χ(ν) for α = 7, w = 1.6, v = 5.8, and β = 100. ν is plotted in units of the optical mode frequency ω and Im χ(ν) is plotted in units of (m/ω).


4. Behavior at High Temperatures 


For high temperatures, β is small and (disregarding for a moment the variation of w/v with temperature) Δ(u) varies like u²/β. Therefore, only small u will be of importance in the exponent, and we can expand Δ(u) (45c) as

$$\Delta(u) = \frac{1}{\beta}\left[(u^2+\beta^2/4) - \frac{v^2-w^2}{12}\,(u^2+\beta^2/4)^2 + \cdots\right]. \tag{52}$$

The leading term is, therefore, that of perturbation theory, and our formula is insensitive to the trial functional Φ₀. Actually this is even more accurate than it appears, because as T rises the parameters v and w change in just such a way that v² − w² falls making the approximation (52) still better.¹⁰

At very high temperatures the perturbation theory 
works because the electron has on the average, an energy 
high compared to the lattice frequency. In this case the 
effective polarization should fall to zero in the limit of 
infinite temperature. We, therefore, expect that the 
accuracy of our formula will increase as the temperature 
rises. 
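The quality of the high-temperature expansion can be probed numerically. The sketch below assumes the closed form Δ(u) = (w²/v²){R[cosh(βv/2) − cos(vu)]/sinh(βv/2) + u²/β + β/4} with R = (v² − w²)/(w²v), and compares it with the one-term and two-term truncations of the series in (u² + β²/4). The function names are illustrative.

```python
import math

def delta_exact(u, beta, v, w):
    """Closed form of Delta(u) (assumed reading of (45c))."""
    R = (v**2 - w**2) / (w**2 * v)
    osc = (math.cosh(beta * v / 2) - math.cos(v * u)) / math.sinh(beta * v / 2)
    return (w / v) ** 2 * (R * osc + u**2 / beta + beta / 4)

def delta_high_temperature(u, beta, v, w, terms=2):
    """Truncated small-beta, small-u expansion of Delta(u)."""
    X = u * u + beta * beta / 4
    result = X / beta  # free-particle (perturbation theory) term
    if terms >= 2:
        result -= (v * v - w * w) / 12.0 * X * X / beta
    return result
```

For small β and u the one-term truncation is already close to the closed form, and it depends on neither v nor w, which is the insensitivity to the trial functional noted in the text.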


VI. WEAK-COUPLING LIMIT; THE 
BOLTZMANN EQUATION 


In the limit of weak coupling (C_K small or α small) the “best” model parameters are v = w or C = 0. That is, the model of (15) becomes the bare free electron. Perturbation theory is also simply the expansion (16) but with Φ₀ just the free-electron influence functional [i.e., D(u) = −iu + u²/β]. Therefore, our series G₀ + G₁ + ··· for the admittance agrees with perturbation theory in the weak-coupling limit.

One would expect then, that in the limit of weak coupling the mobility in constant fields obtained here would be exactly that obtained from the Boltzmann equation using perturbation theory for the electron-phonon scattering cross sections. This turns out to be not true; the question being whether the limit ν → 0 is taken before or after the limit of weak coupling |C_K|² → 0 is taken. A detailed comparison of our result to that of the Boltzmann equation for weak coupling is instructive.

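The Boltzmann equation for a particle with momentum distribution f(P) in an electric field E(t) in the x direction is

$$\frac{\partial f}{\partial t} + E\,\frac{\partial f}{\partial P_x} = -\int\left[\gamma(P\to P')f(P) - \gamma(P'\to P)f(P')\right]\frac{d^3P'}{(2\pi)^3}. \tag{53}$$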


In (53), γ(P → P′) is the probability per second that an electron of momentum P is scattered to momentum P′ by collisions with the phonon gas. This rate, using the usual perturbation theory, is

$$\gamma(P\to P') = 2\pi|C_K|^2\left\{[1-e^{-\beta\omega_K}]^{-1}\,\delta(\tfrac12P'^2-\tfrac12P^2+\omega_K) + (e^{\beta\omega_K}-1)^{-1}\,\delta(\tfrac12P'^2-\tfrac12P^2-\omega_K)\right\}. \tag{54}$$

The first term in (54) is just the probability to emit a phonon of momentum K = P′ − P. The second term is the probability to absorb a phonon of momentum K.

The Maxwell distribution, f₀(P) = (2π/β)^{−3/2}e^{−βP²/2}, is a solution of (53) with no electric field since

$$\gamma(P\to P')f_0(P) = \gamma(P'\to P)f_0(P'). \tag{55}$$
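The detailed-balance relation can be verified for a single emission and absorption pair. The sketch below assumes the thermal weights [1 − e^{−βω}]⁻¹ for emission and (e^{βω} − 1)⁻¹ for absorption, and the Maxwell distribution f₀(P) = (2π/β)^{−3/2}e^{−βP²/2}. The common factor 2π|C_K|² and the energy delta function are omitted because they are identical on both sides, and the function names are illustrative.

```python
import math

def emission_weight(beta, omega):
    """Thermal factor for emitting a phonon, 1 + n(omega)."""
    return 1.0 / (1.0 - math.exp(-beta * omega))

def absorption_weight(beta, omega):
    """Thermal factor for absorbing a phonon, n(omega)."""
    return 1.0 / (math.exp(beta * omega) - 1.0)

def maxwell(p_squared, beta):
    """Maxwell distribution f0 as a function of P^2 (unit mass)."""
    return (2.0 * math.pi / beta) ** -1.5 * math.exp(-beta * p_squared / 2.0)

def detailed_balance_sides(p_squared, beta, omega):
    """Forward rate (emission from P) and reverse rate (absorption from P')."""
    p_prime_squared = p_squared - 2.0 * omega  # energy conservation on emission
    forward = emission_weight(beta, omega) * maxwell(p_squared, beta)
    reverse = absorption_weight(beta, omega) * maxwell(p_prime_squared, beta)
    return forward, reverse
```

The two returned values coincide whenever P² ≥ 2ω, which is the statement that the collision integral vanishes for the Maxwell distribution.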


If E is a very weak, spatially uniform field varying as E = E₀e^{iνt}, the deviation of f from equilibrium can be written as f = f₀[1 + E₀e^{iνt}P_x h(P)], where h(P) is a function of P² satisfying [using (54)]

$$\left[i\nu h(P) - \beta\right]P_x = -\int\gamma(P\to P')\left[P_xh(P) - P_x'h(P')\right]\frac{d^3P'}{(2\pi)^3}. \tag{56}$$

The current j is −E₀e^{iνt}∫P_x²h(P)f₀(P)d³P, so the impedance is given by 1/z(ν) = −∫P_x²h(P)f₀(P)d³P.
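Multiplying (56) by P_x h(P) f₀(P) and integrating over all P, we find another expression for the impedance:

$$z(\nu) = \frac{\tfrac12\displaystyle\iint f_0(P)\,\gamma(P\to P')\left[P_xh(P)-P_x'h(P')\right]^2\frac{d^3P'\,d^3P}{(2\pi)^3} + i\nu\int P_x^2h^2(P)f_0(P)\,d^3P}{\beta\left[\displaystyle\int P_x^2h(P)f_0(P)\,d^3P\right]^2}. \tag{57}$$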

The integral equation is quite difficult to solve in general. We will content ourselves with an approximate analysis. The expression (57) has been written so that it is stationary¹⁹ for variation of h about the true solution (56). That is, errors in h will appear only in second order in z(ν).

The simplest approximate solution to (56) is that h(P) is a constant. Then (57) gives

$$z(\nu) = i\nu + \Gamma, \tag{58}$$

where

$$\Gamma = \frac{\beta}{2}\iint f_0(P)\,\gamma(P\to P')\,(P_x-P_x')^2\,\frac{d^3P\,d^3P'}{(2\pi)^3}. \tag{59}$$


This represents a frequency-independent pure resistance Γ. Thus, Γ should be compared to the χ(ν) of (35a). To do so we substitute (54) into (59). One sees that the first term in brackets in (54) gives the same contribution as the second, so calling P′ = P − K we get (replacing K_x² by K²/3)

$$\Gamma = \frac{\beta}{3(2\pi)^2}\iint K^2|C_K|^2P(\omega_K)\,f_0(P)\,\delta\!\left[\tfrac12(P-K)^2-\tfrac12P^2-\omega_K\right]d^3P\,d^3K. \tag{60}$$

19 If ν = 0 (57) is a minimum, giving the useful variational principle for μ⁻¹ discussed by Wilson [A. Wilson, Theory of Metals (Cambridge University Press, New York, 1959), p. 300].


Next we replace the δ function by δ(x) = ∫_{−∞}^{∞}e^{ixu}du/2π. The P integral is then readily evaluated to give

$$\Gamma = \int d^3K\,(2\pi)^{-3}|C_K|^2\,\tfrac{1}{3}\beta K^2\int_{-\infty}^{\infty}du\,P(\omega_K)\exp\!\left[-iu\omega_K - \tfrac12K^2(-iu+u^2/\beta)\right]. \tag{61}$$

Replacement here of u by u + iβ/2 shows that Γ = μ⁻¹ from (45a) if Δ(u) takes on its free-particle value, u²/β + β/4.
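The replacement of u by u + iβ/2 can be written out step by step. The following sketch assumes P(ω_K) = (e^{βω_K} − 1)⁻¹, an integrand of the form exp[−iuω_K − ½K²(−iu + u²/β)], and the low-frequency limit μ⁻¹ = (β/2)∫₀^∞Σ(u)du of the general dissipation formula; it also assumes that the shift of contour is permitted.

```latex
\begin{align*}
-i\left(u+\tfrac{i\beta}{2}\right)\omega_K &= -iu\,\omega_K + \tfrac12\beta\omega_K,\\
-i\left(u+\tfrac{i\beta}{2}\right) + \frac{1}{\beta}\left(u+\tfrac{i\beta}{2}\right)^{2}
  &= \frac{u^{2}}{\beta} + \frac{\beta}{4},\\
P(\omega_K)\,e^{\beta\omega_K/2} &= \frac{1}{2\sinh(\beta\omega_K/2)},\\
\Gamma &= \frac{\beta}{3}\int\frac{d^{3}K}{(2\pi)^{3}}\,|C_K|^{2}K^{2}
  \int_{0}^{\infty}\frac{\cos(\omega_K u)}{\sinh(\beta\omega_K/2)}\,
  \exp\!\left[-\tfrac12K^{2}\left(\frac{u^{2}}{\beta}+\frac{\beta}{4}\right)\right]du
  \;=\; \frac{\beta}{2}\int_{0}^{\infty}\Sigma(u)\,du .
\end{align*}
```

In the last line the integral over all u of e^{−iω_K u} times an even function has been folded onto the positive axis, which produces the cosine and a factor of two.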

This result is what we expected but there are two points to be made. Firstly, we made no assumption that ν was small in solving the Boltzmann equation. Why then do we not find that Γ = Im χ(ν)/ν as a function of frequency instead of a constant, the limit as ν → 0 of Im χ(ν)/ν? The answer does not lie in trying to solve (56) more exactly, for if we take a high-frequency case so that the collision term in (56) is negligible compared to the iν term, in first approximation h = (β/iν) and is indeed constant as we assumed. That is, (58) is a closer approximation to the prediction of (56) the higher the value of ν relative to Γ. The answer is that the original formulation of the Boltzmann equation is faulty at high frequencies. It is assumed that the collisions are made and that between collisions the particle drifts in the electric field. But at higher frequencies new processes are possible in which, for example, the electron absorbs a quantum ν from the field and radiates a phonon. In the quantum theory for higher ν this cannot be analyzed as the succession of two independent events. Therefore, at the higher frequencies we may use our formulas (35). If the results deviate from that of the Boltzmann equation we must conclude the latter is inaccurate.

The second point to discuss is this. We did not solve the Boltzmann equation exactly; presumably, therefore, Γ is not exact. Why then is our result μ from (51) not asymptotically exact as α → 0, in spite of our argument that our formulas should be correct in perturbation theory? The reason is easy to see from (57). For finite ν, as the coupling gets weaker the collision term falls below ν and Γ is in fact exact. Thus, for any ν other than zero in the limit of infinitesimal coupling our result (45) is exact. However, for ν = 0, to get the exact answer no matter how small the coupling, the full Boltzmann equation must be solved and our result for μ, (45), is only an approximation. [Mathematically, the lack of uniform convergence arises when we invert G and expand, because the resistive part of χ(ν) exceeds the leading (reactive) term ν² no matter how small the coupling is if ν = 0.]

Although not exact, our result (45) for μ is still a good approximation to the solution of the Boltzmann equation. The value of μ obtained for Fröhlich's model at low temperature from (51) varies as β⁻¹e^{β}, while from the Boltzmann equation we know that it should vary as a constant times e^{β}.¹⁵ But because of the rapid variation of the exponent these two are hard to distinguish (for example, the temperature at which the mobility reaches a given value is imperceptibly different in the two cases). At higher temperatures our result for μ (46) no longer behaves as e^{β}/β. In this case the values obtained from (46) and the Boltzmann equation would come closer together. (At extremely low temperatures Fröhlich's model, of course, fails. Although acoustic phonons are not very effective, they cannot be disregarded for there are virtually no optical phonons excited.) As a test of our approximate solution of the Boltzmann equation we have also analyzed a system interacting at high temperature with acoustic phonons with |C_K|² proportional to K, ω_K = Kv₀, and β⁻¹ ≫ mv₀². Such a coupling leads to a relaxation time for the electrons which varies inversely with their velocity. In this case μ is proportional to β^{3/2} in either theory, but Eq. (46) gives a result 32/9π or 13% higher than the more accurate solution of (56) given by (59).
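The quoted figure is simple arithmetic and can be confirmed directly. The short Python check below only evaluates the ratio 32/9π stated in the text; the variable names are illustrative.

```python
import math

# Ratio of the two results quoted in the text for the acoustic-phonon test case.
ratio = 32.0 / (9.0 * math.pi)

# Excess of that ratio over unity, expressed in percent.
excess_percent = 100.0 * (ratio - 1.0)
```

The excess rounds to the 13% stated above.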
VII. SUGGESTIONS FOR IMPROVING ACCURACY


The entire dependence of our result (35) for the polaron impedance on the imitating quadratic influence functional Φ₀ is contained in D(u) which is expressed in terms of the model's response function, Y₀(ν). If in the expansion we used a more elaborate (but still quadratic) Φ₀, then all results would be the same but with a D(u) coming from a different Y₀(ν) in (35). What is the best Y₀(ν) to take? We note that Y₀(ν) was also the first approximation to the desired function Y(ν) = [iνz(ν)]⁻¹. A natural suggestion, therefore, is that the best Y₀(ν) is the “true admittance function of the real polaron”. Since the true Y(ν) is unknown, perhaps the next best alternative would be to use a Y₀(ν) in (35) such that the z(ν) itself equals [iνY₀(ν)]⁻¹; that is to use a Y₀(ν) which satisfies (35) self-consistently.

To find a self-consistent Yo(v) is not, in general, easy. 
However, one might use the results calculated here, for 
example, and re-insert them as a new Y₀(ν) in (35) and
recalculate z(ν) to find a second iteration, which might
provide an even better impedance with which to
recalculate again if necessary. Aside from the great
amount of work involved and questions of convergence 
of the procedure, we cannot even be sure if a substantial 
improvement would result; however, the following ob- 
servations do suggest that a self-consistent solution of 
(35) could result in a considerable increase in accuracy. 
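The re-insertion procedure described here has the structure of a fixed-point iteration. The following Python sketch illustrates only that structure, using a scalar toy map in place of the real step of inserting Y₀(ν) into (35) and recomputing z(ν). The helper name `solve_self_consistently`, the mixing parameter and the toy map y → cos(y) are all assumptions made for illustration; in the actual problem the unknown would be a function of frequency and convergence is not assured.

```python
import math

def solve_self_consistently(update, y0, mix=0.5, tol=1e-12, max_iter=500):
    """Damped fixed-point iteration y <- (1 - mix) * y + mix * update(y)."""
    y = y0
    for iteration in range(1, max_iter + 1):
        y_new = update(y)
        if abs(y_new - y) < tol:
            return y_new, iteration
        # Mixing old and new iterates damps oscillations between successive steps.
        y = (1.0 - mix) * y + mix * y_new
    raise RuntimeError("no convergence; convergence is not guaranteed in general")

# Toy stand-in for "insert Y0 into (35) and recompute": the map y -> cos(y).
solution, steps = solve_self_consistently(math.cos, y0=1.0)
```

The damping factor `mix` is a common practical device when plain re-insertion oscillates; whether it would help for the polaron impedance is an open question that this sketch does not address.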

In the variational principle of I, one can try a trial action functional S₀ which is more general than the two-parameter one employed there and which describes an electron coupled to a general linear system,


1 fdX@y?_ 1 
- ( dt ) art J h(t—s)x(t)x(s)dids. (62) 


Then, putting h(t) = ∫h(ν)e^{iνt}dν/2π, one finds that the function h(ν) which gives the lowest energy in the variational principle at zero temperature satisfies the integral equation


2 
h(z) = J [ Re1cet-e*x1—cosun) 
3(2r)3 
Kee Ke“ He dr, (63) 
where 
n= 21 T . (64 
g(r) i (1—cosp Sih) on (64) 


The Eqs. (35) treated as a self-consistent set with D(u) generated from Y(ν) = [ν² + χ(ν)]⁻¹ can be transformed for β = ∞ at least, to exactly this same pair of Eqs. (63) and (64). In this case χ(−iu) replaces h(u). We, therefore, can conclude (for β = ∞) that if one tries as a trial action functional that for an electron coupled to a general linear system, no such system will produce a better result than one which has an impedance which is the self-consistent solution of (35). (Possibly the same is true for arbitrary β, but we have not checked this point.) These considerations substantiate further the interpretation of the expansion (38) in terms of an impedance rather than an admittance.

The above point, and the existence of a minimum 
principle for the Boltzmann equation, suggest that some 
minimum principle exists for the mobility x(% ju) in 
quantum mechanics at arbitrary f. 

Another way to improve accuracy is to try to include the next term (Φ − Φ₀)² in the expansion. The integrals to be performed still are of the form required for the evaluation of the (Φ − Φ₀) term but with more complicated driving forces E and E′. Although the calculation would proceed straightforwardly it would, in general, be very laborious. However, for certain values of ν and β, one could do what amounts to the same thing in a somewhat easier way. For example, for β⁻¹ < ν < 1 the various terms in (50) could be improved by calculating the appropriate cross sections more accurately. The cross section to absorb a quantum from the electric field and emit a single phonon requires matrix elements of quantities like x, exp(iK·X). Equation (50) corresponds to calculating these with the propagator exp(iΦ₀), but an improvement can be made by calculating them with the propagator


[1 + i(Φ − Φ₀)] exp(iΦ₀).


For zero frequency the Boltzmann equation can be used with the rates γ(P′ → P) calculated with the propagator exp(iΦ₀); further improvement would again result by adding a correction for the difference of Φ and Φ₀.
APPENDIX A


In this appendix we discuss a fundamental path integral (or trace) such as (7) for a single one-dimensional oscillator, in terms of which all other path integrals can be immediately evaluated.

The oscillator Hamiltonian is H = p²/2 + ω²q²/2. Let


+00 
U=e| a [etna 


eo 
ee expi (H ‘a(t aay 
* One probably cannot get χ(ν) from (63) and (64) by iteration for one finds χ(−iu) only approximately this way and one cannot pass to an accurate value on the real line from an imperfect knowledge on the negative imaginary axis alone.

and ρ_a = e^{−βH}/Q. The Tr(Uρ_aU′⁻¹) = g may be done in several ways, for example by writing it in terms of path integrals and performing the resulting Gaussian integrals.¹² As an alternative method we choose to represent the trace in free oscillator eigenfunctions. In this representation (ρ_a)_{n,n} = e^{−nβω}(1 − e^{−βω}) and

8= Dimin GmnGmn® e Fe" 1 —eF2), (Al) 


where Gm,n is given in reference (9), Eq. (38) and 
Gm,n' is the same expression with y’ replaced by y. 
By expanding 


expL(x+ a6) (y+ 46") ]= Diet ib) y+ ie) Ht! 


in powers of x and y, one can show that Gn, may also 


be written as 
EN(Zk) rman 


Gin n= Go.o(me int !y71/ be° , (A2 
EN ee a a 
where 
4-00 
ge aye f efty(t)dt. (A3) 


The summation over m and # in our expression for g can 
be done by the binomial theorem if one uses this ex- 
pression for Gm,n; but the one in reference (4), Eq. (38), 
for Gn’, calling t=r-+2, where r is the free index in 
Gm,n', permits summing first over r then over v. Tlie 
final result is 


B= GooGen'ett* 
X exp[(1—e-#4) “H(t —68") GE" —ig'Fe-4) 


Substituting in for Goo and Goo’, we find that 


(A4) 


+00 }00 
= exo # / Ly)" OUEr(s) +7()] 


Seta) Sena: 1—s)tis) (AS) 


where 


$$\gamma(\omega,t-s) = \begin{cases}(1/\omega)\sin\omega(t-s), & t>s\\ 0, & t<s\end{cases} \tag{A6}$$

and

$$a(\omega,t-s) = (1/2\omega)\cos\omega(t-s)\,[1+2P(\beta\omega)]. \tag{A7}$$


If y(ν), y′(ν), γ(ν), a(ν) are the Fourier transforms of y(t), y′(t), etc., so that, for example,


y(t) = | voyerinprr, (A8) 


then the expression for g can be written in Fourier 
space as 


Z +00 
z= exo( — | b-P2vEniGOn Ol 
Xa(op) +iL10)—1)Jo(wo)} ar), (A9) 
$$\gamma(\omega,\nu) = \frac{1}{(\nu-i\epsilon)^2-\omega^2} \tag{A10}$$

and

$$a(\omega,\nu) = \frac{\pi}{2\omega}\,[1+2P(\beta\omega)]\,[\delta(\nu+\omega)+\delta(\nu-\omega)]. \tag{A11}$$


For the case of the Hamiltonian (11), we can represent the particle motion by a path integral on X(t). Then we have in fact a large number of independent oscillators, each coupled to the particle. The K mode of frequency ω_K is coupled by a y(t) = C_K exp[iK·X(t)]. Each of these modes contributes a factor like (A9) to g so that the final exponent is a sum of contributions from each oscillator mode.

We shall need the functions (A10), (A11) and super- 
positions of them (sums for various frequencies) which 
we call Y(v), A(v). We shall also need another function 
D(u) defined as 


Sad 
Dadar f {sin(vu) ¥(v)+[1— cos(vu) JA (v)} dy, 

m (A12) 
and Δ(u) defined as D(u + iβ/2). All these functions are related to Y(ν), in fact to its imaginary part Im Y(ν). We need it only for ν > 0, since Im Y(−ν) = −Im Y(ν). For a single oscillator, from (A5) we have Im γ(ν) = −(π/2ω)[δ(ν−ω) − δ(ν+ω)]. But a(ν) can also be written as

$$a(\nu) = \frac{\pi}{2\omega}\left[\frac{e^{\beta\nu}+1}{e^{\beta\nu}-1}\,\delta(\nu-\omega) + \frac{e^{-\beta\nu}+1}{e^{-\beta\nu}-1}\,\delta(\nu+\omega)\right];$$

hence, a(ν) = −coth(βν/2) Im γ(ν). Further, since the poles of a general Y(ν) lie above the real axis, the real part of Y(ν) can be obtained from the imaginary part. Proceeding in this way, we find the following expressions for all the functions in terms of Im Y(ν):


2 7” ImY(u)udu 


Y@)=- (A13) 
ajo (v—ie)?—p? 
efr— 1 
A(vy) = — ( ) ImY(), (A41) 
ert 
2/7 2(1—cosvu) 
DQi)= =f (1-4) ImY(v)dv, (A15) 
WJo ef — 1 
and 
A(u)= a Eeseney/2) cose) ImY(v)dv. (A116) 


T sinh(6v/2) 


Although derived for a single oscillator, these relations 
are linear and hold for any superposition of oscillators. 
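The thermal factor that links a(ν) to the imaginary part of the response rests on one identity, 1 + 2P(x) = coth(x/2) with P(x) = (eˣ − 1)⁻¹, together with the oddness of coth. The Python sketch below checks both; it assumes that P denotes the Bose occupation factor, and the function names are illustrative.

```python
import math

def bose(x):
    """Occupation factor P(x) = 1 / (e^x - 1), with x = beta * omega."""
    return 1.0 / (math.exp(x) - 1.0)

def coth(x):
    """Hyperbolic cotangent."""
    return math.cosh(x) / math.sinh(x)

def thermal_weight(x):
    """The factor 1 + 2 P(x) multiplying the delta functions in a(omega, nu)."""
    return 1.0 + 2.0 * bose(x)
```

Because coth is odd, the weight coth(βν/2) has opposite signs at ν = ω and ν = −ω, which converts the difference of delta functions in the imaginary part into the sum of delta functions in a(ν).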
APPENDIX B


We give here the details of the calculation of the first 
order correction to G;. As explained in the text, we can 
calculate an expression like (28) by substituting (30) 
and (31) into (20). If we write 


v 


i 
geno] = J rei(ce Vert indee a | 
for the expression (20) calculated for K=0, the result 
(20) with K included as in (30) and (31) is 
R(K,t,8) 


tK? 


=etexp|— f Jere" (Vet idayi 
4 


1K, . + 
ee [treo ee Vet inde ae 
is 


iK, 
7 i (e-#t me )nct Vor iAelar}. (B1) 
T 


We shall ultimately only need the result to the first 
order in en [see (9)], so differentiating R with respect 
to 7 and e¢, putting e=7=0 and calling the result 
2r(K,t,s), we get (there is a 1/3 for averaging over 
directions of K)'# 


i 
7(K,t,s) = {= | voermoas 
2dr 


Kk? 


fe e”*)e-#°V ody 
3(8r’) 


x ie (eist— eis") iA g(u)e 
+(e -oneeCY n+ idan) Ha} 


1K? 
Xeno] — | Jebt—ebe Petia | (B2) 


The first term is G₀(τ−σ) times the path integral with ε = η = 0. Such a term arises no matter what we integrate, so in total it gives G₀(τ−σ)∫e^{i(Φ−Φ₀)}𝒟X𝒟X′. This term just cancels when we remember that we must divide (∂²g/∂η∂ε) by g evaluated at ε = η = 0 for normalization. Therefore, this does not contribute to G₁, the first correction of G from G₀, and we omit it. The other terms are later to be multiplied by a function of t − s = u only, and integrated on t and s. Hence, we let t = u + s and integrate on all s to get [note Y₀(−ν) = Y₀*(ν), A₀(−ν) = A₀(ν)]


K? 
7(K,t—s)= i [2a-cosar) 


7 


C¥o(v) +274 o(v) Je* nas] 
1K? 
x exp] — | a- cos) (Vor ia (B3) 


This will make a contribution to G₁(τ−σ). It is already in the form of a Fourier transform so for the contribution to G₁(ν) we omit the integral on ν and the factor e^{iν(τ−σ)}. According to (29) we must next multiply r(K, t−s) by γ(ω_K,u) + ia(ω_K,u) and integrate on u. This is best done by dividing the range of u from 0 to ∞ and from −∞ to 0, and in the latter putting u → −u so that all integrals are over positive u only. The integral in the exponent is −K²/2 times


i 


[o —cosvu)[Y o(v) +74 o(v) |dv 


v 


= +¢[Yo(u)+Yo(—) J+-2[A0(0)—Ao(z)] (B4) 
[where Y₀(u), A₀(u) are the inverse transforms of Y₀(ν), A₀(ν)]. Expression (B4) is equal to D(u), defined in (35c) for u > 0, and D(−u) for u < 0, since Y₀(u) = 0 for u < 0. Thus, this term in r(K, t−s) contributes a piece


K 2 
5 Vo) V ole) +244 ov) ] 


xf (1—cosvu)[y(wx,u)+y(wx, — 2). 
° 4 Dia(wg,t) Je# du. (BS) 


Adding the three other corresponding pieces from exp{iK·[X′(t) − X(s)]}, etc., multiplying by |C_K|², and integrating over K [see Eq. (30)] gives the first term in (35). The second term is gotten in an analogous way from Φ₀. We need to expand our expression for r(K, t−s) just to first order in K². The terms like e^{−½K²D(u)} are replaced by one. The resulting expression is an integral on u, ∫₀^∞(1 − e^{iνu})S₀(u)du, where


2 coswu' 


So(u) = Ic etm \= C sinwu. 


efu—] 


The integral on u gives Cv?/w(v?—w?), as in the last 
term of (34). 
The Theory of a General Quantum System Interacting with
a Linear Dissipative System 


R. P. Feynman 


California Institute of Technology, Pasadena, California 
AND 


F. L. VERNON, JR.* 


Aerospace Corporation, El Segundo, California 


A formalism has been developed, using Feynman's space-time formulation of nonrelativistic quantum mechanics whereby the behavior of a system of interest, which is coupled to other external quantum systems, may be calculated in terms of its own variables only. It is shown that the effect of the external systems in such a formalism can always be included in a general class of functionals (influence functionals) of the coordinates of the system only. The properties of influence functionals for general systems are examined. Then, specific forms of influence functionals representing the effect of definite and random classical forces, linear dissipative systems at finite temperatures, and combinations of these are analyzed in detail. The linear system analysis is first done for perfectly linear systems composed of combinations of harmonic oscillators, loss being introduced by continuous distributions of oscillators. Then approximately linear systems and restrictions necessary for the linear behavior are considered. Influence functionals for all linear systems are shown to have the same form in terms of their classical response functions. In addition, a fluctuation-dissipation theorem is derived relating temperature and dissipation of the linear system to a fluctuating classical potential acting on the system of interest which reduces to the Nyquist-Johnson relation for noise in the case of electric circuits. Sample calculations of transition probabilities for the spontaneous emission of an atom in free space and in a cavity are made. Finally, a theorem is proved showing that within the requirements of linearity all sources of noise or quantum fluctuation introduced by maser-type amplification devices are accounted for by a classical calculation of the characteristics of the maser.


I. INTRODUCTION 


* This report is based on a portion of a thesis submitted by F. L. Vernon, Jr. in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the California Institute of Technology, 1959.

Fig. 1. General quantum systems Q and X coupled by a potential V(Q, X, t)

Many situations occur in quantum mechanics in which several systems are coupled together but one or more of them are not of primary interest. Problems in the theory of measurement and in statistical mechanics present good examples
of such situations. Suppose, for instance, that the quantum behavior of a system 
is to be investigated when it is coupled to one or more measuring instruments. 
The instruments in themselves are not of primary interest. However, their 
effects are those of perturbing the characteristics of the system being observed. 
A more concrete example is the case of an atom in an excited state which inter- 
acts with the electromagnetic field in a lossy cavity resonator. Because of the 
coupling there will be energy exchange between the field and the atom until 
equilibrium is reached. If, however, the atom were not coupled to any external 
disturbances, it would simply remain unperturbed in its original excited state. 
The cavity field, although not of central interest to us, influences the behavior 
of the atom. 

To make the discussion more definite, let us suppose there are two nonrela- 
tivistic quantum systems whose coordinates are represented in a general way 
by Q and X, as in Fig. 1, coupled together through some interaction potential 
which is a function of the parameters of the two systems. It is desired to compute 
the expectation value of an observable which is a function of the Q variables 
only. As is well known, the complete problem can be analyzed by taking the 
Hamiltonian of the complete system, forming the wave equation as follows: 


$$[H(Q) + H(X) + V(Q,X)]\,\psi(Q,X) = -\frac{\hbar}{i}\,\frac{\partial}{\partial t}\,\psi(Q,X),$$


and then finding its solution. In general, this is an extremely difficult problem. 
In addition, when this approach is used, it is not easy to see how to eliminate the 
coordinates of X and include its effect in an equivalent way when making computations
on Q. A satisfactory method of formulating such problems as this in a
general way was made available by the introduction of the Lagrangian formulation
of quantum mechanics by Feynman. He applied the techniques afforded
by this method extensively to studies in quantum electrodynamics. Thus, in a 
problem where several charged particles interact through the electromagnetic 
field, he found that it was possible to eliminate the coordinates of the field and 
recast the problem in terms of the coordinates of the particles alone. The effect of 
the field was included as a delayed interaction between the particles (1, 2). 
The central problem of this study is to develop a general formalism for finding 
all of the quantum effects of an environmental system (the interaction system) 
upon a system of interest (the test system), to investigate the properties of this 
formalism, and to draw conclusions about the quantum effects of specific
interaction systems on the test system. Cases where the interaction system is
composed of various combinations of linear systems and classical forces will be 
considered in detail. For the case in which the interaction system is linear, it 
will be found that parameters such as impedance, which characterize its classical 
behavior, are also important in determining its quantum effect on the observed 
system. Since this linear system may include dissipation, the results have ap- 
plication in a study of irreversible statistical mechanics. 

In Section II, after a brief discussion of the Lagrangian formulation of quantum 
mechanics, a general formulation of the problem is made and certain functionals, 
called influence functionals, will be defined, which contain the effect of the in- 
teraction system (such as system X in Fig. 1) on the test system in terms of the 
coordinates of the test system only. General properties of these functionals will 
be derived and their relationship to statistical mechanics will be discussed. To 
obtain more specific information about the properties of the formalism, we then 
specialize the discussion to cases where well-defined systems are involved. In 
Section III, the special cases are considered in which the interaction system is a 
definite classical force and a random classical force. In Section IV, the influence 
functionals for exactly linear systems at zero temperatures are derived and then 
extended to the case that the linear systems are driven by classical forces. In 
addition, the effect of finite temperatures of linear systems is considered. Then, 
in Section V, the unobserved systems are again assumed to be general but weakly 
coupled to the observed system. Within the approximation of weak coupling 
these general systems also behave as if they were linear. Finally, in Section
VI, the results of the analysis are used to prove a general result concerning maser
noise.

It is to be emphasized that although we shall talk of general test and interac- 
tion systems, the Lagrangian formulation is restricted to cases involving mo- 
mentum or coordinate operators. Therefore, strictly speaking, systems in which 
the spin is of importance are not covered by this analysis. However, this has no 
bearing on the results since their nature is such that their extension to the case 
where spins are important can be inferred. 

An equivalent approach can be made to the problem using the Hamiltonian 
formulation of quantum mechanics by making use of the ordered operator calcu- 
lus developed by Feynman (3). This approach has been used to some extent by 
Fano (4) and has been developed further by Hellwarth (6).$^{1}$ Some advantages
of this method are that many results may be obtained more simply than by the
Lagrangian method and nonclassical concepts such as spin enter the formalism
naturally. However, the physical significance of the functions being dealt with
is often clearer in the Lagrangian method.


1 Many of the results obtained in this work have also been obtained by him using ordered
operator techniques.

II. GENERAL FORMULATION—INFLUENCE FUNCTIONAL

A. LAGRANGIAN FORMULATION OF QUANTUM MECHANICS


We shall begin the discussion with a brief introduction to the Lagrangian or 
space-time approach to quantum mechanics and the formal way in which one 
may set up problems of many variables.$^{2}$ Let us suppose that we are considering
a single system which has coordinates that are denoted by Q, and that for the
time being it is not acted on by any other quantum system. It can be acted on
by outside forces, however. The system may be very complicated, in which case
Q represents all the coordinates in a very general way. If at a time $t$ the variable
Q is denoted by $Q_t$, then the amplitude for the system to go from position
$Q_\tau$ at $t = \tau$ to $Q_T$ at $t = T$ is given by$^{3}$


\[
K(Q_T, T; Q_\tau, \tau) = \int \exp[(i/\hbar)\, S(Q)]\, \mathcal{D}Q(t) \tag{2.1}
\]


an integral which represents the sum over all possible paths $Q(t)$ in coordinate
space from $Q_\tau$ to $Q_T$ of the functional $\exp[(i/\hbar)\, S(Q)]$. $S(Q) = \int_\tau^T L(\dot{Q}, Q, t)\, dt$
is the action calculated classically from the Lagrangian for the trajectory $Q(t)$.
For the case that Q is a single linear coordinate of position, this is represented in
the diagram in Fig. 2. The magnitude of the amplitude for all paths is equal but
the phase for each path is given by the classical action along that path in units
of $\hbar$. Thus, amplitudes for neighboring paths which have large phases tend to
cancel. The paths which contribute the greatest amount are those whose ampli- 
tudes have stationary phases for small deviations around a certain path. This is 
the path for which the classical action is at an extremum and is, therefore, the 
classical path. Remarkably enough, for free particles and harmonic oscillators, 
the result of the path integration is 


\[
K(Q_T, Q_\tau) = (\text{Smooth Function})\, \exp[(i/\hbar)\, S_{cl}]
\]


where $S_{cl}$ is the action evaluated along the classical path between the two end
points $Q_\tau$, $Q_T$. However, for more complicated systems this simple relation does
not hold. A discussion of the methods of doing integrals of this type is not in- 
cluded here since methods appropriate for the purposes here are already con- 
tained in the literature (1, 2). 
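
The statement that the harmonic-oscillator kernel carries the phase $\exp[(i/\hbar) S_{cl}]$ can be checked numerically. The sketch below is an illustration only, not part of the original analysis: it assumes a single oscillator of mass `m` and frequency `omega` on the interval $0 \le t \le T$ (that is, $\tau = 0$), integrates the Lagrangian along the classical path, and compares the result with the standard closed-form classical action. All function names and parameter values are hypothetical.

```python
import cmath
import math


def classical_path(q_a, q_b, omega, T, t):
    """Classical oscillator trajectory with q(0) = q_a and q(T) = q_b."""
    s = math.sin(omega * T)
    return (q_a * math.sin(omega * (T - t)) + q_b * math.sin(omega * t)) / s


def classical_velocity(q_a, q_b, omega, T, t):
    """Time derivative of classical_path."""
    s = math.sin(omega * T)
    return omega * (q_b * math.cos(omega * t) - q_a * math.cos(omega * (T - t))) / s


def action_numeric(q_a, q_b, m, omega, T, n=20000):
    """Midpoint-rule integral of L = m v^2 / 2 - m omega^2 q^2 / 2 along the classical path."""
    h = T / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        q = classical_path(q_a, q_b, omega, T, t)
        v = classical_velocity(q_a, q_b, omega, T, t)
        total += 0.5 * m * v * v - 0.5 * m * omega ** 2 * q * q
    return total * h


def action_closed_form(q_a, q_b, m, omega, T):
    """Standard closed form of the classical action for the harmonic oscillator."""
    s, c = math.sin(omega * T), math.cos(omega * T)
    return m * omega / (2.0 * s) * ((q_a ** 2 + q_b ** 2) * c - 2.0 * q_a * q_b)


def kernel_phase_factor(q_a, q_b, m, omega, T, hbar=1.0):
    """Endpoint-dependent factor exp(i S_cl / hbar) of the kernel."""
    return cmath.exp(1j * action_closed_form(q_a, q_b, m, omega, T) / hbar)
```

The smooth prefactor of the kernel is deliberately left out; only the phase factor discussed in the text is modeled.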

Since $K(Q_T, Q_\tau)$ is the amplitude to go from coordinate $Q_\tau$ to $Q_T$, it follows
that at $t = T$ the amplitude that the system is in a state designated by $\phi_m(Q_T)$
when initially in a state $\phi_n(Q_\tau)$ is given by


\[
\begin{aligned}
A_{mn} &= \int \phi_m^*(Q_T)\, K(Q_T, Q_\tau)\, \phi_n(Q_\tau)\, dQ_\tau\, dQ_T \\
&= \int \phi_m^*(Q_T)\, \exp[(i/\hbar)\, S(Q)]\, \phi_n(Q_\tau)\, \mathcal{D}Q(t)\, dQ_\tau\, dQ_T
\end{aligned}
\tag{2.2}
\]


2 For a more complete treatment, see ref. 1.
3 In subsequent equations $K(Q_T, T; Q_\tau, \tau)$ will be written $K(Q_T, Q_\tau)$.


FIG. 2. Space-time diagram showing possible paths for particle to proceed from $Q_\tau$ to $Q_T$


The probability of the transition from $n \to m$ is given by $|A_{mn}|^2$ and from Eq.
(2.2) this can be written in the form of multiple integrals as follows:


\[
\begin{aligned}
P_{nm} = \int &\phi_m^*(Q_T)\, \phi_m(Q'_T)\, \exp\{(i/\hbar)[S(Q) - S(Q')]\} \\
&\times \phi_n(Q_\tau)\, \phi_n^*(Q'_\tau)\, \mathcal{D}Q(t)\, \mathcal{D}Q'(t)\, dQ_\tau\, dQ'_\tau\, dQ_T\, dQ'_T
\end{aligned}
\tag{2.3}
\]


As an example of a more complicated case let us consider two systems whose
coordinates are Q and X.$^{4}$ The systems are coupled by a potential which can be
designated as $V(Q, X)$ and incorporated in the total Lagrangian. We assume
that when $V = 0$ the states of Q and X can be described by sets of wave
functions $\phi_k(Q)$ and $\chi_p(X)$ respectively. If, initially, Q is in a state $\phi_n(Q_\tau)$ and X is
in a state $\chi_i(X_\tau)$, then the amplitude that Q goes from state $n$ to $m$ while X
goes from state $i$ to $f$ can be formed in a similar way to that of Eq. (2.2),


\[
\begin{aligned}
A_{mf,ni} = \int &\phi_m^*(Q_T)\, \chi_f^*(X_T)\, \exp[(i/\hbar)\, S(Q, X)]\, \phi_n(Q_\tau)\, \chi_i(X_\tau) \\
&\times \mathcal{D}Q(t)\, \mathcal{D}X(t)\, dQ_\tau\, dX_\tau\, dQ_T\, dX_T
\end{aligned}
\tag{2.4}
\]


where $S(Q, X)$ represents the classical action of the entire system including
both Q and X. The important property of separability afforded by writing the
amplitude in this way is now apparent.$^{5}$ For instance, if one wishes to know the
effect that X has on Q when X undergoes a transition from state $i$ to $f$, then all
of the integrals on the X variables may be done first. What is left is an expression
for $A_{mn}$ for Q in terms of Q variables only but with the effect of X included.
The extension to writing transition amplitudes for large numbers of systems is
obvious. In principle the order in which the variables are eliminated is always
arbitrary.


4 Each system will be denoted by the coordinates that characterize it. Where Q or X
means specifically a coordinate, it will be so designated by a statement if it is not obvious.

5 If system Q represents a harmonic oscillator and the interaction of Q with X were linear
and of the form $-\gamma(t, X)Q(t)$, then that part of Eq. (2.4) which involves the Q variables
corresponds to the function $G_{mn}$ defined and used by Feynman to eliminate the
electromagnetic field oscillators. See ref. 2.


B. DEFINITION OF INFLUENCE FUNCTIONAL


A functional can now be defined which can be used to describe mathematically 
the effect of external quantum systems upon the behavior of a quantum system 
of interest.$^{6}$

The fundamental theorem for this work may be stated as follows: For any
system, Q, acted on by external classical forces and quantum mechanical systems
as discussed above, the probability that it makes a transition from state $\psi_n(Q_\tau)$
at $t = \tau$ to $\psi_m(Q_T)$ at $t = T$ can be written


\[
\begin{aligned}
P_{nm} = \int &\psi_m^*(Q_T)\, \psi_m(Q'_T)\, \exp\{(i/\hbar)[S_0(Q) - S_0(Q')]\}\, \mathcal{F}(Q, Q') \\
&\times \psi_n^*(Q'_\tau)\, \psi_n(Q_\tau)\, \mathcal{D}Q(t)\, \mathcal{D}Q'(t)\, dQ_\tau\, dQ'_\tau\, dQ_T\, dQ'_T
\end{aligned}
\tag{2.5}
\]


where $\mathcal{F}(Q, Q')$ contains all the effects of the external influences on Q, and
$S_0(Q) = \int_\tau^T L(\dot{Q}, Q, t)\, dt$, the action of Q without external disturbance. The proof
of this is straightforward. Let us examine two coupled systems characterized by
coordinates Q and X as represented diagrammatically in Fig. 1. Q will represent
the test system and X the quite general interaction system (excepting only the
effects of spin), coupled by a general potential $V(Q, X, t)$ to Q. Assume Q to be
initially ($t = \tau$) in state $\psi_n(Q_\tau)$ and X to be in state $\chi_i(X_\tau)$, a product state.
The probability that Q is found in state $\psi_m(Q_T)$ while X is in state $\chi_f(X_T)$ at
$t = T$ can be written in the manner discussed above and is


\[
\begin{aligned}
P_{mf,ni} &= |A_{mf,ni}|^2 \\
&= \int \psi_m^*(Q_T)\, \psi_m(Q'_T)\, \chi_f^*(X_T)\, \chi_f(X'_T) \\
&\quad \times \exp\{(i/\hbar)[S_0(Q) - S_0(Q') + S(X) - S(X') + S_I(Q, X) - S_I(Q', X')]\} \\
&\quad \times \psi_n^*(Q'_\tau)\, \psi_n(Q_\tau)\, \chi_i^*(X'_\tau)\, \chi_i(X_\tau)\, dX_\tau \cdots dQ'_T\, \mathcal{D}X'(t) \cdots \mathcal{D}Q(t)
\end{aligned}
\tag{2.6}
\]


6 Hereafter, the system of interest will be referred to as the test system. Conversely, the
system not of primary interest will be called the interaction or environmental system.

The primed variables were introduced when the integrals for each $A_{mf,ni}$ were
combined. Now if all of Eq. (2.6) which involves coordinates other than Q or
Q' is separated out and designated as $\mathcal{F}(Q, Q')$, then the following expression is
obtained


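\[
\begin{aligned}
\mathcal{F}(Q, Q') = \int &\chi_f^*(X_T)\, \chi_f(X'_T) \\
&\times \exp\{(i/\hbar)[S(X) - S(X') + S_I(Q, X) - S_I(Q', X')]\} \\
&\times \chi_i^*(X'_\tau)\, \chi_i(X_\tau)\, dX_\tau \cdots dX'_T\, \mathcal{D}X(t)\, \mathcal{D}X'(t)
\end{aligned}
\tag{2.7}
\]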


Incorporation of this expression into Eq. (2.6) yields the desired form of Eq. 
(2.5). If the path integrals are written in terms of kernels, Eq. (2.7) becomes 


\[
\begin{aligned}
\mathcal{F}(Q, Q') = \int &\chi_f^*(X_T)\, \chi_f(X'_T)\, K_Q(X_T, X_\tau)\, K_{Q'}^*(X'_T, X'_\tau) \\
&\times \chi_i^*(X'_\tau)\, \chi_i(X_\tau)\, dX_\tau \cdots dX'_T
\end{aligned}
\tag{2.8}
\]


where the subscript Q means that the kernel includes the effect of a potential
$V(Q, X)$ acting on X during the interval $T > t > \tau$. As can be seen, $\mathcal{F}$ is a
functional whose form depends upon the physical system X, the initial and final
states of X, and the coupling between Q and X.

It is to be emphasized that the formulation of $\mathcal{F}$ is such that it includes all the
effects of the interaction system in influencing the behavior of the test system.
Thus, if there are two systems A and B which can act on Q, and if


\[
\mathcal{F}_{A \text{ on } Q} = \mathcal{F}_{B \text{ on } Q},
\]


then the effects of A on Q are the same as those of B on Q. It follows that if
simplifying assumptions are necessary in finding $\mathcal{F}_{A \text{ on } Q}$ and $\mathcal{F}_{B \text{ on } Q}$ (due to the
complicated nature of A and B) and if the resulting functions are equal, then
within the approximations the effects of A and B on the test system are the same.
In the situation where the interaction system is composed of a linear system or
combinations of linear systems we shall see that the same form of $\mathcal{F}$ is always
appropriate. To adapt this general form of $\mathcal{F}$ to a particular linear system it is
only necessary to know such quantities as impedance and temperature which
determine its classical behavior. In still other situations, very weak coupling
between systems is involved. The approximate $\mathcal{F}$ which can be used in this case to
represent the effect of the interaction systems has a form which is independent
of the nature of the interaction system. This form is the same as for linear
systems. These cases will be considered in more detail in later sections.


C. GENERAL PROPERTIES OF INFLUENCE FUNCTIONALS


There are several general properties of influence functionals which are of 
interest and which will be useful in subsequent arguments. The first three of 
these (1, 2, 3) follow directly from the definition of F(Q, Q’). The last two (4, 5) 
will require more discussion.

1. If the physical situation is unsure (as for instance if the type of interaction
system X, or the initial or final states are not known precisely) but if the
probability of the $p$th situation is $w_p$ and the corresponding influence functional is
$\mathcal{F}_p$, then the effective $\mathcal{F}$ is given by


\[
\mathcal{F}_{\mathrm{eff}} = \sum_p w_p\, \mathcal{F}_p = \langle \mathcal{F} \rangle \tag{2.9}
\]


Thus, in Eq. (2.6) if the initial state of X were not certain but the probability
of each initial state were $w_i$, then $P_{nm}$ for system Q would be given by
$\sum_i w_i P_{mf,ni}$. Since the summation involves only the part of Eq. (2.6) involving
the X variables, it is a sum over the influence functionals for each possible initial
state and results in an average influence functional of the type given above.

2. If a number of statistically and dynamically independent partial systems
act on Q at the same time and if $\mathcal{F}^{(k)}$ is the influence of the $k$th system alone,
the total influence of all is given by the product of the individual $\mathcal{F}^{(k)}$:


\[
\mathcal{F} = \prod_{k=1}^{N} \mathcal{F}^{(k)} \tag{2.10}
\]


Again referring to Eq. (2.6), if there were N subsystems interacting with Q,
then the probability that Q makes a transition from state $n$ to $m$ while each of
the subsystems makes a transition from its initial to its final state is given by
an expression of the same form as Eq. (2.6). The difference in this case is
that the term involving the X variables would be replaced by a product of N
similar terms, one for each subsystem. Thus, when the term involving all the
X variables is separated out the complete influence functional is recognized
as a product of the functionals $\mathcal{F}^{(k)}(Q, Q')$ for each subsystem.

DEFINITION: In many cases it will be convenient to write $\mathcal{F}$ in the
form $\exp[i\Phi(Q, Q')]$. $\Phi$ is then called the influence phase. For independent
disturbances as considered in 2, the influence phases add. In the event that $i\Phi(Q, Q')$
is a real number we will continue to use the notation $\Phi$; the phase simply becomes
imaginary. It will frequently be more convenient to work with $\Phi$ rather than $\mathcal{F}$.

3. The influence functional has the property that


\[
\mathcal{F}^*(Q, Q') = \mathcal{F}(Q', Q) \tag{2.11}
\]


Referring to Eq. (2.7), the definition of the influence functional, this fact follows 
immediately upon interchanging Q and Q’. 

4. In the class of problems in which the final state of the interaction system
is arbitrary, which means the final states are to be summed over, then $\mathcal{F}(Q, Q')$
is independent of $Q(t)$ if $Q(t) = Q'(t)$ for all $t$. All of the problems we will be
concerned with here are of this type.

The validity of this statement can be ascertained by observing Eq. (2.7), the
general definition of the influence functional. In particular, for the case where the
initial and final states of the interaction system X are $i$ and $f$ respectively, as in
Eq. (2.7), we denote the influence functional by ${}_f\mathcal{F}_i(Q, Q')$. Let us assume we
have no interest in the final state of X which means that ${}_f\mathcal{F}_i(Q, Q')$ must be
summed over all such states. The initial state $i$ can be quite general. Thus, the
influence functional for the case of an arbitrary final state is


\[
\mathcal{F}_i(Q, Q') = \sum_f {}_f\mathcal{F}_i(Q, Q')
\]


For clarity in finding the result of letting $Q(t) = Q'(t)$ for all $t$ in $\mathcal{F}_i(Q, Q')$, we
will write out the expression explicitly from Eq. (2.7). It is


\[
\begin{aligned}
\mathcal{F}_i(Q, Q) = \int \sum_f &\chi_f^*(X_T)\, \chi_f(X'_T) \\
&\times \exp\{(i/\hbar)[S(X) - S(X') + S_I(Q, X) - S_I(Q, X')]\} \\
&\times \chi_i^*(X'_\tau)\, \chi_i(X_\tau)\, dX_\tau \cdots \mathcal{D}X'(t)
\end{aligned}
\]


Since Q appears in the interaction potentials acting on the X and X' variables
respectively, it loses its identity as the coordinate of a quantum system and
becomes just a number (which may be, of course, a function of time). Thus
$S_I(Q, X)$ may be interpreted as the action of an external potential which drives
the X system. The above expression then represents the probability that X,
which is in state $i$ initially, is finally in any one of its possible states after being
acted on by an external potential (as, for instance, in Eq. (2.3) summed over the
final states, $m$). This result is unity. We have then that $\mathcal{F}(Q, Q) = 1$ and is
independent of $Q(t)$.

5. A more restrictive statement of the property in the above paragraph (4)
can be made. In this same class of problems in which the final states are summed
over, if $Q(t) = Q'(t)$ for all $t > t_0$ then $\mathcal{F}(Q, Q')$ is independent of $Q(t)$ for $t > t_0$.
To see this we write down the influence functional from Eq. (2.8) breaking up
the time interval into two parts, before and after $t_0$. Setting $Q = Q'$ for $t > t_0$ and
utilizing the closure relation for the sum over final states we have,


\[
\begin{aligned}
\mathcal{F}_i(Q, Q') = \int &\delta(X_T - X'_T)\, K_Q(X_T, X_{t_0})\, K_{Q'}^*(X'_T, X'_{t_0}) \\
&\times K_Q(X_{t_0}, X_\tau)\, K_{Q'}^*(X'_{t_0}, X'_\tau)\, \chi_i^*(X'_\tau)\, \chi_i(X_\tau)\, dX_\tau \cdots dX'_T
\end{aligned}
\]

Examining the parts of the above integral which contain the effects of $t > t_0$:

\[
\begin{aligned}
\int \delta(X_T - X'_T)\, K_Q(X_T, X_{t_0})\, K_Q^*(X'_T, X'_{t_0})\, dX_T\, dX'_T
&= \int K_Q(X'_{t_0}, X_T)\, K_Q(X_T, X_{t_0})\, dX_T \\
&= K(X'_{t_0}, X_{t_0}) = \delta(X_{t_0} - X'_{t_0})
\end{aligned}
\]

The expression for $\mathcal{F}_i(Q, Q')$ becomes then

\[
\begin{aligned}
\mathcal{F}_i(Q, Q') = \int &\delta(X_{t_0} - X'_{t_0})\, K_Q(X_{t_0}, X_\tau)\, K_{Q'}^*(X'_{t_0}, X'_\tau) \\
&\times \chi_i^*(X'_\tau)\, \chi_i(X_\tau)\, dX_\tau \cdots dX'_{t_0}
\end{aligned}
\]

which is independent of $Q(t)$ for $t > t_0$. As will be seen later in the specific case
of linear systems, this leads to a statement of causality.
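
Properties 3, 4, and 5 can be illustrated numerically in operator language. When the final states are summed over, Eq. (2.8) reduces to the overlap of the environment state propagated under the path $Q$ with the same initial state propagated under $Q'$. The sketch below is an assumption-laden toy, not the method of the text: it replaces the environment by a hypothetical two-level system with Hamiltonian $(\omega/2)\sigma_z + g\,Q(t)\,\sigma_x$ (units with $\hbar = 1$), which lies outside the coordinate-based Lagrangian formulation but obeys the same unitary-evolution identities. All names and parameter values are illustrative.

```python
import math


def step_propagator(omega, g, q, dt):
    """Exact 2x2 propagator exp(-i H dt) for H = (omega/2) sigma_z + g q sigma_x."""
    az, ax = 0.5 * omega, g * q
    norm = math.hypot(az, ax)
    if norm == 0.0:
        return [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    c, s = math.cos(norm * dt), math.sin(norm * dt)
    nz, nx = az / norm, ax / norm
    return [[c - 1j * s * nz, -1j * s * nx],
            [-1j * s * nx, c + 1j * s * nz]]


def evolve(state, path, omega=1.0, g=0.8, dt=0.01):
    """Propagate the environment state while the test coordinate follows `path`."""
    a, b = state
    for q in path:
        u = step_propagator(omega, g, q, dt)
        a, b = u[0][0] * a + u[0][1] * b, u[1][0] * a + u[1][1] * b
    return a, b


def influence_functional(path_q, path_qp, initial=(1.0 + 0j, 0j)):
    """Sum over final states f of <f|U_Q|i> <f|U_Q'|i>^*."""
    psi = evolve(initial, path_q)
    psi_p = evolve(initial, path_qp)
    return sum(x * y.conjugate() for x, y in zip(psi, psi_p))


# Two different early histories and two alternative shared late histories.
N_STEPS, DT = 200, 0.01
head_q = [math.sin(3.0 * k * DT) for k in range(N_STEPS // 2)]
head_qp = [math.cos(2.0 * k * DT) for k in range(N_STEPS // 2)]
tail_a = [(k * DT) ** 2 for k in range(N_STEPS // 2, N_STEPS)]
tail_b = [-1.5 * k * DT for k in range(N_STEPS // 2, N_STEPS)]
```

With these paths, `influence_functional(head_q + tail_a, head_qp + tail_a)` and the same call with `tail_b` agree to rounding error, which is the content of property 5: once the two paths coincide, the later history drops out.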


D. STATISTICAL MECHANICS 


Finally it is appropriate to point out explicitly the significance of the influence 
functional in a study of quantum statistical mechanics. In the class of problems 
considered here we are only interested in making measurements on the test 
system and not on the interacting system. Thus, when the expectation value of an 
operator which acts only on the test system variables is taken, the final states of 
the interaction system must be summed over. It is equivalent to taking the ex- 
pectation value of the desired operator in the test system and simultaneously 
the unit operator in the interaction system. Therefore, only the influence func- 
tional where the final states of the interaction system are summed over will be 
of interest to us. 

Starting with the coordinate representation of the density matrix (6) for the
test and interaction systems, $\rho(Q, X; Q', X')$, we will show the part played by
the influence functional in obtaining an expression for $\rho(Q_T, Q'_T)$, that is, with
the X coordinates eliminated, in terms of its value at an earlier time $\tau$, $\rho(Q_\tau, Q'_\tau)$.
First, we recall that the definition of $\rho$ is as follows:


\[
\rho(Q, X; Q', X') = \langle \psi(Q, X)\, \psi^*(Q', X') \rangle_{av} \tag{2.12}
\]


where $\psi(Q, X)$ represents the wave function for one of the systems in an ensemble
of systems each representing one of the possible states of the Q, X system (7).
The average, represented by $\langle\ \rangle_{av}$, is taken over the ensemble. The trace of the
density matrix is


\[
\operatorname{Tr} \rho(Q, X; Q', X') = \iint \rho(Q, X; Q, X)\, dQ\, dX \tag{2.13}
\]


and the expectation value of an operator A which operates on the Q variables 
only is 


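\[
\langle A \rangle = \iiint \rho(Q, X; Q', X)\, A(Q, Q')\, dQ\, dQ'\, dX \tag{2.14}
\]

In the above

\[
\begin{aligned}
A(Q', Q) &= \sum_{i,j} A_{ij}\, \phi_j^*(Q')\, \phi_i(Q), \\
A_{ij} &= \int \phi_i^*(Q)\, A\, \phi_j(Q)\, dQ,
\end{aligned}
\tag{2.15}
\]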


and $\phi_i(Q)$ is one of a set of complete orthonormal eigenfunctions. From Eq.
(2.14) then we see that the formal expression which we wish to derive is


\[
\int \rho(Q_T, X_T; Q'_T, X_T)\, dX_T = \rho(Q_T, Q'_T) \tag{2.16}
\]

in terms of $\rho(Q_\tau, Q'_\tau)$. From the rules given in Section II.A for propagation of
a wave function with time we can easily find $\rho_T$ in terms of $\rho_\tau$. Thus

\[
\begin{aligned}
\rho(Q_T, X_T; Q'_T, X'_T) = \int &\exp\{(i/\hbar)[S_0(Q) - S_0(Q') \\
&\quad + S(X) - S(X') + S_I(Q, X) - S_I(Q', X')]\} \\
&\times \rho(Q_\tau, X_\tau; Q'_\tau, X'_\tau)\, \mathcal{D}Q(t) \cdots dX'_\tau
\end{aligned}
\tag{2.17}
\]


Now, for simplicity let us assume that initially the two systems are independent 
so that 


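\[
\rho(Q_\tau, X_\tau; Q'_\tau, X'_\tau) = \rho(Q_\tau, Q'_\tau)\, \rho(X_\tau, X'_\tau)
\]

Then eliminating the $X_T$ coordinate as indicated in Eq. (2.16) we have

\[
\begin{aligned}
\rho(Q_T, Q'_T) = \int \Big\{ \int &\delta(X_T - X'_T)\, \exp\{(i/\hbar)[S(X) - S(X') \\
&\quad + S_I(Q, X) - S_I(Q', X')]\}\, \rho(X_\tau, X'_\tau)\, \mathcal{D}X(t) \cdots dX'_\tau \Big\} \\
&\times \exp\{(i/\hbar)[S_0(Q) - S_0(Q')]\}\, \rho(Q_\tau, Q'_\tau)\, \mathcal{D}Q(t) \cdots dQ'_\tau
\end{aligned}
\]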


The expression inside the braces is identified as $\mathcal{F}(Q, Q')$ for the case in which
the final state of X is summed over. Therefore, the following result is obtained:


\[
\rho(Q_T, Q'_T) = \int \mathcal{F}(Q, Q')\, \exp\{(i/\hbar)[S_0(Q) - S_0(Q')]\}\, \rho(Q_\tau, Q'_\tau)\, \mathcal{D}Q(t) \cdots dQ'_\tau \tag{2.18}
\]


Thus, if the density matrix of the test system Q is represented by $\rho(Q_\tau, Q'_\tau)$
at some initial instant $\tau$, the density matrix $\rho(Q_T, Q'_T)$ at some later time $T$
is given by Eq. (2.18). The entire influence of the interaction system is contained
in $\mathcal{F}(Q, Q')$.


E. USE OF INFLUENCE FUNCTIONALS


At this point we need to consider how influence functionals can be used in
the analysis of a problem. For clarity the discussion will be specialized to a
particular problem but the principle is valid more generally. Suppose we wish to
know the probability that a test system Q makes a transition from an initial
state $\phi_n(Q_\tau) \exp[-(i/\hbar) E_n \tau]$ to a final state $\phi_m(Q_T) \exp[-(i/\hbar) E_m T]$ when
coupled to an interaction system. The formal expression for this probability is,
from Eq. (2.5),


\[
\begin{aligned}
P_{nm} = \int &\phi_m^*(Q_T)\, \phi_m(Q'_T)\, \exp\{(i/\hbar)[S_0(Q) - S_0(Q')]\}\, \mathcal{F}(Q, Q') \\
&\times \phi_n^*(Q'_\tau)\, \phi_n(Q_\tau)\, dQ_\tau \cdots \mathcal{D}Q'(t)
\end{aligned}
\tag{2.19}
\]


This is formally exact but except in special cases it cannot be evaluated exactly.
Furthermore, to obtain any specific answers to the problem the characteristics
of Q must be known as well as the influence functional. However, by
using perturbation theory we may find general expressions for transition
probabilities to as many orders as desired. For example, if the interaction system is a
linear system at zero temperature, we will find that $\mathcal{F}(Q, Q')$ is of the
form $\exp[i\Phi_0(Q, Q')]$. The perturbation expansion is obtained by writing
$\exp[i\Phi_0(Q, Q')]$ in terms of a power series and evaluating the path integral
corresponding to each term in the expansion. In many cases the coupling between
Q and the interaction system is small enough that only a few terms in the
expansion are necessary. In Appendix I the basic procedure for finding the perturbation
expansion is demonstrated by finding the specific expression up to second order
in the potentials involved for transition probability of a test system when acted
on by a linear interaction system at zero temperature. Calculation of transition
probabilities represents only one piece of information that one might desire to
know about a test system. For instance, it is more usually desired to find the
expectation value of an operator in the test system. To calculate this one needs
to know the density matrix describing the test system when it is coupled to an
interaction system. The exact expression for the required density matrix is
given in Section II.D. Again in the general case, one runs into the difficulty of
making an exact calculation and is forced to make calculations using
perturbation theory. The same procedure of expanding the influence functional into a
power series and performing the required path integrations yields useful
perturbation expressions.
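
The power-series step described above can be written out schematically. The following is a sketch of the bookkeeping only, assuming that the series may be integrated term by term; the evaluation of the individual path integrals is not attempted here.

```latex
\begin{align}
e^{i\Phi_0(Q,Q')} &= \sum_{k=0}^{\infty} \frac{i^k}{k!}\,[\Phi_0(Q,Q')]^k
  = 1 + i\Phi_0(Q,Q') - \tfrac{1}{2}[\Phi_0(Q,Q')]^2 + \cdots, \\
P_{nm} &= \sum_{k=0}^{\infty} \frac{i^k}{k!} \int \phi_m^*(Q_T)\,\phi_m(Q'_T)\,
  e^{(i/\hbar)[S_0(Q)-S_0(Q')]}\,[\Phi_0(Q,Q')]^k\,
  \phi_n^*(Q'_\tau)\,\phi_n(Q_\tau)\,dQ_\tau \cdots \mathcal{D}Q'(t).
\end{align}
```

The $k = 0$ term is the transition probability of the uncoupled test system; the $k \geq 1$ terms carry the effect of the interaction system to successively higher order in the coupling.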


III. INFLUENCE FUNCTIONALS FOR CLASSICAL POTENTIALS 


In this section we will derive specific forms and properties of influence func- 
tionals for the effects of classical potentials on the test system. These represent 
the simplest form of influence functionals and their properties follow directly 
from the general properties obtained in the previous section. These forms will 
then be extended to the case where the classical potential represents Brownian 
noise. 


A. PROPERTIES OF INFLUENCE FUNCTIONALS FOR CLASSICAL POTENTIALS


The first step is to find the influence functional for a definite classical potential
acting on the test system, Q. If the potential energy term in the Lagrangian is
of the form $V(Q, t)$, then it can be ascertained readily by referring to the
fundamental definition of $\mathcal{F}(Q, Q')$ that


\[
\mathcal{F}(Q, Q') = \exp\Big\{-(i/\hbar) \int_\tau^T [V(Q, t) - V(Q', t)]\, dt\Big\} \tag{3.1}
\]

or equivalently the influence phase is

\[
\Phi(Q, Q') = -(1/\hbar) \int_\tau^T [V(Q, t) - V(Q', t)]\, dt \tag{3.2}
\]

The next degree of complication is to have several potentials, $\sum_k V_k(Q, t)$,
acting on Q simultaneously. However, since the sum of all these potentials
represents an equivalent potential, say $V(Q, t) = \sum_k V_k(Q, t)$, then it is obvious that
the total influence functional $\mathcal{F}(Q, Q')$ is the product of the individual $\mathcal{F}_k(Q, Q')$.
More specifically,


\[
\begin{aligned}
\mathcal{F}(Q, Q') &= \exp\Big\{-(i/\hbar) \int_\tau^T [V(Q, t) - V(Q', t)]\, dt\Big\} \\
&= \prod_k \exp\Big\{-(i/\hbar) \int_\tau^T [V_k(Q, t) - V_k(Q', t)]\, dt\Big\} \\
&= \prod_k \mathcal{F}_k(Q, Q'),
\end{aligned}
\tag{3.3}
\]

or

\[
\begin{aligned}
\Phi(Q, Q') &= -(1/\hbar) \sum_k \int_\tau^T [V_k(Q, t) - V_k(Q', t)]\, dt \\
&= \sum_k \Phi_k(Q, Q')
\end{aligned}
\tag{3.4}
\]


The same result follows directly from Section II.C.2 which gives the total
influence functional for several statistically and dynamically independent systems
acting on Q. The total influence functional for all the systems (in this case
potentials) is the product of the functionals for the individual systems.

Another property of the classical influence functionals is obtained by
inspection of Eq. (3.1). We notice that for any classical $\mathcal{F}(Q, Q')$ if conditions are such
that $Q(t) = Q'(t)$, then $\mathcal{F}(Q, Q') = 1$ and is independent of $Q$ for all times that
the two variables are equal. It follows that the influence phase is zero for this
condition.

Finally, from Section II.C.1 we find that if the potential is uncertain but the
probability of each $V_r(Q, t)$ is $w_r$ then the average functional is given by


\[
\begin{aligned}
\langle \mathcal{F}(Q, Q') \rangle &= \sum_r w_r \exp\Big\{-(i/\hbar) \int_\tau^T [V_r(Q, t) - V_r(Q', t)]\, dt\Big\} \\
&= \sum_r w_r\, \mathcal{F}_r(Q, Q')
\end{aligned}
\tag{3.5}
\]


In the following sections we will assume a probability distribution $w_r$ appropriate
to Brownian noise and will be able to derive a specific form for the average
influence functional.
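
The classical-potential results of Eqs. (3.1) to (3.5) are simple enough to evaluate on discretized paths. The sketch below is illustrative only: the paths, the two potentials, the weights, and the trapezoid-rule discretization are all assumptions made for the example, and $\hbar$ is set to 1 by default.

```python
import cmath
import math


def trapezoid(values, times):
    """Trapezoid-rule integral of sampled values over the sampled times."""
    return sum(0.5 * (values[k] + values[k + 1]) * (times[k + 1] - times[k])
               for k in range(len(times) - 1))


def influence_phase(V, path_q, path_qp, times, hbar=1.0):
    """Discretized Eq. (3.2): -(1/hbar) * integral of [V(Q, t) - V(Q', t)] dt."""
    integrand = [V(q, t) - V(qp, t) for q, qp, t in zip(path_q, path_qp, times)]
    return -trapezoid(integrand, times) / hbar


def influence_functional(V, path_q, path_qp, times, hbar=1.0):
    """Discretized Eq. (3.1): exp(i * influence phase)."""
    return cmath.exp(1j * influence_phase(V, path_q, path_qp, times, hbar))


def averaged_functional(potentials, weights, path_q, path_qp, times, hbar=1.0):
    """Discretized Eq. (3.5): weighted average over the possible potentials."""
    return sum(w * influence_functional(V, path_q, path_qp, times, hbar)
               for V, w in zip(potentials, weights))


# Hypothetical example data.
times = [0.01 * k for k in range(101)]
path_q = [t ** 2 for t in times]
path_qp = [math.sin(t) for t in times]
V1 = lambda q, t: 0.5 * q * q
V2 = lambda q, t: -q * math.cos(t)
V_sum = lambda q, t: V1(q, t) + V2(q, t)
```

A definite potential gives a functional of unit modulus, while the averaged functional of Eq. (3.5) generally has modulus below one; that loss of modulus is what the random-potential cases of the next subsection quantify.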


B. SPECIFIC FUNCTIONALS FOR RANDOM POTENTIALS


Let us now suppose that the potential has known form, V(Q), but unknown 
strength C(t) as a function of time so that the total potential is V(Q, t) = 
C(t)V(Q). The average influence functional for two cases involving this type of 
potential will be particularly useful in the discussion contained in Sections IV 
and V. These cases are: (1) when C(t) is characterized by any coupling strength 
(average magnitude of C) with a purely Gaussian distribution, and (2) when 
C(t) is composed of a large number of very weak potentials (acting on the test
system simultaneously) whose distributions are stationary but not necessarily
Gaussian.


1. Gaussian Noise


First, we consider the situation when $C(t)$ is Gaussian noise with a power
spectrum $\phi(\nu)$ and a correlation function $R(\tau) = (2/\pi) \int_0^\infty \phi(\nu) \cos \nu\tau\, d\nu$;
then $\langle \mathcal{F} \rangle$ is given by


\[
\begin{aligned}
\langle \mathcal{F} \rangle &= \Big\langle \exp\Big\{-(i/\hbar) \int_\tau^T C(t)[V(Q) - V(Q')]\, dt\Big\} \Big\rangle_{av} \\
&= \exp\Big\{-\hbar^{-2} \int_\tau^T \int_\tau^t R(t - s)\,[V(Q_t) - V(Q'_t)]\,[V(Q_s) - V(Q'_s)]\, ds\, dt\Big\}
\end{aligned}
\tag{3.6}
\]

Expressed in Fourier transform notation this becomes

\[
\langle \mathcal{F} \rangle = \exp\Big\{-(\pi\hbar^2)^{-1} \int_0^\infty \phi(\nu)\, |V_\nu(Q) - V_\nu(Q')|^2\, d\nu\Big\} \tag{3.7}
\]

where

\[
V_\nu(Q) - V_\nu(Q') = \int_\tau^T [V(Q) - V(Q')]\, e^{-i\nu t}\, dt \tag{3.8}
\]


Expressions of the type given in Eq. (3.6) are common for operations in which
it is required to find the characteristic function, $F(i\xi) = \langle e^{i\xi f(T)} \rangle$ for $f(T)$
represented by integrals of the form $f(T) = \int A(T, t)\, x(t)\, dt$ where $x(t)$ is a
Gaussian process. The result will not be worked out here as it may be found in
standard references (8).$^{7}$ The equivalent expression for $\langle \mathcal{F} \rangle$ in terms of frequency
components, Eq. (3.7), is obtained from Eq. (3.6) in a direct manner using the
definitions for $R(\tau)$ and Eq. (3.8).
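
The equivalence of the time-domain exponent in Eq. (3.6) and the frequency-domain exponent in Eq. (3.7) can be checked numerically for one concrete spectrum. The sketch below is an illustration under stated assumptions, not a general tool: it takes $\tau = 0$, writes $f(t)$ for the difference $V(Q_t) - V(Q'_t)$, and uses the hypothetical Lorentzian spectrum $\phi(\nu) = a/(a^2 + \nu^2)$, whose correlation function under the stated definition is $R(\tau) = e^{-a|\tau|}$. Both functions return the exponent without the common factor $\hbar^{-2}$.

```python
import cmath
import math


def time_domain_exponent(f, R, T, n=200):
    """Integral over 0 < s < t < T of R(t - s) f(t) f(s), midpoint rule in both variables."""
    h = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        hs = t / n  # inner step: s runs from 0 to t
        inner = sum(R(t - (j + 0.5) * hs) * f((j + 0.5) * hs) for j in range(n))
        total += f(t) * inner * hs
    return total * h


def fourier_component(f, nu, T, n=200):
    """f_nu = integral from 0 to T of f(t) exp(-i nu t) dt, midpoint rule."""
    h = T / n
    return h * sum(f((k + 0.5) * h) * cmath.exp(-1j * nu * (k + 0.5) * h)
                   for k in range(n))


def frequency_domain_exponent(f, phi, T, nu_max=200.0, n_nu=4000):
    """(1/pi) * integral from 0 to nu_max of phi(nu) |f_nu|^2 d nu, midpoint rule."""
    h = nu_max / n_nu
    total = 0.0
    for k in range(n_nu):
        nu = (k + 0.5) * h
        total += phi(nu) * abs(fourier_component(f, nu, T)) ** 2
    return total * h / math.pi


A_RATE = 1.0  # hypothetical inverse correlation time
R = lambda tau: math.exp(-A_RATE * abs(tau))
phi = lambda nu: A_RATE / (A_RATE ** 2 + nu ** 2)
```

For $f(t) = 1$, $a = 1$, and $T = 1$ the time-domain integral can be done by hand and equals $e^{-1}$, which gives an independent check on both routines. The frequency cutoff `nu_max` is a numerical convenience; the neglected tail falls off as $\nu^{-4}$.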


2. Brownian Noise 


The Gaussian behavior of Brownian noise, characterized by the typical 
Gaussian probability distribution, may be the result of the cumulative effects 
of many small statistically independent sources, none of which is truly Gaussian. 
How that comes about can be seen as follows. The effect of these small sources
on a test system may be represented by an influence functional of the same form
as that of Eq. (3.6) where now $C(t) = \sum_{i=1}^{N} C_i(t)$, N is a very large number,
and the $C_i(t)$ are independent random variables. Application of the central-limit
theorem to this situation shows that the probability distribution appropriate
to $C(t)$ is asymptotically normal subject to the following conditions (9):

(a) The average values,


\[
\langle C_i(t) \rangle < \infty
\]

and

\[
\mu_{i,2} = \langle\, |C_i(t) - \langle C_i(t) \rangle|^2\, \rangle < \infty,
\]

(b) The absolute moments

\[
\mu_{i,2+\delta} = \langle\, |C_i(t) - \langle C_i(t) \rangle|^{2+\delta}\, \rangle
\]

exist for some $\delta > 0$, and

(c) Making use of the definition

\[
\mu_k = \sum_{i=1}^{N} \mu_{i,k},
\]

then

\[
\lim_{N \to \infty} \frac{\mu_{2+\delta}}{(\mu_2)^{1+\delta/2}} = 0
\]


7 See, for example, pp. 372-373 where it is shown that the characteristic function $F(i\xi)$
appropriate to the integral given above for a Gaussian process $x(t)$ is
$\exp[-\tfrac{1}{2}\xi^2 \iint A(T, t)\, A(T, s)\, K_x(t, s)\, ds\, dt]$ where the covariance
$K_x(t, s) = \langle x_t x_s \rangle$ is the correlation function corresponding to $R(t - s)$ in Eq. (3.6).

The condition of independence on the large number of variables and the finite 
average values required by (a) and (b) above assures that no one component 
dominates the total distribution. Condition (c) is sufficient to ensure that all 
higher order correction terms tending to deviate from a normal distribution 
vanish in the limit of large N. It should be recognized that if the $C_i$ possess
finite third moments $\mu_{i,3}$ the correction terms arising from these moments
decrease as $N^{-1/2}$. However, for the cases in which we are interested, the number of
the component forces $C_i$ is essentially infinite and higher order terms are
negligible.
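
The $N^{-1/2}$ decay of the leading non-Gaussian correction can be seen in a small deterministic example. The sketch below is an illustration only and makes an assumption not in the text: each component is taken to be a centered exponential random variable with unit variance, chosen because its characteristic function is known in closed form and its third moment is nonzero. The characteristic function of the normalized sum of N such variables is compared with the Gaussian limit $e^{-\xi^2/2}$.

```python
import cmath
import math


def char_fn_normalized_sum(xi, n_terms):
    """Characteristic function of N^(-1/2) times a sum of N centered Exp(1) variables."""
    u = xi / math.sqrt(n_terms)
    single = cmath.exp(-1j * u) / (1.0 - 1j * u)  # one centered Exp(1) variable
    return single ** n_terms


def gaussian_deviation(xi, n_terms):
    """Distance between the exact characteristic function and the Gaussian limit."""
    return abs(char_fn_normalized_sum(xi, n_terms) - math.exp(-0.5 * xi * xi))
```

Increasing N by a factor of 100 should shrink the deviation by roughly a factor of 10, as expected for a correction proportional to $N^{-1/2}$. For components with a symmetric distribution the third moment vanishes and the approach to the Gaussian limit is faster.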


IV. INFLUENCE FUNCTIONALS FOR LINEAR SYSTEMS 


Linear systems are of considerable interest both because of the large number 
of situations in which they are involved and because they are amenable to exact 
calculation. In this section the influence functional for arbitrary combinations
of oscillators will be found by direct extension of the analysis of a single oscillator.
All linear systems which are lossless and those which contain certain kinds of loss 
can be represented by distributions of oscillators. Situations in which dissipa- 
tion arises from sources other than distributions of perfect oscillators will be 
covered in Section V. The same conclusions apply for all linear systems, however, 
as will be discussed subsequently. For clarity, we will restrict our attention 
initially to linear interaction systems at zero temperature and not acted on by 
classical forces. The effects of finite temperature and forces can then be included 
so that their significance is more apparent. 


A. ZERO TEMPERATURE LINEAR SYSTEM


The result to be proven involves the assumption that the interaction system
(X) is linearly coupled to the test system (Q). The total Lagrangian for the system
is


\[
L_{\mathrm{total}} = L(\dot{Q}, Q, t) + L(\dot{X}, X, t) + L_I(Q, X) \tag{4.1}
\]

where $L_I = \gamma Q X$, and $L(\dot{X}, X, t)$ is the part of the Lagrangian involving the
X system alone. The situation is the same as that shown in Fig. 1 except the
interaction potential is given by $V(Q, X, t) = -\gamma Q X$ and the X system is
linear. Our fundamental theorem for linear systems is as follows:

The influence phase for the effect of X on Q can be written as follows: 

$$\Phi(Q, Q') = \frac{1}{2\pi\hbar} \int_0^\infty \left[ \frac{Q'_\nu (Q_{-\nu} - Q'_{-\nu})}{i\nu Z_\nu} + \frac{Q_{-\nu} (Q_\nu - Q'_\nu)}{-i\nu Z_{-\nu}} \right] d\nu \tag{4.2}$$

$\Phi(Q, Q')$ is found by studying the properties of X alone.⁸ $Q_\nu$ is the Fourier transform 
of $\gamma(t)Q(t)$ and $Z_\nu$ is a classical impedance function which relates the reaction 
of X to an applied force. $Z_\nu$ is found by taking the classical system corresponding 
to X (that is, whose Lagrangian is $L(\dot{X}, X, t)$) and finding the response of the 
coordinate X to a driving force f(t) which is derived from the potential $-f(t)X(t)$. 
f(t) is considered to be applied at t = 0 subject to the initial conditions that 
$X(0) = \dot{X}(0) = 0$. $Z_\nu$ is defined by the expression 

$$Z_\nu = f_\nu / (i\nu X_\nu) \tag{4.3}$$

where 

$$f_\nu = \int_0^\infty f(t)\, e^{-i\nu t}\, dt \quad \text{and} \quad X_\nu = \int_0^\infty X(t)\, e^{-i\nu t}\, dt$$

In the time domain, Eq. (4.2) can be expressed as 

$$i\Phi(Q, Q') = -\frac{1}{2\hbar} \int_{-\infty}^{\infty} \int_{-\infty}^{t} \gamma_t \gamma_s (Q_t - Q'_t) \left[ Q_s F^*(t - s) - Q'_s F(t - s) \right] ds\, dt \tag{4.4}$$
where Im F(t), which we will call B(t), is, for t > 0, the classical response of X 
to a force f(t) = δ(t). Re F(t), which for this zero temperature case we call 
$A_0(t)$, is the correlation function for the zero point fluctuation of the variable X, 
a point discussed at more length below. The relations connecting these 
quantities are then 

$$F(t) = A_0(t) + iB(t)$$

$$\frac{1}{i\nu Z_\nu} = \int_0^\infty B(t)\, e^{-i\nu t}\, dt \tag{4.5a}$$

and the inverse relations 

$$A_0(t) = -\frac{2}{\pi} \int_0^\infty \operatorname{Im}\left( \frac{1}{i\nu Z_\nu} \right) \cos \nu t\, d\nu$$

$$B(t) = -\frac{2}{\pi} \int_0^\infty \operatorname{Im}\left( \frac{1}{i\nu Z_\nu} \right) \sin \nu t\, d\nu \tag{4.5b}$$


⁸ More generally, the part of the interaction represented by Q could be represented by a 
function of Q such as V(Q). In this case $Q_\nu$ in the influence phase would be replaced by $V_\nu(Q)$, 
the Fourier transform of V(Q(t)). 

$A_0(t)$ and B(t) are related as follows: 

$$A_0(t) = -\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{B(s)}{t - s}\, ds \tag{4.5c}$$


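These relationships may be written in many forms. Two additional forms are 

$$F(t) = \frac{2i}{\pi} \int_0^\infty \left( \frac{1}{i\nu Z_\nu} \right) \cos \nu t\, d\nu$$

and 

$$\int_{-\infty}^{\infty} F(|t|)\, e^{-i\nu t}\, dt = \begin{cases} 2i\,(i\nu Z_\nu)^{-1} & \text{for } \nu > 0 \\ 2i\,(-i\nu Z_{-\nu})^{-1} & \text{for } \nu < 0 \end{cases} \tag{4.5d}$$

All the poles of $1/i\nu Z_\nu$ have positive imaginary parts and this impedance function 
has the additional property that 

$$(1/i\nu Z_\nu) = (1/{-i\nu Z_{-\nu}})^*$$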


In the case of finite temperatures, the influence phase can be written in the same 
form as Eq. (4.4) except that Re F(t) = A(t), that is, without the subscript 
0, and a more general relation exists connecting A(t) and $\operatorname{Im}(1/i\nu Z_\nu)$ (see 
Section IV.C). 


$\mathcal{F}(Q, Q')$ for Single Lossless Harmonic Oscillator 


To prove the above theorem, we consider first a test system, Q, which is 
coupled to a simple harmonic oscillator whose mass is m, characteristic frequency 
ω, and displacement coordinate X. The complete Lagrangian for X and Q can 
be written 

$$L_{\text{total}} = L_0(\dot{Q}, Q, t) + \tfrac{1}{2} m \dot{X}^2 - \tfrac{1}{2} m \omega^2 X^2 + QX \tag{4.6}$$

and the total action is written similarly.⁹ 

$$S_{\text{total}} = S_0(Q) + \int_\tau^T \left( \tfrac{1}{2} m \dot{X}^2 - \tfrac{1}{2} m \omega^2 X^2 + QX \right) dt$$

⁹ The interaction Lagrangian QX could be written more transparently as γQX where Q 
and X are the coordinates of the systems involved and γ is a coupling factor which may or 
may not be a function of time. For simplicity in writing the lengthy expressions to follow, 
γ has been incorporated into an effective coordinate Q since no loss in generality results. 

If X is assumed to be initially in the ground state (corresponding to zero 
temperature) then to within a normalizing constant $\chi_i(X) = e^{-m\omega X^2/2\hbar}$. The final 
state of X is assumed to be arbitrary, which means the final states are to be 
summed over. Therefore, in Eq. (2.7), the definite state $\chi_f^*(X_T)\,\chi_f(X'_T)$ 
will be replaced by the sum $\sum_n \Phi_n^*(X_T)\,\Phi_n(X'_T) = \delta(X_T - X'_T)$. The $\Phi_n(X)$ 
represent the energy eigenfunctions of the harmonic oscillator. With this 
information available the influence functional is completely defined and from 
Section II.B can be written 

$$\mathcal{F}_0(Q, Q') = \int \delta(X_T - X'_T)\, K_Q(X_T, X_\tau)\, K^*_{Q'}(X'_T, X'_\tau) \exp\left[ -\frac{m\omega}{2\hbar} (X_\tau^2 + X'^2_\tau) \right] dX_\tau \cdots dX'_T \tag{4.7a}$$

where the subscripts Q, Q' refer to the interaction potentials −QX and −Q'X' 
acting on the X and X' systems respectively. The subscript 0 on $\mathcal{F}_0(Q, Q')$ 
indicates zero temperature. For the harmonic oscillator 


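$$\begin{aligned} K_Q(X_T, X_\tau) &= N \exp\left\{ (i/\hbar) [S(X) + S_I(Q, X)]_{\text{classical}} \right\} \\ &= N \exp\Big\{ \frac{i m \omega}{2\hbar \sin \omega(T - \tau)} \Big[ (X_T^2 + X_\tau^2) \cos \omega(T - \tau) - 2 X_T X_\tau \\ &\qquad + \frac{2 X_T}{m\omega} \int_\tau^T Q_t \sin \omega(t - \tau)\, dt + \frac{2 X_\tau}{m\omega} \int_\tau^T Q_t \sin \omega(T - t)\, dt \\ &\qquad - \frac{2}{m^2 \omega^2} \int_\tau^T \int_\tau^t Q_t Q_s \sin \omega(T - t) \sin \omega(s - \tau)\, ds\, dt \Big] \Big\} \end{aligned} \tag{4.7b}$$

where N is a normalizing factor depending only on ω and the time interval 
T − τ. Thus, Eq. (4.7a) represents a Gaussian integral over the four X variables 
since S is itself quadratic in the X variables. When the integrals are carried out 
the following result is obtained for the influence phase: 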


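$$i\Phi_0(Q, Q') = -\frac{1}{2\hbar m \omega} \int_\tau^T \int_\tau^t (Q_t - Q'_t) \left( Q_s e^{-i\omega(t - s)} - Q'_s e^{i\omega(t - s)} \right) ds\, dt \tag{4.8}$$

Thus, F(t − s) in Eq. (4.4) corresponds in this case to $e^{i\omega(t - s)}/m\omega$ and from 
the definition given above $B(t - s) = (1/m\omega) \sin \omega(t - s)$.¹¹ Rewriting Eq. 
(4.8) in transform notation we have 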


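$$\Phi_0(Q, Q') = \frac{1}{2\pi\hbar} \int_0^\infty \left\{ \frac{Q'_\nu (Q_{-\nu} - Q'_{-\nu})}{-m[(\nu - i\epsilon)^2 - \omega^2]} + \frac{Q_{-\nu} (Q_\nu - Q'_\nu)}{-m[(\nu + i\epsilon)^2 - \omega^2]} \right\} d\nu \tag{4.9}$$

where 

$$Q_\nu = \int_{-\infty}^{\infty} Q_t\, e^{-i\nu t}\, dt$$

The function $\{-m[(\nu - i\epsilon)^2 - \omega^2]\}^{-1} = (m\omega)^{-1} \int_0^\infty \sin \omega t\; e^{-i\nu t}\, dt$ corresponds 
to $1/i\nu Z_\nu$ of Eq. (4.2).¹² 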


¹⁰ See ref. 2, Section 3. 

¹¹ The finite time interval indicated by the limits T and τ can be interpreted as turning 
the coupling (between Q and X) on at t = τ and off at t = T. However, since the interaction 
system is to be considered in most cases as part of the steady-state environment of Q, it is 
really more meaningful to extend these limits over an infinite range of time (τ → −∞, 
T → +∞). The possibility of allowing X to interact with Q over a finite range of time can 
be taken care of by giving the coupling factor (already included in the variable Q) the 
proper time dependence. 

¹² ε which occurs in $i\nu Z_\nu$ is a convergence factor which was inserted in taking the Fourier 
transform of $(1(t)/m\omega) \sin \omega t$, where 1(t) is the unit step function, and is kept to show the 
location of the poles with respect to the ν axis when doing integrations of the type 
$\int H(\nu)\, [i\nu Z_\nu]^{-1}\, d\nu$. 




Having obtained the expression for the influence phase we now turn to the 
classical problem of finding the response of X(t) to a driving force, f(t), applied 
at t = 0 with the initial conditions $X(0) = \dot{X}(0) = 0$. Starting with the 
Lagrangian of the unperturbed oscillator from Eq. (4.6), we add to it a potential 
term −f(t)X(t). This potential has the same form as the coupling potential 
−QX used in the quantum calculation. However, it is to be emphasized that 
the response of X to a force has nothing to do with the system Q outside of the 
type of coupling involved; therefore, f(t) will symbolize the force in the classical 
problem. The complete Lagrangian is 

$$L(\dot{X}, X, t) = \tfrac{1}{2} m \dot{X}^2 - \tfrac{1}{2} m \omega^2 X^2 + fX \tag{4.10}$$

and the equation of motion derived from it is 

$$m\ddot{X} + m\omega^2 X = f \tag{4.11}$$

Its solution under the initial conditions stated above is 

$$X(t) = (m\omega)^{-1} \int_0^t f(s) \sin \omega(t - s)\, ds \tag{4.12}$$

or alternatively, in terms of Fourier transforms, is 

$$X_\nu = f_\nu \{-m[(\nu - i\epsilon)^2 - \omega^2]\}^{-1} \tag{4.13}$$

Therefore, B(t − s) in this case is a Green's function which yields the response 
of X(t) to an impulse force f(s) = δ(s), and its transform yields $1/i\nu Z_\nu$. Thus a 
classical calculation of the ratio $X_\nu/f_\nu$ under quiescent initial conditions yields 
the proper function for $1/i\nu Z_\nu$. 
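The correspondence between the impulse response B(t) and the impedance function can be checked numerically. The following is a minimal sketch, not part of the original analysis: it compares the closed form of Eq. (4.13) with a direct numerical transform of B(t) = sin(ωt)/(mω). The function names, the midpoint-rule integrator, and the parameter values are assumptions chosen for illustration, and a finite ε is used in place of the limit ε → 0.

```python
import cmath
import math


def response_closed_form(nu, m, omega, eps):
    """1/(i nu Z_nu) = {-m[(nu - i eps)^2 - omega^2]}^(-1), as in Eq. (4.13)."""
    return 1.0 / (-m * ((nu - 1j * eps) ** 2 - omega ** 2))


def response_from_green(nu, m, omega, eps, t_max=200.0, steps=200_000):
    """Numerically transform B(t) = sin(omega t)/(m omega) for t > 0.

    The factor exp(-eps t) plays the role of the convergence factor epsilon.
    The integral over [0, t_max] is approximated with the midpoint rule.
    """
    dt = t_max / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * dt
        total += math.sin(omega * t) * cmath.exp(-(1j * nu + eps) * t)
    return total * dt / (m * omega)


if __name__ == "__main__":
    # Illustrative (hypothetical) parameter values.
    exact = response_closed_form(0.7, 2.0, 1.3, 0.1)
    numeric = response_from_green(0.7, 2.0, 1.3, 0.1)
    print(exact, numeric)
```

With a smaller ε the two results should still agree, provided `t_max` is increased so that exp(−ε t_max) remains negligible.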


Distribution of Oscillators—Representation of Loss 


The results of the preceding section are easily extended to the situation where 
the interaction system is a distribution of oscillators. First, we consider the case 
of independent oscillators coupled to the test system. It is assumed that there is a 
distribution of oscillators such that G(Ω) dΩ is the weight of oscillators whose 
natural frequency is in the range between Ω and Ω + dΩ. More specifically, 
G(Ω) dΩ is the product of the number of oscillators and the square of their coupling 
constants divided by the mass in dΩ. Thus, we have a situation represented by 
the diagram of Fig. 3. Each oscillator is assumed to be initially in the ground 
state and finally in an arbitrary state; the coupling is again assumed to be linear. 
The total action is then given by 

$$S[Q, X(\Omega)] = S_0(Q) + \int_\tau^T \int_0^\infty G(\Omega) \left[ \tfrac{1}{2} \dot{X}^2 - \tfrac{1}{2} \Omega^2 X^2 + QX \right] d\Omega\, dt \tag{4.14}$$


From the general properties of influence functionals already described we know 
that when independent disturbances act on Q the influence functional is a product 
of the ones for each individual disturbance. Since $\mathcal{F}_0(Q, Q') = \exp[i\Phi_0(Q, Q')]$ 
for the case of a single oscillator, the total influence phase for the distribution is 
the sum of the individual phases, 

$$\Phi(Q, Q') = \int_0^\infty G(\Omega)\, \Phi_{0,\Omega}(Q, Q')\, d\Omega \tag{4.15}$$

Fig. 3. Test system Q coupled to a distribution of oscillators 


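More explicitly, 

$$\Phi(Q, Q') = (2\pi\hbar)^{-1} \int_0^\infty G(\Omega)\, d\Omega \int_0^\infty \left\{ \frac{Q'_\nu (Q_{-\nu} - Q'_{-\nu})}{-[(\nu - i\epsilon)^2 - \Omega^2]} + \frac{Q_{-\nu} (Q_\nu - Q'_\nu)}{-[(\nu + i\epsilon)^2 - \Omega^2]} \right\} d\nu \tag{4.16}$$

For this case then, the form of Eq. (4.2) is obtained if we put 

$$(i\nu Z_\nu)^{-1} = -\lim_{\epsilon \to 0} \int_0^\infty G(\Omega)\, [(\nu - i\epsilon)^2 - \Omega^2]^{-1}\, d\Omega \tag{4.17}$$

or 

$$(Z_\nu)^{-1} = (\pi/2)\, G(\nu) - i\nu \int_0^\infty G(\Omega)\, (\nu^2 - \Omega^2)^{-1}\, d\Omega \tag{4.18}$$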


Thus the effects of all the oscillators are included in the influence phase through 
the expression for $Z_\nu$, Eq. (4.17). Now, however, because of the continuous 
distribution of oscillators, $Z_\nu$ has a finite real part. We will now show that this 
real part represents dissipation by arriving at the same impedance function 
classically. 

As before, we take the part of the Lagrangian from Eq. (4.14) having to do 
with the oscillators, except that the coupling potential $-Q(t) \int_0^\infty G(\Omega) X(\Omega, t)\, d\Omega$ 
is replaced by $-f(t) \int_0^\infty G(\Omega) X(\Omega, t)\, d\Omega$, a classical potential. X(Ω, t) is the 
coordinate of the oscillator in the distribution whose frequency is Ω while the total 
coordinate of the complete linear system with which f(t) is interacting is 
$\int_0^\infty G(\Omega) X(\Omega, t)\, d\Omega = X(t)$. It is the relationship between f(t) and X(t) in 
which we are interested in this classical case: 

$$L[\dot{X}(\Omega), X(\Omega), t] = \int_0^\infty G(\Omega)\, d\Omega \left[ \tfrac{1}{2} \dot{X}(\Omega)^2 - \tfrac{1}{2} \Omega^2 X(\Omega)^2 \right] + f(t) \int_0^\infty G(\Omega) X(\Omega)\, d\Omega \tag{4.19}$$

¹³ Eq. (4.18) is obtained from (4.17) using the identity 
$\lim_{\epsilon \to 0} ((\nu - i\epsilon)^2 - \Omega^2)^{-1} = (\nu^2 - \Omega^2)^{-1} + (i\pi/2\Omega)[\delta(\nu - \Omega) - \delta(\nu + \Omega)]$ 
The equations of motion are the infinite set represented by 

$$\ddot{X}(\Omega) + \Omega^2 X(\Omega) = f(t) \tag{4.20}$$

They result from varying L with respect to the independent variables X(Ω). 
For quiescent initial conditions and for f(t) applied at t = 0, this solution is 
expressed 

$$X_\nu(\Omega)/f_\nu = -[(\nu - i\epsilon)^2 - \Omega^2]^{-1} = [i\nu Z_\nu(\Omega)]^{-1}$$

The relation of the total coordinate $X_\nu$ to $f_\nu$ is obtained simply, 

$$X_\nu/f_\nu = \left( \int_0^\infty X_\nu(\Omega)\, G(\Omega)\, d\Omega \right) (f_\nu)^{-1} = -\int_0^\infty G(\Omega)\, [(\nu - i\epsilon)^2 - \Omega^2]^{-1}\, d\Omega = (i\nu Z_\nu)^{-1} \tag{4.21}$$


Referring to Eq. (4.17) it is seen that the same expression for $Z_\nu$ is obtained in 
the quantum and classical cases. In addition, since $Z_\nu$ is now identified with a 
classical impedance, the real part represents resistance while the imaginary part 
corresponds to reactance. Therefore, at least for the case that loss is represented 
by distributions of oscillators, its effect can be included in the influence functional 
by using the appropriate impedance expression. The spontaneous emission 
of a particle in free space represents a good example of such a loss mechanism. 
A demonstration of this point is included in Appendix II where the oscillator 
distribution is related to the probability of spontaneous emission starting from 
the influence functional representing the effect of free space. 
The relationship 

$$(i\nu Z_\nu)^{-1} = \int_0^\infty B(t)\, e^{-i\nu t}\, dt$$

has already been established during the course of the derivation of the influence 
phase for the single oscillator. Now the inverse relation between F(t) and $1/i\nu Z_\nu$ 
can be written for the zero temperature case. In the time domain the influence 
phase for the distribution of oscillators is 

$$i\Phi(Q, Q') = -(2\hbar)^{-1} \int_0^\infty G(\Omega)\, \Omega^{-1}\, d\Omega \int_{-\infty}^{\infty} \int_{-\infty}^{t} (Q_t - Q'_t) \left( Q_s e^{-i\Omega(t - s)} - Q'_s e^{i\Omega(t - s)} \right) ds\, dt$$

Comparing this with Eq. (4.4) it is evident that 

$$F(t) = \int_0^\infty G(\Omega)\, \Omega^{-1}\, e^{i\Omega t}\, d\Omega$$

But from Eq. (4.18), 

$$\operatorname{Im}(i\nu Z_\nu)^{-1} = -\pi G(\nu)/2\nu$$

Therefore, it can be immediately written that 

$$F(t) = -(2/\pi) \int_0^\infty \operatorname{Im}(i\nu Z_\nu)^{-1}\, e^{i\nu t}\, d\nu \tag{4.22}$$

as was given in Eq. (4.5b). 
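The statement that a continuous distribution gives $Z_\nu$ a finite real part, with Re(1/Z_ν) = (π/2)G(ν) as in Eq. (4.18), can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: the distribution G(Ω) = Ω e^{−Ω} is a hypothetical choice, the limit ε → 0 is replaced by a small finite ε, and the integral of Eq. (4.17) is approximated with the midpoint rule.

```python
import math


def G(omega):
    """Hypothetical smooth oscillator distribution (assumed for illustration only)."""
    return omega * math.exp(-omega)


def inverse_i_nu_Z(nu, eps=0.005, omega_max=30.0, steps=300_000):
    """Eq. (4.17) at small finite eps.

    Approximates -integral_0^inf G(W) [(nu - i eps)^2 - W^2]^(-1) dW
    with the midpoint rule on [0, omega_max].
    """
    d_omega = omega_max / steps
    z2 = (nu - 1j * eps) ** 2
    total = 0j
    for k in range(steps):
        om = (k + 0.5) * d_omega
        total += G(om) / (z2 - om * om)
    return -total * d_omega


def admittance(nu):
    """1/Z_nu obtained as i nu times (i nu Z_nu)^(-1)."""
    return 1j * nu * inverse_i_nu_Z(nu)


if __name__ == "__main__":
    for nu in (1.0, 2.5):
        # The two printed numbers should agree up to corrections of order eps.
        print(nu, admittance(nu).real, 0.5 * math.pi * G(nu))
```

The real part is positive wherever G(ν) > 0, which is the resistive (dissipative) behavior described in the text; the residual difference shrinks as `eps` is reduced, provided the grid spacing stays well below `eps`.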

The results above can now be extended by a simple argument to include all 
linear systems composed entirely of distributions of oscillators. To do this it 
need only be shown that the general system can be reduced to a distribution of 
oscillators independently coupled to the test system, which was the situation 
just considered. Explicitly, suppose there exists a test system Q, coupled to an 
assemblage of oscillators which are also interconnected with each other. For 
instance, the situation might be as in Fig. 4, where each of the $X_n$ components 
of the total interaction system could also represent a system of oscillators. 
However, it is well known¹⁴ that such a linear system may be represented by an 
equivalent set of oscillators (the normal modes of the total system) independently 
coupled to Q.¹⁵ Or, stated another way, the classical representation of the 
Lagrangian in normal modes finds new linear combinations of the $X_n$ which make 
the total Lagrangian, except for the coupling, a sum of individual quadratic 
forms with no cross terms. But this same transformation of variables can be 
made on the expression for $\mathcal{F}(Q, Q')$ (see Eq. (2.7)). The effect of this 
transformation is to change the $\mathcal{D}X(t)$ volume by a numerical factor, since the 
transformation is linear.¹⁶ Thus, in effect, we get the sum of independent systems in 
the quantum mechanical case also. From this argument it is concluded that the 
results above regarding a distribution of independent oscillators coupled to a 
test system apply to any linear interaction system. Therefore, it has been found 
that the influence functional for all linear systems has exactly the same form 
$\exp[i\Phi(Q, Q')]$ where Φ(Q, Q') is a quadratic functional of the Q and 
Q'. Φ(Q, Q') is adapted to a particular linear system only through the classical 
response of that linear system to a force. Thus, the procedure for finding the 
influence functional for a linear system has been reduced to a classical problem. 

Fig. 4. Test system coupled to an arbitrary assemblage of oscillators 

¹⁴ This point is considered more fully in Section IV.B on classical forces. 

¹⁵ The fact that one or more of the $X_n$ might represent continuous distributions of 
oscillators need not be bothersome since in principle they represent the behavior of the total 
system in terms of its infinite set of normal modes. 

¹⁶ The only result of such a numerical factor would be to change the normalization of 
$\mathcal{F}(Q, Q')$. However, we already know that for the case that the final states of the interaction 
system are summed over, $\mathcal{F}(Q, Q) = 1$. Therefore the normalization of $\mathcal{F}(Q, Q')$ is not changed 
by the transformation and thus is not dependent upon the coordinates chosen to represent 
the interaction system. 

The fact that eliminating the coordinate of an oscillator always yields an influence 
functional which is quadratic in the potential applied to that oscillator 
is a basic property of linear systems. For example, where the coupling Lagrangian 
is linear between an oscillator of coordinate X and another system of coordinate 
Q, the elimination of the X coordinate yields an influence phase which is quadratic 
in Q as has already been shown. If Q were the coordinate of another oscillator 
coupled to P, then elimination of the Q coordinates would yield an influence 
phase quadratic in P, etc. This can be understood mathematically by observing 
that the Lagrangian for all the oscillators with linear coupling is always quadratic. 
Doing the path integral to eliminate a coordinate is basically a process of 
completing the square and performing Gaussian integrals. This process of completing 
the square also yields quadratic terms. It is therefore not surprising that the 
influence phase for any linear system should be always of the same quadratic 
form. 
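The completing-the-square step can be seen in a one-variable sketch. The coefficients a and b below are placeholders introduced only for this illustration (they are not quantities defined in the text), and a single ordinary integral stands in for the full path integral:

$$\int_{-\infty}^{\infty} e^{-aX^2 + bQX}\, dX = \int_{-\infty}^{\infty} \exp\left[ -a\left( X - \frac{bQ}{2a} \right)^2 + \frac{b^2 Q^2}{4a} \right] dX = \sqrt{\frac{\pi}{a}}\, \exp\left( \frac{b^2 Q^2}{4a} \right), \qquad \operatorname{Re} a > 0$$

The exponent is quadratic in X with a coupling linear in Q, and after the Gaussian integration what remains in the exponent is again quadratic, now in Q alone. Repeating the step for a further linearly coupled coordinate leaves the form unchanged.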

It is to be emphasized that the analysis so far presented has been concerned 
entirely with systems whose complete behavior can be described by combinations 
of lossless oscillators at zero temperature. The only example of such a system is 
the electromagnetic field in free space. In all other physical situations linear 
behavior is an approximation to the actual behavior. However, this approxima- 
tion may be very good over a wide range of operating conditions. In Section V 
the problem of approximately linear systems will be considered in detail. The 
results will be found to be the same as for perfect oscillators to the extent that 
linear behavior is realized. 


Form of Influence Functionals for Linear Systems and Classical Forces as Deduced 
from Properties of Influence Functionals 

So far, we have found the influence phase for classical potentials, uncertain 
classical potentials, and linear systems at zero temperature. By studying Eqs. 
(3.1), (3.6), and (4.4), we see that the general form for the influence functional 
in which all three of these were acting on Q is 

$$\begin{aligned} \mathcal{F}(Q, Q') = \exp\Big\{ &\int_\tau^T iC_1(t)(Q_t - Q'_t)\, dt - \int_\tau^T \int_\tau^t A_1(t - s)(Q_t - Q'_t)(Q_s - Q'_s)\, ds\, dt \\ &- \int_\tau^T \int_\tau^t iB_1(t - s)(Q_t - Q'_t)(Q_s + Q'_s)\, ds\, dt \Big\} \end{aligned} \tag{4.23}$$

The exponent is written solely in terms of Q for simplicity although when the 
potentials are not linear in Q (as XV(Q)), the same general form exists, except 
that it is written in terms of V(Q). We now observe that there are other 
possible combinations of the Q, Q' variables not represented here such as terms 
in $(Q_t + Q'_t)$, $(Q_t + Q'_t)(Q_s + Q'_s)$. To see if such terms are possible, let us form 
a hypothetical functional containing all possible forms up to second order in Q. 

$$\begin{aligned} \mathcal{F}(Q, Q') = \exp\Big\{ &\int_\tau^T [iC_1(t)(Q_t - Q'_t) + D_1(t)(Q_t + Q'_t)]\, dt \\ &- \int_\tau^T \int_\tau^t [A_1(t - s)(Q_t - Q'_t)(Q_s - Q'_s) + iB_1(t - s)(Q_t - Q'_t)(Q_s + Q'_s) \\ &\qquad + iD_2(t - s)(Q_t + Q'_t)(Q_s - Q'_s) + D_3(t - s)(Q_t + Q'_t)(Q_s + Q'_s)]\, ds\, dt \Big\} \end{aligned} \tag{4.24}$$

That the coefficients of the Q's inside the double integrals should be functions of 
(t − s) is evident since the functional should not depend on the absolute time. 
We now will try to eliminate terms in the exponent by using the general properties 
of $\mathcal{F}(Q, Q')$ given in Section II. First, we know $\mathcal{F}(Q, Q') = \mathcal{F}^*(Q', Q)$. This 
implies that all the functions $A_1$, $B_1$, $C_1$, $D_1$, $D_2$, and $D_3$ are real. Next, we know 
that $\mathcal{F}(Q, Q') = 1$ if Q'(t) = Q(t). Hence, $D_1$, $D_3$ are zero. This leaves only one 
term which we did not have before, that of $D_2(t - s)$. Now we apply the property 
of these functionals requiring that if $Q_t = Q'_t$ for $t > t_0$ then $\mathcal{F}(Q, Q')$ is 
independent of Q for $t > t_0$. This statement is obviously true for the $C_1(t)$ 
and $A_1(t - s)$ terms. Consider now the $B_1(t - s)$ term. For $t > t_0$, $Q_t - Q'_t = 0$ 
and therefore this term is also legitimate. As for the $D_2(t - s)$ term, let us consider 
$t > t_0$, but $s < t_0$. Then $Q_s - Q'_s \neq 0$. Furthermore, $Q_t + Q'_t = 2Q_t \neq 0$. 
Therefore, $D_2(t - s)$ must also be zero. The fact that $D_2(t - s) = 0$ is actually 
a statement of causality, i.e., that the effect due to an applied force cannot 
precede the time the force was applied. To see this, let us change the limits of 
integration on this term 

$$\int_\tau^T \int_\tau^t D_2(t - s)(Q_t + Q'_t)(Q_s - Q'_s)\, ds\, dt = \int_\tau^T \int_t^T D_2(s - t)(Q_t - Q'_t)(Q_s + Q'_s)\, ds\, dt$$

Now the integrand is of the same form as that of the $B_1(t - s)$ term. However, 
for a fixed t the integration over s is over the range of s > t. This amounts to a 
sum over the future rather than a sum over past histories of the variable Q. 
The conclusion to be drawn is that there are three possible types of terms up 
to second order in Q and Q’ when definite classical forces, indefinite classical 
forces, and linear systems act on Q. Terms of this type have already been derived 
during the course of our analysis. Therefore, there are no major types of phenom- 
ena which have not been noticed. In the light of the above discussion we would 
expect the effects of additional phenomena, if they are described by terms of 
second order or less in Q and Q', to be contained in one or more of the three 
forms of exponents shown in Eq. (4.23). For instance, in the case of a linear 
system at finite temperature, it will be found that the effect of temperature is to 
change the effective value of $A_1(t - s)$ in the exponent of Eq. (4.23) from its 
minimum value which occurs at zero temperature. It should be pointed out that 
although $A_0(t - s)$ (see Eq. (4.5b)) occurs in a term which has the form of an 
uncertain classical potential acting on the test system, at zero temperature one 
must be careful about this interpretation, for the existence of a random classical 
potential implies a random fluctuation of the variables of the interaction system 
which could induce transitions in the test system either upwards or downwards 
in energy. However, if the interaction system is already in its lowest state it 
can only induce downward transitions in the test system so that the term in 
$A_0(t - s)$ by itself is not sufficient. Thus, as has already been found, the exponent 
of the zero temperature influence functional contains two terms, one in 
$A_0(t - s)$ and the other in B(t − s), which are related through Eq. (4.5c). 
Together they give the whole picture, i.e., that there is a zero point, random 
fluctuation of the variables of the interaction system but that this fluctua- 
tion can induce only those transitions in the test system which, through spon- 
taneous emission, give up energy to the interaction system. 


B. INFLUENCE FUNCTIONALS FOR DRIVEN LINEAR SYSTEMS 


It is to be expected that if a classical force is applied to a linear interaction 
system which in turn is coupled to a test system, the effect of the interaction 
system is to modify the character of the force applied to the test system. In this 
section we will find the exact form for the influence functional of this effective 
force. If a linear system is coupled to Q(t) through one of its coordinates X(t) and 
if a classical force C(t) is coupled to another coordinate Y(t), then Φ(Q, Q') 
representing the effect of both the linear system and the force is 

$$\Phi(Q, Q') = \Phi_0(Q, Q') + (2\pi\hbar)^{-1} \int_0^\infty \left[ \frac{C_\nu (Q_{-\nu} - Q'_{-\nu})}{i\nu z_\nu} + \frac{C_{-\nu} (Q_\nu - Q'_\nu)}{-i\nu z_{-\nu}} \right] d\nu \tag{4.25}$$

where $\Phi_0$ is the influence phase of the linear system in the absence of a classical force 
C(t), and $z_\nu$ is a transfer impedance function which modifies the effect of $C_\nu$ on Q. 
The impedance $z_\nu$ is found by computing the classical response of the coordinate X 
to the force C with all other potentials acting on the linear system (including those due 
to coordinates of external systems such as Q) set equal to zero. The result of the 
calculation yields $i\nu z_\nu = C_\nu/X_\nu$. Alternatively, in the time domain, 

$$\Phi(Q, Q') = \Phi_0(Q, Q') + \hbar^{-1} \int_{-\infty}^{\infty} \int_{-\infty}^{t} (Q_t - Q'_t)\, b(t - s)\, C_s\, ds\, dt \tag{4.26}$$

where¹⁷ 

$$(i\nu z_\nu)^{-1} = \int_0^\infty b(t)\, e^{-i\nu t}\, dt$$

The theorem can be stated in the form of a diagram as shown in Fig. 5. In this 
figure $f(t) = \int_{-\infty}^{t} F(s)\, b(t - s)\, ds$. It will be convenient to work in the frequency 
domain. 

First we recall the influence phase for a classical force acting directly on Q. 
From this expression we will be able to identify the character of the force acting 
on Q in more complicated expressions. If the potential is of the form −C(t)Q(t), 
we have, from Section III, 

$$\Phi(Q, Q') = (2\pi\hbar)^{-1} \int_0^\infty [C_\nu (Q_{-\nu} - Q'_{-\nu}) + C_{-\nu} (Q_\nu - Q'_\nu)]\, d\nu \tag{4.27}$$


Classical Potential and Q Coupled to the Same Coordinate 


Before developing the general situation we consider the simpler situation where 
Q is coupled to a linear system through the potential −QX, and a force F(t) is 
applied to the same system through the potential −FX. The Lagrangian for the 
complete system is 

$$L(\text{system}) = L_0(\dot{Q}, Q, t) + (F + Q)X + L(\dot{X}, X, \dot{Y}, Y, \cdots, t) \tag{4.28}$$

where X, Y, ··· represent all the coordinates of the linear system. If F = 0, 

$$\Phi(Q, Q') = \Phi_0(Q, Q') = (2\pi\hbar)^{-1} \int_0^\infty \left[ \frac{Q'_\nu (Q_{-\nu} - Q'_{-\nu})}{i\nu Z_\nu} + \frac{Q_{-\nu} (Q_\nu - Q'_\nu)}{-i\nu Z_{-\nu}} \right] d\nu \tag{4.29}$$

If F ≠ 0 it is evident from Eq. (4.28) that the required influence phase can be 
found by replacing Q by Q + F and Q' by Q' + F. Notice that F does not carry 
the prime notation since it is not a coordinate. If this substitution is made in 
Eq. (4.29) we have 

$$\begin{aligned} \Phi(Q, Q') &= (2\pi\hbar)^{-1} \int_0^\infty \left[ \frac{(Q'_\nu + F_\nu)(Q_{-\nu} - Q'_{-\nu})}{i\nu Z_\nu} + \frac{(Q_{-\nu} + F_{-\nu})(Q_\nu - Q'_\nu)}{-i\nu Z_{-\nu}} \right] d\nu \\ &= \Phi_0(Q, Q') + (2\pi\hbar)^{-1} \int_0^\infty \left[ \left( \frac{F_\nu}{i\nu Z_\nu} \right) (Q_{-\nu} - Q'_{-\nu}) + \left( \frac{F_{-\nu}}{-i\nu Z_{-\nu}} \right) (Q_\nu - Q'_\nu) \right] d\nu \end{aligned} \tag{4.30}$$

Fig. 5. Equivalent influences of a linear system and a force, F(t), acting on a test system, Q. 

¹⁷ The notation $z_\nu$, b(t − s) was chosen to avoid confusion with $Z_\nu$ and B(t − s) which 
are the impedance and response function respectively of the linear system as seen by the 
test system. 


As might be expected the total effect of the linear system and the driving force 
consists of two separate terms, one describing the effect of the linear system 
alone, and the other describing the effect of the driving force. Comparison of 
Eqs. (4.26) and (4.30) shows that the effective force applied to Q is, in transform 
language, $F_\nu/i\nu Z_\nu$ and, further, shows that $F_\nu$ is modified by $1/i\nu Z_\nu$, the classical 
impedance function of the interaction system. In this special example where F 
and Q are both coupled to the same coordinate, $Z_\nu$ is both the correct impedance 
to be used in $\Phi_0(Q, Q')$, i.e., that impedance seen by the test system, and is the 
transfer impedance $z_\nu$ which modifies $F_\nu$. This is not true generally as we shall 
see in the next section. In addition, it is interesting to observe that no unexpected 
quantum effects appear because of the addition of a force to the interaction 
system. The only effect of the interaction system is to modify the characteristics 
of F(t) in an entirely classical way. 


Classical Forces Acting Through a General Linear System 


Having obtained an idea of the type of results to expect in the above simplified 
analysis we now proceed to the more general case. Let the N coordinates of the 
interaction system be represented by $X_i$, i = 1, ···, N. Its coupling to the test 
system, Q(t), and to the driving force C(t) is given by the potentials $-X_n Q$ and 
$-X_k C$, respectively. Thus we are assuming, for simplification in writing, that Q 
is coupled only to the variable $X_n$, and the force C(t) is applied just to the variable 
$X_k$. Again the interaction system is assumed to be composed entirely of harmonic 
oscillators. The Lagrangian is 

$$L(\text{system}) = L_0(\dot{Q}, Q, t) + \sum_{i,j=1}^{N} \left[ \tfrac{1}{2} (T_{ij} \dot{X}_i \dot{X}_j - V_{ij} X_i X_j) \right] + X_n Q + X_k C \tag{4.31}$$

It is well known in the theory of linear systems that new coordinates may be 
defined by means of a linear transformation of the $X_i$. These new coordinates 
will be chosen as the eigenvectors, $Y_l$, of the interaction system (10). Thus, 

$$X_i = \sum_{l=1}^{N} a_{il} Y_l, \qquad i = 1, 2, \cdots, N$$




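Assuming the $a_{il}$ to be properly normalized, the Lagrangian may be rewritten 
as follows 

$$L(\dot{Q}, Q, \dot{Y}_l, Y_l, t) = L_0(\dot{Q}, Q, t) + \sum_{l=1}^{N} \left[ \tfrac{1}{2} (\dot{Y}_l^2 - \omega_l^2 Y_l^2) + Y_l (a_{nl} Q + a_{kl} C) \right] \tag{4.32}$$

Since these are now independent oscillators coupled to Q the influence phase can 
be written down immediately, 

$$\begin{aligned} \Phi(Q, Q') = \sum_l \Phi_l(Q, Q') = {} &\sum_l \frac{1}{2\pi\hbar} \int_0^\infty \left[ \frac{a_{nl}^2\, Q'_\nu (Q_{-\nu} - Q'_{-\nu})}{i\nu Z_l(\nu)} + \frac{a_{nl}^2\, Q_{-\nu} (Q_\nu - Q'_\nu)}{-i\nu Z_l(-\nu)} \right] d\nu \\ &+ \sum_l \frac{1}{2\pi\hbar} \int_0^\infty \left[ \frac{a_{nl} a_{kl}\, C_\nu (Q_{-\nu} - Q'_{-\nu})}{i\nu Z_l(\nu)} + \frac{a_{nl} a_{kl}\, C_{-\nu} (Q_\nu - Q'_\nu)}{-i\nu Z_l(-\nu)} \right] d\nu \end{aligned} \tag{4.33}$$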

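where 

$$\frac{1}{i\nu Z_l(\nu)} = \frac{Y_l(\nu)}{a_{nl} Q_\nu + a_{kl} C_\nu} \tag{4.34}$$

calculated classically. 
This can be written in the form of Eq. (4.25) if we make the correspondence 

$$\frac{1}{i\nu Z_\nu} = \sum_l \frac{a_{nl}^2}{i\nu Z_l(\nu)} \quad \text{and} \quad \frac{1}{i\nu z_\nu} = \sum_l \frac{a_{nl} a_{kl}}{i\nu Z_l(\nu)} \tag{4.35}$$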


Using Eqs. (4.34) and (4.35) we can now show that $1/i\nu Z_\nu$ and $1/i\nu z_\nu$ are 
equivalent to $X_n(\nu)/Q_\nu$ and $X_n(\nu)/C_\nu$ respectively. Thus 

$$\frac{1}{i\nu Z_\nu} = \sum_l \frac{a_{nl}^2}{i\nu Z_l(\nu)} = \sum_l a_{nl}^2 \left. \frac{Y_l(\nu)}{a_{nl} Q_\nu} \right|_{C_\nu = 0} = \left. \sum_l \frac{a_{nl} Y_l(\nu)}{Q_\nu} \right|_{C_\nu = 0} = \left. \frac{X_n(\nu)}{Q_\nu} \right|_{C_\nu = 0} \tag{4.36}$$

and 

$$\frac{1}{i\nu z_\nu} = \sum_l \frac{a_{nl} a_{kl}}{i\nu Z_l(\nu)} = \sum_l a_{nl} a_{kl} \left. \frac{Y_l(\nu)}{a_{kl} C_\nu} \right|_{Q_\nu = 0} = \left. \sum_l \frac{a_{nl} Y_l(\nu)}{C_\nu} \right|_{Q_\nu = 0} = \left. \frac{X_n(\nu)}{C_\nu} \right|_{Q_\nu = 0} \tag{4.37}$$

Equation (4.36) is a mathematical expression of the argument used earlier to 
find the appropriate impedance function to be used in the influence functional 
for a linear system acting on Q. The additional information obtained here in this 
regard is that when other forces are present they are to be set equal to zero when 
this computation is made. Equation (4.37) states the new result that the transfer 
impedance which modifies $C_\nu$ in its effect on the test system is to be found by 
computing the ratio of $C_\nu$ to the coordinate $X_n(\nu)$ to which the test system is 
coupled. The total force acting on the test system when several forces are acting 
on the interaction system is simply the sum of these forces each modified by the 
appropriate transfer impedance determined in the above described manner. 
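The normal-mode sums for $1/i\nu Z_\nu$ and $1/i\nu z_\nu$ can be compared against a direct classical solution for a small system. The following is a minimal sketch under stated assumptions: two coordinates with unit masses (so that the kinetic matrix is the identity), a hypothetical symmetric potential matrix V = [[a, b], [b, d]] with b ≠ 0, and a finite ε. None of these numbers come from the text, and the function names are introduced only for this illustration.

```python
import math


def normal_modes(a, b, d):
    """Eigen-decomposition of the symmetric matrix V = [[a, b], [b, d]] with b != 0.

    Returns (omega_sq, A): omega_sq[l] is the squared frequency of mode l and
    A[i][l] is the coefficient a_il in X_i = sum_l a_il Y_l (unit masses assumed).
    """
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    omega_sq = [mean - radius, mean + radius]
    columns = []
    for lam in omega_sq:
        v = (b, lam - a)  # satisfies (V - lam I) v = 0
        norm = math.hypot(v[0], v[1])
        columns.append((v[0] / norm, v[1] / norm))
    A = [[columns[l][i] for l in range(2)] for i in range(2)]
    return omega_sq, A


def modal_response(n, k, nu, eps, omega_sq, A):
    """sum_l a_nl a_kl / (i nu Z_l), using 1/(i nu Z_l) = -[(nu - i eps)^2 - omega_l^2]^(-1)."""
    z2 = (nu - 1j * eps) ** 2
    return sum(A[n][l] * A[k][l] / (omega_sq[l] - z2) for l in range(2))


def direct_response(n, k, nu, eps, a, b, d):
    """X_n / C_k from solving (V - z^2 I) X = C directly with the 2x2 inverse."""
    z2 = (nu - 1j * eps) ** 2
    det = (a - z2) * (d - z2) - b * b
    inverse = [[(d - z2) / det, -b / det], [-b / det, (a - z2) / det]]
    return inverse[n][k]


if __name__ == "__main__":
    a, b, d = 2.0, -0.5, 3.0  # hypothetical potential matrix entries
    omega_sq, A = normal_modes(a, b, d)
    # n = k = 0: impedance seen by the test system; n = 0, k = 1: transfer impedance.
    print(modal_response(0, 0, 1.1, 0.05, omega_sq, A), direct_response(0, 0, 1.1, 0.05, a, b, d))
    print(modal_response(0, 1, 1.1, 0.05, omega_sq, A), direct_response(0, 1, 1.1, 0.05, a, b, d))
```

Taking n = k reproduces the first correspondence of Eq. (4.35), while n ≠ k gives the transfer function of the second. In both cases the modal sum and the direct ratio $X_n/C_k$ should coincide, as Eqs. (4.36) and (4.37) assert.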


C. LINEAR SYSTEMS AT FINITE TEMPERATURES 


The forms of influence functionals which are possible for linear systems have 
been established by an argument which utilized the general properties discussed 
in Section II. Each of these forms has already occurred in the analyses of classi- 
cal potentials, random potentials, and zero temperature linear systems. There- 
fore, the results to be expected here are one or more of the forms already ob- 
tained. 

The discussion is begun again with a single oscillator as our linear system, for 
simplicity. From this, the extension to distributions of oscillators is immediate, 
as it was in Section IV.A. The complete problem is set up in the same way as for 
zero temperature except that the initial state of the oscillator is not simply the 
ground state or any definite eigenstate. The effect of temperature is to make the 
initial state uncertain and it is properly represented by a sum over all states 
weighted by the Boltzmann factor $e^{-\beta E_n}$ where β = 1/kT, T being the 
temperature in this case. The final state is again arbitrary; therefore, a formal 
expression for the influence functional is 

$$\mathcal{F}(Q, Q') = \int \delta(X_T - X'_T)\, K_Q(X_T, X_\tau)\, K^*_{Q'}(X'_T, X'_\tau) \sum_n N^{-1} e^{-\beta E_n} \phi_n(X_\tau)\, \phi_n^*(X'_\tau)\, dX_\tau \cdots dX'_T \tag{4.38}$$

where N, the normalization constant, is $\sum_n e^{-\beta E_n}$. The $\phi_n$ represent the energy 
eigenfunctions of the oscillator unperturbed by external forces. The first problem 
is to find a closed form for the expression $\sum_n \phi_n(X)\, \phi_n^*(X')\, e^{-\beta E_n}$. This can be 
done by noticing that its form is identical with the kernel which takes a wave 
function from one time to another if we make the correspondence that β represents 
an imaginary time interval. If the times involved are $t_1$ and $t_2$, this kernel 
is 


$$K_0(X_2, X_1) = \sum_n \phi_n(X_2)\, \phi_n^*(X_1) \exp[-(i/\hbar) E_n (t_2 - t_1)] = \exp[(i/\hbar) S_{cl}] \tag{4.39}$$

for the harmonic oscillator, and where the subscript 0 indicates the absence of 
external forces. For the harmonic oscillator the expression for $S_{cl}$ is easily obtained 
in terms of the initial and final positions $X_1$ and $X_2$ (2). Thus, 

$$S_{cl} = m\omega [2 \sin \omega(t_2 - t_1)]^{-1} [(X_1^2 + X_2^2) \cos \omega(t_2 - t_1) - 2 X_1 X_2] \tag{4.40}$$

Utilizing Eqs. (4.39) and (4.40) and making the correspondence $\beta = i(t_2 - t_1)/\hbar$, 
$X_2 = X_\tau$, and $X_1 = X'_\tau$, we find that 

$$\sum_n \phi_n(X_\tau)\, \phi_n^*(X'_\tau) \exp(-\beta E_n) = \exp\{ -m\omega [2\hbar \sinh(\beta\hbar\omega)]^{-1} [(X_\tau^2 + X'^2_\tau) \cosh(\beta\hbar\omega) - 2 X_\tau X'_\tau] \} \tag{4.41}$$
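The closed form for the thermally weighted sum over eigenfunctions can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it uses units with m = ω = ħ = 1, truncates the sum at a finite number of terms, and includes the coordinate-independent prefactor (2π sinh β)^{−1/2}, which the expression above omits as a normalizing factor. The function names are hypothetical.

```python
import math


def hermite_functions(x, n_max):
    """Normalized oscillator eigenfunctions phi_0 .. phi_{n_max-1} at x (m = omega = hbar = 1)."""
    phi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if n_max > 1:
        phi.append(math.sqrt(2.0) * x * phi[0])
    for n in range(1, n_max - 1):
        # Three-term recurrence for normalized Hermite functions.
        phi.append(math.sqrt(2.0 / (n + 1)) * x * phi[n]
                   - math.sqrt(n / (n + 1)) * phi[n - 1])
    return phi


def thermal_sum(x, xp, beta, n_max=80):
    """Truncated sum_n phi_n(x) phi_n(xp) exp(-beta E_n) with E_n = n + 1/2."""
    px = hermite_functions(x, n_max)
    pxp = hermite_functions(xp, n_max)
    return sum(px[n] * pxp[n] * math.exp(-beta * (n + 0.5)) for n in range(n_max))


def closed_form(x, xp, beta):
    """Exponential of Eq. (4.41) multiplied by the prefactor (2 pi sinh beta)^(-1/2)."""
    s, c = math.sinh(beta), math.cosh(beta)
    exponent = -((x * x + xp * xp) * c - 2.0 * x * xp) / (2.0 * s)
    return math.exp(exponent) / math.sqrt(2.0 * math.pi * s)


if __name__ == "__main__":
    print(thermal_sum(0.3, -0.7, 1.0), closed_form(0.3, -0.7, 1.0))
```

The dependence on the two coordinates is exactly that of the exponential in Eq. (4.41). The prefactor depends only on β and ω, so it affects only the overall normalization of the influence functional.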


Using this closed expression for the average initial state of the oscillator and the 
kernel for the driven harmonic oscillator (Eq. (4.7b)) the influence functional 
can be evaluated by a series of Gaussian integrations just as in the 
zero temperature case. The result expressed in the frequency domain is 

$$i\Phi(Q, Q') = i\Phi_0(Q, Q') - (\pi\hbar^2)^{-1} \int_0^\infty \pi\hbar\, [2m\omega (e^{\beta\hbar\omega} - 1)]^{-1}\, \delta(\nu - \omega)\, |Q_\nu - Q'_\nu|^2\, d\nu \tag{4.42}$$


Thus, the influence phase is made up of two terms, the first of which is the effect 
of the oscillator at zero temperature. The second is recognized as having the 
same form which was found for an uncertain classical potential with a Gaussian 
distribution. Therefore, the effect of finite temperature is to introduce a noisy 
potential acting on Q at the frequency of the original oscillator. The power 
spectrum of the noise produced by the finite temperature is 


$$\phi(\nu) = \pi\hbar\, [2m\nu (e^{\beta\hbar\nu} - 1)]^{-1}\, \delta(\nu - \omega) \tag{4.43}$$


To indicate more clearly the relationship of φ(ν) to the characteristics of the 
linear system it is instructive to extend this expression to the case of a distribution 
of oscillators G(Ω) all at the same finite temperature. The resulting 
influence phase is 

$$i\Phi(Q, Q') = i \int_0^\infty \Phi_{0,\Omega}(Q, Q')\, G(\Omega)\, d\Omega - (\pi\hbar^2)^{-1} \int_0^\infty \pi\hbar\, G(\nu)\, [2\nu (e^{\beta\hbar\nu} - 1)]^{-1}\, |Q_\nu - Q'_\nu|^2\, d\nu \tag{4.44}$$

where in the distribution m has been set equal to unity. The first term is again 
the influence phase for zero temperature, while the second term again has the 
form of a noisy potential whose power spectrum is 

$$\phi(\nu) = (\hbar\pi/2)\, G(\nu)\, [\nu (e^{\beta\hbar\nu} - 1)]^{-1}$$

Recalling the analysis of the distribution of oscillators, it is found from Eq. 
(4.18) that $\pi G(\nu)/2 = \operatorname{Re}(1/Z_\nu)$. Therefore, the power spectrum can be written 


o(v) = h Re (1/Z,) [o(e™ — 1)]* (4.45) 


(4.44) 
In the time domain the influence phase is 


$$i\Phi = i\Phi_0 - (\pi\hbar^2)^{-1}\int_0^\infty d\nu\,G(\nu)\,\hbar\pi[\nu(e^{\beta\hbar\nu} - 1)]^{-1}\int_{-\infty}^{\infty}\int_{-\infty}^{t}(Q_t - Q'_t)(Q_s - Q'_s)\cos\nu(t - s)\,ds\,dt \tag{4.46}$$


Comparing this with Eq. (3.6) for random classical forces we see that the 
correlation function of the noise due to the finite temperature is 


$$R(t - s) = \int_0^\infty \hbar G(\nu)[\nu(e^{\beta\hbar\nu} - 1)]^{-1}\cos\nu(t - s)\,d\nu = -(2/\pi)\int_0^\infty \hbar\,\mathrm{Im}\,(i\nu Z_\nu)^{-1}(e^{\beta\hbar\nu} - 1)^{-1}\cos\nu(t - s)\,d\nu \tag{4.47}$$


Finally, if we write Eq. (4.46) in terms of $F(t - s)$ as, for instance, in Eq. 
(4.4), we find that 


$$F(t) = \int_0^\infty [G(\nu)/\nu]\coth(\beta\hbar\nu/2)\cos\nu t\,d\nu + i\int_0^\infty [G(\nu)/\nu]\sin\nu t\,d\nu$$


Thus, 


$$A(t) = -(2/\pi)\int_0^\infty \mathrm{Im}\,(i\nu Z_\nu)^{-1}\coth(\beta\hbar\nu/2)\cos\nu t\,d\nu = -(2/\pi)\int_0^\infty \mathrm{Im}\,(i\nu Z_\nu)^{-1}[1 + 2(e^{\beta\hbar\nu} - 1)^{-1}]\cos\nu t\,d\nu \tag{4.48}$$


and 


$$B(t) = -(2/\pi)\int_0^\infty \mathrm{Im}\,(i\nu Z_\nu)^{-1}\sin\nu t\,d\nu$$


which are the more general counterparts of Eq. (4.5b). Notice, however, that 
only the relation for $A(t)$ changes with temperature. 
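
The two forms of $A(t)$ in Eq. (4.48) differ only by a hyperbolic identity. As a short check, writing $x$ for $\beta\hbar\nu$ (a shorthand introduced here for illustration only):

```latex
\coth\frac{x}{2}
  = \frac{e^{x/2} + e^{-x/2}}{e^{x/2} - e^{-x/2}}
  = \frac{e^{x} + 1}{e^{x} - 1}
  = 1 + \frac{2}{e^{x} - 1},
\qquad x = \beta\hbar\nu .
```

The constant 1 reproduces the zero temperature part of $A(t)$, while the term $2(e^{\beta\hbar\nu} - 1)^{-1}$ carries the whole temperature dependence and vanishes as $\beta \to \infty$.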

Thus, if a linear interaction system is initially at a finite temperature, the 
resulting effect is the same, as far as the test system is concerned, as if the linear 
system were at zero temperature and, in addition, a random classical potential 
were connected independently to the test system. The power spectrum of the 
random potential is given by Eq. (4.45) and is related both to the temperature 
and to the dissipative part of the impedance of the linear system. The theorem 
is stated in terms of a diagram in Fig. 6 where the power spectrum $\phi(\nu)$ of the 
random force $C(t)$ is defined by 


$$\langle C_\nu C_{\nu'}\rangle = 4\pi\,\phi(\nu)\,\delta(\nu + \nu') \tag{4.49}$$


This fluctuation dissipation theorem has a content which is different from those 
stated by Callen and Welton (11), Kubo (12), and others. It represents still 
another generalization of the Nyquist theorem which relates noise and resistance 
in electric circuits (13). These previously stated fluctuation dissipation theorems 
related the fluctuations of some variable in an isolated system, which is initially 
at thermal equilibrium, to the dissipative part of the impedance of the isolated 
system. This would be equivalent in our case to relating $\langle Q^2\rangle$ when the test system 
is at equilibrium ($F = 0$) to the dissipative part of $F_\nu/Q_\nu$ where $F$ is a classical 
force acting on $Q$ through a linear potential as shown in Fig. 7. However, we 
have shown that the effect of an external quantum system at thermal equilibrium 
on a test system can be separated into two effects, a zero point quantum term, 
which cannot be classified as pure noise, and a random potential term. Using 
this influence functional approach we can find $\langle Q^2\rangle$ due not only to the internal 
fluctuations of $Q$ but also due to the effect of $X$. 

Fig. 6. Equivalent influences of a linear system at a finite temperature acting on a test 
system. 

Fig. 7. Linear classical force F acting on a test system. 

The fact that all the derivations so far have been exact, which is a consequence 
of dealing only with systems made up of distributions of oscillators, brings up 
two interesting aspects of the theory. The first one is that $Z_\nu$ does not depend on 
the temperature, only on the distribution of oscillators. Yet in any real finite 
system, it is to be expected that the temperature of a system does affect its 
impedance. The second aspect is that when a force $f(t)$, of any magnitude 
whatever, is applied to the linear system, its temperature does not change although 
it is obvious that if $z_\nu$ has a finite real part the linear system must absorb energy. 
For example, we have shown that the effect on a test system of a linear system 
at a finite temperature acted on by a force $f(t)$ is the same as the effect of a 
linear system at zero temperature, a force dependent only on the temperature of 
the linear system, and the force $f(t)$ modified by the transfer characteristics of 
the linear system. Figure 8 shows the situation where $F_\nu = f_\nu/i\nu z_\nu$ and $C(t)$ is 
the random force with a power spectrum given in Eq. (4.49). The influence phase 
acting on $Q$ is 


$$i\Phi = i\Phi_0 + i(2\pi\hbar)^{-1}\int_{-\infty}^{\infty}(f_\nu/i\nu z_\nu)(Q_{-\nu} - Q'_{-\nu})\,d\nu - (\pi\hbar^2)^{-1}\int_0^\infty \phi(\nu)|Q_\nu - Q'_\nu|^2\,d\nu$$


As can be seen, the addition of the driving force $f(t)$ (which is denoted in the 
above expression by its Fourier transform $f_\nu$) does not change either the 
impedance characteristics or the temperature of the interaction system (which would 
be reflected as changes in $\Phi_0(Q, Q')$, $z_\nu$, and $\phi(\nu)$). 

Fig. 8. Equivalent influences of a linear system at a finite temperature and a force 
acting on Q. 

The fact that the impedance characteristics of the interaction system are not 
temperature dependent is a direct result of linearity. The incremental response 
of a perfectly linear system due to a driving force is independent of its initial 
state of motion or of that induced by other linearly coupled forces. Therefore, 
the random motion of the coordinate of the interaction system implied by a 
finite temperature does not affect its response to a driving force. 

To see the reason that the temperature of the linear interaction system has no 
dependence on the applied force $F(t)$, let us consider a linear system represented 
by a very large box whose dimensions we will allow to become infinite. In addi- 
tion, we assume that a test system is located in the box and is receiving portions 
of a classical (very large amplitude) electromagnetic wave which is being trans- 
mitted by an antenna also within the box. As long as the box is of finite size and 
has lossless walls, any signal transmitted by the antenna will find its way either 
to the test system or back to the antenna. Transmission of the signal will be 
effected through the modes of the cavity which have a discrete frequency 
distribution. This system exhibits no loss since each mode represents a lossless 
harmonic oscillator. For the rest of the discussion it is convenient to separate 
the energy transmitted by the antenna into parts which reach the test system 
directly or through reflection from the walls. As the dimensions of the box are 
allowed to become infinite (i.e., the distribution of oscillators describing the 
electromagnetic behavior of the box becomes continuous), the time required for 
the energy to reach the test system by reflection from the walls also becomes 
infinite. Therefore, since part of the energy is lost from the antenna and test 
systems, the box has, in effect, dissipation. However, because the volume of 
the box is infinite, its temperature is not changed by this lost energy. In other 
words, the box has an infinite specific heat. In addition, the energy transmitted 
by the antenna will generally have different characteristics from that of the 
Gaussian noise associated with temperature and even if the average background 
energy content of the box were changed it could not be properly described by a 
temperature parameter. 

All the results so far suggest that any linear system can be handled by the 
same rules that have been developed for systems of oscillators. This will be 
developed fully in the next section. However, we will assume this to be true now 
and conclude this section by applying the theorem just derived to obtain 
Nyquist’s result for noise from a resistor. Take as an example an arbitrary 
circuit as the test system connected to a resistor $R(\nu)$ at a temperature $T$ as 
shown in Fig. 9. The resistor comprises the interaction system. The interaction 
between the test system and the resistor is characterized by a charge Q(t) 
flowing through the test system and resistor and a voltage V(t) across the 
terminals. Let us associate Q(t) with the coordinate of the test system and V(t) 
with the coordinate of R(v). The interaction part of the Lagrangian is symboli- 
cally represented by —Q(t)V(t) since the current voltage relationship in R(v) 
is opposite to that of a generator. The quantity $i\nu Z_\nu$ appearing in the influence 
functional is given by 


$$i\nu Z_\nu = -[Q_\nu/V_\nu] = -(i\nu R_\nu)^{-1}$$

Thus, it follows that 

$$\mathrm{Re}\,(Z_\nu)^{-1} = \nu^2 R_\nu$$


Then the results of this section tell us that this situation, as shown in Fig. 9, 
may be replaced by a resistor at zero temperature (i.e., a resistor with thermal 
fluctuations appropriate to zero temperature but with the same magnitude of 
resistance it has at temperature $T$) and a random classical voltage whose power 
spectrum is 

$$\phi(\nu) = \hbar\nu R_\nu(e^{\beta\hbar\nu} - 1)^{-1} \tag{4.50}$$

as shown in Fig. 10. 


Fig. 9. Test system acted on by a linear system represented by an electrical resistance R 
at a finite temperature. 


Fig. 10. Equivalent interaction system of a resistor at finite temperature acting on the 
test system Q. 


The mean square value of this voltage is 


$$\langle v^2(t)\rangle = (2/\pi)\int_0^\infty \phi(\nu)\,d\nu = \int_0^\infty 2\hbar\nu R_\nu[\pi(e^{\beta\hbar\nu} - 1)]^{-1}\,d\nu \tag{4.51}$$


For high temperature $\beta \ll 1$, and we find that over the frequency range where 
$\beta\hbar\nu \ll 1$, $\phi(\nu) = kTR_\nu$. If $R_\nu$ is constant over the frequency range of interest 
whose limits are $\nu_1$ and $\nu_2$, then as a result of the noise power in this range 


$$\langle v^2(t)\rangle = 4kTR(f_2 - f_1) \tag{4.52}$$


where $f = \nu/2\pi$ is the frequency in cycles per second. This is the famous Nyquist result for noise 
from a resistor. Notice that the noise voltage generator must be placed in series 
with the resistor as a consequence of interacting directly with the coordinate of 
the test system, which in this case is Q(t), the charge flowing in the circuit. If 
the coordinates were chosen such that Q(t) were the coordinate of the resistor 
and V(t) that of the test system, then the noise generator would become a current 
source interacting with the voltage V(t) so that the situation would be as shown 
in Fig. 11, where $\langle i^2\rangle = (4kT/R)(f_2 - f_1)$ in the high temperature limit. 
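
The passage from Eq. (4.50) to the Nyquist form (4.52) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the resistance, temperature, and band limits are arbitrary illustrative values, `power_spectrum` and `mean_square_voltage` are hypothetical helper names, and $R_\nu$ is assumed constant over the band.

```python
import math

HBAR = 1.054571817e-34  # J s
K_B = 1.380649e-23      # J / K


def power_spectrum(nu, resistance, temperature):
    """phi(nu) = hbar*nu*R / (exp(beta*hbar*nu) - 1), Eq. (4.50), with constant R."""
    x = HBAR * nu / (K_B * temperature)
    return HBAR * nu * resistance / math.expm1(x)


def mean_square_voltage(nu1, nu2, resistance, temperature, steps=20000):
    """(2/pi) * integral of phi(nu) over [nu1, nu2], Eq. (4.51), by the midpoint rule."""
    h = (nu2 - nu1) / steps
    total = sum(
        power_spectrum(nu1 + (k + 0.5) * h, resistance, temperature)
        for k in range(steps)
    )
    return (2.0 / math.pi) * total * h


# Illustrative values (assumptions): 1 kOhm resistor at 290 K, band 1 MHz to 2 MHz.
R, T = 1.0e3, 290.0
f1, f2 = 1.0e6, 2.0e6

v2_quantum = mean_square_voltage(2 * math.pi * f1, 2 * math.pi * f2, R, T)
v2_nyquist = 4.0 * K_B * T * R * (f2 - f1)  # Eq. (4.52)

print(v2_quantum, v2_nyquist)
```

At these frequencies $\beta\hbar\nu$ is of order $10^{-7}$, so the two numbers should agree to a similar relative accuracy; the difference grows once $\hbar\nu$ becomes comparable to $kT$.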

It is worth mentioning, but obvious from the derivation, that if there were 
many sources of dissipation coupled to the test system, each at different tem- 
peratures, then there would be a fluctuating potential associated with each 
source of dissipation with a power spectrum characteristic of the temperature 
involved. The case of different temperatures represents a nonequilibrium con- 
dition in that the hot resistors are always giving up energy to the cold ones. 
However, when the resistors are represented by continuous distributions of 
oscillators as in the case of an infinite box (free space), the temperatures do not 
change because of the infinite specific heat of the ensemble of oscillators. 


Fig. 11. Equivalent interaction system of a resistor at finite temperature when the 
coordinate associated with the resistor is the charge flowing through it. 


V. WEAKLY COUPLED SYSTEMS 


We are now faced with the problem of finding influence functionals whose 
behavior is in some sense linear but whose total behavior is not representable by 
systems of perfect oscillators. There are many examples of this. The concepts of 
resistance, electric and magnetic polarizations, etc. are basic quantities which 
characterize the classical electrical behavior of matter. However, for an accurate 
description of this behavior, these quantities can be constant, independent of 
the applied electromagnetic field only in the range of approximation that the 
magnitude of this applied field does not become too large. We will consider here 
the analysis of such systems and the approximations which make linear analysis 
valid. Insofar as linear behavior is obtained, the results of this section are basi- 
cally the same as those obtained in previous sections with regard to finding in- 
fluence functionals. However, it is interesting to see the same principles come out 
of the analysis another way. In addition, it will be found that expressions cor- 
responding to Z, which appear in the influence functional are actually closed 
forms which can be used to compute such quantities as the conductivity from a 
knowledge of the unperturbed quantum characteristics of a system. These ex- 
pressions have been derived before by several authors but it is interesting to 
find that they also appear in the influence functional quite naturally. The results 
will then be applied to the case of a beam of non-interacting particles passing near 
a test system such as a cavity. This analysis naturally lends itself to a discussion 
of noise in beam-type maser amplifiers. In Appendix III, the results of this and 
Section IV are used to compute the spontaneous emission of a particle in a 
cavity. 

A. INTERACTION SYSTEMS WITH COUPLING POTENTIALS OF THE FORM $-V(Q)U(P)$ 

The specific result to be shown here can be stated as follows: If a general 
interaction system, $P$, is coupled to a test system $Q$ so that the interaction 
potential is small and of the form $-V(Q)U(P)$, then the effect on the test system 
is that of a sum of oscillators whose frequencies correspond to the possible 
transitions of the interaction system. Therefore, to the extent that second order 
perturbation theory yields sufficient accuracy, the effect of an interaction system 
is that of a linear system. 

To show this, we shall first assume the interaction system to be in an eigenstate 
$\phi_a(P_\tau)\exp\{-(i/\hbar)E_a\tau\}$ at the beginning of the interaction and in an arbitrary 
state at the end of the interaction consistent with the usual procedure we have 
followed. Also, for convenience in writing, the interaction potential will be 
assumed $-U(P)Q$. The influence functional is then 


$$\mathcal{F}(Q, Q') = \int \delta(P_T - P'_T)\exp\Big\{(i/\hbar)\Big[S(P) - S(P') + \int_\tau^T (QU - Q'U')\,dt\Big]\Big\}\phi_a^*(P'_\tau)\phi_a(P_\tau)\,dP_\tau \cdots \mathcal{D}P'(t) \tag{5.1}$$


where in the above we have written $U$ for $U(P)$ and $U'$ for $U(P')$. Since the 
magnitude of the interaction is assumed to be small, the perturbation approach 
can be used to good advantage. Thus, expanding the interaction part of the 
exponent and keeping terms to second order in $Q$ only, 


$$\mathcal{F}(Q, Q') = \int \delta(P_T - P'_T)\exp\{(i/\hbar)[S(P) - S(P')]\}\Big\{1 + (i/\hbar)\int_\tau^T (QU - Q'U')\,dt + (i/\hbar)^2\int_\tau^T\!\!\int_\tau^t (Q_tU_t - Q'_tU'_t)(Q_sU_s - Q'_sU'_s)\,ds\,dt\Big\}\phi_a^*(P'_\tau)\phi_a(P_\tau)\,dP_\tau \cdots \mathcal{D}P'(t) \tag{5.2}$$
The evaluation of this is done in an entirely straightforward manner and the 
following is obtained,¹⁸ 


$$\mathcal{F}(Q, Q') = 1 + (iU_{aa}/\hbar)\int_\tau^T (Q_t - Q'_t)\,dt - \sum_b |U_{ba}/\hbar|^2\int_\tau^T\!\!\int_\tau^t (Q_t - Q'_t)\{Q_s\exp[-i\omega_{ba}(t - s)] - Q'_s\exp[i\omega_{ba}(t - s)]\}\,ds\,dt \tag{5.3}$$


where 


$$U_{ba} = \int \phi_b^*(P)\,U\,\phi_a(P)\,dP \quad\text{and}\quad \omega_{ba} = (E_b - E_a)/\hbar$$


All terms which involve $U_{aa}$ will be disregarded. This is because $U_{aa}$ represents 
the average value of the operator $U(P)$ in an eigenstate of the interaction system 
alone. Even if it is not zero, it will be noted from Eq. (5.2) that terms involving 
$U_{aa}$ can be written 


$$1 + (iU_{aa}/\hbar)\int_\tau^T (Q_t - Q'_t)\,dt + (1/2!)\Big[(iU_{aa}/\hbar)\int_\tau^T (Q_t - Q'_t)\,dt\Big]^2 + \cdots = \exp\Big\{(i/\hbar)\int_\tau^T U_{aa}(Q_t - Q'_t)\,dt\Big\}$$


When $\mathcal{F}(Q, Q')$ is used to make a calculation on the test system, this term has 
the effect of adding a constant potential $V(Q(t)) = -U_{aa}Q(t)$ to the 
unperturbed test system. Disregarding $U_{aa}$, the influence functional then becomes 

$$\mathcal{F}(Q, Q') = 1 + i\Phi_a(Q, Q') \tag{5.4}$$

where 


$$i\Phi_a(Q, Q') = -(2\hbar)^{-1}\int_{-\infty}^{\infty}\int_{-\infty}^{t}\gamma_t\gamma_s(Q_t - Q'_t)[Q_sF_a^*(t - s) - Q'_sF_a(t - s)]\,ds\,dt$$


and 

$$F_a(t - s) = \sum_b (2|U_{ba}|^2/\hbar)\exp[i\omega_{ba}(t - s)]$$

where the limits have been extended and the factors $\gamma_t\gamma_s$ inserted to allow finite 
coupling time if necessary. If the strength of coupling is sufficiently weak then 
Eq. (5.4) can be rewritten 


$$\mathcal{F}(Q, Q') \approx \exp\,(i\Phi_a) \tag{5.5}$$


In this form we recognize $\mathcal{F}(Q, Q')$ as that describing the effect of a sum of 
harmonic oscillators independently connected to the test system each of whose 
“weights” is $U_{ab}U_{ba}/\hbar$. The complete response function for the system of 
oscillators is 


$$B_a(t - s) = \mathrm{Im}\,F_a(t - s) = \sum_b (2|U_{ba}|^2/\hbar)\sin\omega_{ba}(t - s)$$


where the subscript $a$ on $B_a$ refers to the initial eigenstate. According to previous 
definition, the mass of each individual oscillator is identified by $m = \hbar(2|U_{ba}|^2\omega_{ba})^{-1}$ 
and its characteristic frequency by $\omega_{ba}$. Therefore, to the extent that second 
order perturbation theory yields satisfactory accuracy, any system may be 
considered as a sum of harmonic oscillators. This is equivalent in classical 
mechanics to the theory of the motion of a particle having small displacements 
around an equilibrium position. Its motion, to a first approximation, is also 
that of a harmonic oscillator, if the first effective term in a power series expansion 
of the potential around that equilibrium position is quadratic in the 
displacement. 

18 A typical term in Eq. (5.2) is as follows: 

$$\int \delta(P_T - P'_T)\exp\{(i/\hbar)[S(P) - S(P')]\}(i/\hbar)^2\int_\tau^T\!\!\int_\tau^t Q_tU_tQ_sU_s\,ds\,dt\;\phi_a^*(P'_\tau)\phi_a(P_\tau)\,dP_\tau \cdots \mathcal{D}P'(t)$$

Taking the time integrations outside the path integral and replacing the path integrations 
by propagating kernels, this becomes 

$$(i/\hbar)^2\int_\tau^T\!\!\int_\tau^t Q_tQ_s\,ds\,dt\int \delta(P_T - P'_T)K^*(P'_T, P'_\tau)K(P_T, P_t)\,U_t\,K(P_t, P_s)\,U_s\,K(P_s, P_\tau)\,\phi_a^*(P'_\tau)\phi_a(P_\tau)\,dP_\tau \cdots dP'_T$$

Remembering that $K(P_T, P_t) = \sum_n \phi_n(P_T)\phi_n^*(P_t)\exp\{-iE_n(T - t)/\hbar\}$, this expression 
becomes simply, 

$$-\sum_b |U_{ba}/\hbar|^2\int_\tau^T\!\!\int_\tau^t Q_tQ_s\exp[-i\omega_{ba}(t - s)]\,ds\,dt$$
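
The identification of each transition with an equivalent oscillator can be made concrete with a few lines of code. This is only an illustrative sketch: the level energies and matrix elements are made-up numbers, units with $\hbar = 1$ are assumed, and `effective_oscillators` and `response` are hypothetical helper names.

```python
import math

HBAR = 1.0  # natural units (an assumption made for this illustration)


def effective_oscillators(energies, couplings, a):
    """Return (omega_ba, mass) for every state b coupled to the initial state a.

    energies[n]  : unperturbed energy E_n
    couplings[b] : magnitude |U_ba| of the matrix element between a and b
    The mass follows m = hbar / (2 |U_ba|^2 omega_ba) as stated in the text.
    """
    oscillators = []
    for b, e_b in enumerate(energies):
        if b == a or couplings[b] == 0.0:
            continue
        omega = (e_b - energies[a]) / HBAR
        mass = HBAR / (2.0 * couplings[b] ** 2 * omega)
        oscillators.append((omega, mass))
    return oscillators


def response(t, oscillators):
    """Classical impulse response of the oscillator set: sum of sin(w t) / (m w)."""
    return sum(math.sin(w * t) / (m * w) for w, m in oscillators)


# Made-up three-level example with the system initially in its lowest state.
energies = [0.0, 1.0, 3.0]
couplings = [0.0, 0.5, 0.2]
oscs = effective_oscillators(energies, couplings, a=0)

t = 0.7
direct = sum(
    (2.0 * couplings[b] ** 2 / HBAR) * math.sin((energies[b] - energies[0]) * t / HBAR)
    for b in (1, 2)
)
print(oscs, response(t, oscs), direct)
```

The oscillator response agrees with the sum $\sum_b (2|U_{ba}|^2/\hbar)\sin\omega_{ba}t$ by construction. If the initial state is not the lowest one, some $\omega_{ba}$ are negative and the corresponding effective masses come out negative as well.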

In this part of the discussion we should again point out the motivation for 
writing the approximate influence functional, Eq. (5.4), in terms of the approxi- 
mate exponential of Eq. (5.5), apart from the obvious advantage of making the 
form agree with that of exactly linear systems. Frequently, we deal with a test 
system which is influenced by another system which is actually made up of a 
large number of very small systems. Examples of such an interaction system 
would be a beam of atoms or the electrons in a metal. Although the expression 
for the influence functional for any one of the subsystems is only good to second 
order, their individual effects are so small that this accuracy is very good and 
either Eq. (5.4) or Eq. (5.5) is equally valid. However, when the sum of the effects of the 
subsystems is not small, then the two forms above do not describe the situation 
equally insofar as the composite effect of the interaction system is concerned. 
We know that when these subsystems are dynamically and statistically inde- 
pendent, the total influence functional is simply a product of the individual ones. 
In such a case the influence functional obtained by using Eq. (5.5) as follows 


$$\mathcal{F}(Q, Q') = \exp\Big\{\sum_k i\Phi_k(Q, Q')\Big\}$$


yields much greater accuracy than that obtained from Eq. (5.4) where we would 
find 


$$\mathcal{F}(Q, Q') \approx 1 + \sum_k i\Phi_k(Q, Q')$$

$\Phi_k(Q, Q')$ being the influence phase for the kth subsystem (see Appendix IV). 
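
The difference in accuracy between the two forms is easy to see numerically. In the sketch below, which is only an illustration, every subsystem is assigned the same small complex number in place of $i\Phi_k$ (an arbitrary choice, as is the number of subsystems), and the product of the individual second order functionals is taken as the reference value.

```python
import cmath

# Assumed illustrative values: N identical, statistically independent subsystems,
# each contributing the same small complex number x in place of i*Phi_k.
N = 10_000
x = complex(-1.0e-4, 2.0e-4)

# Reference: product of the individual functionals (1 + i*Phi_k).
reference = 1.0 + 0.0j
for _ in range(N):
    reference *= (1.0 + x)

exponential_form = cmath.exp(N * x)  # exp{sum_k i*Phi_k}
linear_form = 1.0 + N * x            # 1 + sum_k i*Phi_k

err_exponential = abs(exponential_form - reference) / abs(reference)
err_linear = abs(linear_form - reference) / abs(reference)

print(err_exponential, err_linear)
```

Each factor is individually very close to unity, yet the summed phase is of order one. The exponential form then stays within a fraction of a percent of the product, while the linearized form is wrong by more than the magnitude of the result itself.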
The expression given by Eq. (5.5) has additional implications which are not 
immediately apparent from the analogy with the harmonic oscillator. No 
assumption was made as to whether the initial state was necessarily the lowest 
of the system. Therefore, $\omega_{ba}$ could be either positive or negative. Suppose for a 
moment that the interaction system has only two states ($\phi_a(P)$ and $\phi_b(P)$) and 
that the initial state, $a$, is the lower one ($\omega_{ba} > 0$). It is obvious that the only 
effect it can have on the test system $Q$ is to absorb energy from $Q$. However, if 
the initial state, $a$, is the upper state (corresponding to $\omega_{ba} < 0$) then the 
interaction system can only give up energy to the test system. It can do this in two 
ways, through spontaneous emission or through coherent emission due to some 
coherent driving force exerted on it by the test system. So for the case $\omega_{ba} > 0$ 
we expect the influence functional to show that the interaction system has the 
effect of a cold system characterized by a dissipative impedance (or positive 
resistance). Conversely, for $\omega_{ba} < 0$ it is expected that $\mathcal{F}(Q, Q')$ will be 
characterized by a negative resistance and a random potential due to the spontaneous 
emission transitions. This situation is made more obvious if we translate Eq. 
(5.5) into transform notation. Thus, 


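$$i\Phi(Q, Q') = i(2\pi\hbar)^{-1}\int_0^\infty\left[\frac{Q'_\nu(Q_{-\nu} - Q'_{-\nu})}{i\nu Z_{ba,\nu}} + \frac{Q_{-\nu}(Q_\nu - Q'_\nu)}{-i\nu Z_{ba,-\nu}}\right]d\nu - (\pi\hbar^2)^{-1}\int_0^\infty |Q_\nu - Q'_\nu|^2\,\pi|U_{ba}|^2\,\delta(\nu + \omega_{ba})\,d\nu \tag{5.6}$$


where 

$$i\nu Z_{ba,\nu} = -\hbar(2\omega_{ba}|U_{ba}|^2)^{-1}[(\nu - i\epsilon)^2 - \omega_{ba}^2] \tag{5.7}$$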


First of all we notice that the sign of $Z_{ba,\nu}$ changes with that of $\omega_{ba}$ and therefore 
its dissipative part can be positive or negative as was argued above. Secondly, 
for $\omega_{ba} < 0$ there is a random potential acting on $Q$ whose power spectrum is 
given by $\phi(\nu) = \pi|U_{ba}|^2\delta(\nu + \omega_{ba})$. Of course when $\omega_{ba} > 0$, the integral 
involving this term disappears indicating the noise potential does not exist and 
the effect is the same as that of a harmonic oscillator initially in the ground 
state with $1/2m\omega$ identified with $|U_{ba}|^2/\hbar$. 

In a real physical situation it is not likely that the interaction system will be 
in a definite state initially. So to extend the above results to a more general case 
we assume that the initial state is described as a sum over states weighted by a 
density matrix $\rho(\tau)$ which is diagonal in the energy eigenstates of the 
system. For example, if the system is initially in temperature equilibrium, 
$\rho = e^{-\beta H}/\mathrm{Tr}\,(e^{-\beta H})$ where $H$ is the Hamiltonian operator such that 
$\rho_{mn} = \delta_{mn}\exp(-\beta E_n)/\sum_n \exp(-\beta E_n)$. The influence functional becomes 


$$\mathcal{F}(Q, Q') = \int \delta(P_T - P'_T)\exp\Big\{(i/\hbar)\Big[S(P) - S(P') + \int_\tau^T (UQ - U'Q')\,dt\Big]\Big\}\sum_a \rho_{aa}\phi_a^*(P'_\tau)\phi_a(P_\tau)\,dP_\tau \cdots \mathcal{D}P'(t) \tag{5.8}$$
Within the limit of small coupling, then we can simply extend the influence 
phase of Eq. (5.5) by summing over all initial states weighted by the initial 
density matrix $\rho_{aa}$. If this is done, we obtain the usual form of the influence 
phase, Eq. (5.4), with a response function $B(t - s)$ given by 


$$B(t - s) = \sum_{a,b}(2\rho_{aa}|U_{ba}|^2/\hbar)\sin\omega_{ba}(t - s) \tag{5.9}$$


Again $\Phi(Q, Q')$ is the phase for a sum of oscillators, each of whose weights is 
$\rho_{aa}|U_{ba}|^2/\hbar$.¹⁹ In Section IV it was shown that $B(t - s)$ was the classical response 
of the linear system to an impulse of force applied to the coordinate $U(P)$. 
However, the expression above is in a form which is familiar to us only when we 
think of the interaction system, $P$, as consisting of a sum of oscillators. In 
this connection $B(t - s)$ is the total classical response of the oscillators describing 
the system $P$ to an impulse of force applied to the “coordinate” $U(P)$.²⁰ To 
obtain a more direct interpretation of Eq. (5.9) we now calculate the linear 
classical response $B(t - s)$ of the interaction system to an applied impulse of 
force in terms of its unperturbed quantum characteristics. In so doing we will 
show that Eq. (5.9) is indeed this expression. Therefore, we will again have the 
result that the influence functional for a general, linear interaction system is 
formed simply from a knowledge of its classical characteristics just as in the case 
of systems of perfect oscillators. By the “classical characteristics” of the quantum 
mechanical system we mean the expected value of $U(P)$ as a function of time 
after a potential $-f(t)U(P)$ is applied starting at $t = 0$. Thus 


$$\langle U(P_t)\rangle = \int \psi^*(P_t)\,U(P_t)\,\psi(P_t)\,dP_t \tag{5.10}$$


where $\psi(P_t)$ represents the state of system $P$ at $t$. Using the path integral 
representation for the development of a wave function with time, as outlined in 
Section II, this can be written 


$$\langle U(P_t)\rangle = \int U_t\,\delta(P_t - P'_t)\exp\Big\{(i/\hbar)\Big[S_0(P) - S_0(P') + \int_0^t f(s)(U_s - U'_s)\,ds\Big]\Big\}\psi^*(P'_0)\psi(P_0)\,dP_0 \cdots \mathcal{D}P'(t)$$

Alternatively, this expression can evidently be written 
< 
of 


19 It is interesting to notice that the relative populations of any two levels may be 
described by an effective temperature $T_e = 1/k\beta_e$. For instance if the probabilities of 
occupation of states $a$ and $b$ are $\rho_{aa}$ and $\rho_{bb}$ respectively, we use the definition 
$\rho_{aa}/\rho_{bb} = \exp\beta_e(E_b - E_a)$. If $\rho_{bb} = 0$ this is described by setting $T_e = 0+$, meaning 
to approach zero from the positive side. Similarly, if the two states were inverted 
$\rho_{aa} = 0$ and $T_e = 0-$. This device has been used widely in the description of such 
situations. 

20 $U(P)$ may be regarded as a coordinate which is a function of other coordinates $P$ in 
terms of which we choose to describe the interaction system. 


(U(P1)) = -tha Ff 5 F's) ler 
The initial definite state $\psi^*(P'_0)\psi(P_0)$ will be replaced by an average state 
described by the following density matrix diagonal in the energy representation, 
$\sum_a \rho_{aa}\phi_a^*(P'_0)\phi_a(P_0)$, so that 


$$\psi^*(P'_0)\psi(P_0) = \sum_a \rho_{aa}\phi_a^*(P'_0)\phi_a(P_0)$$

Assuming $f(s)$ to be small in magnitude, Eq. (5.10) can be written to first order 

$$\langle U_t\rangle = \sum_a \rho_{aa}\int U_t\,\delta(P_t - P'_t)\exp\{(i/\hbar)[S_0(P) - S_0(P')]\}\Big\{1 + (i/\hbar)\int_0^t f_sU_s\,ds - (i/\hbar)\int_0^t f_sU'_s\,ds\Big\}\phi_a^*(P'_0)\phi_a(P_0)\,dP_0 \cdots \mathcal{D}P'(t)$$

$$= \sum_a \rho_{aa}\Big\{U_{aa} + \sum_b (i/\hbar)|U_{ab}|^2\int_0^t f_s\exp[-i\omega_{ba}(t - s)]\,ds - \sum_b (i/\hbar)|U_{ab}|^2\int_0^t f_s\exp[i\omega_{ba}(t - s)]\,ds\Big\} \tag{5.11}$$

Again assuming $U_{aa} = 0$, Eq. (5.11) becomes 

$$\langle U_t\rangle = \sum_{a,b}(2\rho_{aa}|U_{ab}|^2/\hbar)\int_0^t f_s\sin\omega_{ba}(t - s)\,ds \tag{5.12}$$

For $f(s) = \delta(s)$, then, the classical response function is²¹ 

$$B(t) = \sum_{a,b}(2\rho_{aa}|U_{ba}|^2/\hbar)\sin\omega_{ba}t \tag{5.13}$$


which is identical with the response function found in the influence functional. 
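
The agreement between the perturbative response function and the actual quantum evolution can be checked on the simplest possible case. The sketch below is an illustration under stated assumptions: a two-level system with a real off-diagonal matrix element `u`, units with $\hbar = 1$, and an impulsive potential $-\epsilon\,\delta(t)\,U$, so that the state just after the kick is $\exp(i\epsilon U/\hbar)$ applied to the initial state. All numerical values and function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # natural units (an assumption made for this illustration)


def expectation_after_kick(eps, u, e_a, e_b, rho_aa, t):
    """Exact <U> at time t after the impulsive potential -eps*delta(t)*U.

    U has zero diagonal elements and a real off-diagonal element u, so
    exp(i*eps*U/hbar) = cos(theta) + i*sin(theta)*sigma_x with theta = eps*u/hbar.
    The initial state is a diagonal mixture with weights rho_aa and 1 - rho_aa.
    """
    theta = eps * u / HBAR
    total = 0.0
    for weight, starts_in_a in ((rho_aa, True), (1.0 - rho_aa, False)):
        if starts_in_a:
            psi_a, psi_b = math.cos(theta), 1j * math.sin(theta)
        else:
            psi_a, psi_b = 1j * math.sin(theta), math.cos(theta)
        # Free evolution of each energy eigencomponent.
        psi_a_t = psi_a * cmath.exp(-1j * e_a * t / HBAR)
        psi_b_t = psi_b * cmath.exp(-1j * e_b * t / HBAR)
        total += weight * 2.0 * u * (psi_a_t.conjugate() * psi_b_t).real
    return total


def response_function(u, e_a, e_b, rho_aa, t):
    """B(t) of Eq. (5.13) for two levels: both orderings (a,b) and (b,a) are summed."""
    omega_ba = (e_b - e_a) / HBAR
    rho_bb = 1.0 - rho_aa
    return (2.0 * u * u / HBAR) * (rho_aa - rho_bb) * math.sin(omega_ba * t)


eps, u, e_a, e_b, rho_aa, t = 1.0e-4, 0.7, 0.0, 1.3, 0.8, 2.1
exact = expectation_after_kick(eps, u, e_a, e_b, rho_aa, t)
linear = eps * response_function(u, e_a, e_b, rho_aa, t)
print(exact, linear)
```

For a weak kick the exact expectation value and $\epsilon B(t)$ differ only at third order in $\epsilon$. Setting `rho_aa` below one half reverses the sign of the response, which is the negative resistance behavior discussed earlier for an inverted population.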
Since we have found expressions identifiable as classical response functions 

and impedances, it remains to show that associated with the dissipative part of 

the impedance is a noise potential. The impedance is simply obtained from²² 


$$(i\nu Z_\nu)^{-1} = \int_0^\infty B(t)e^{-i\nu t}\,dt = \sum_{a,b}(-2\rho_{aa}\omega_{ba}|U_{ba}|^2/\hbar)[(\nu - i\epsilon)^2 - \omega_{ba}^2]^{-1} \tag{5.14}$$


To obtain the power spectrum it is only necessary to sum the influence phase 
of Eq. (5.6) over all initial states weighted by $\rho_{aa}$. Thus we find 


$$\phi(\nu) = \sum_{a,b}\pi\rho_{aa}|U_{ba}|^2\,\delta(\nu + \omega_{ba}) \tag{5.15}$$


and we now wish to relate this to the real part of $1/Z_\nu$. From Eq. (5.14) it is 
found that 


$$\mathrm{Re}\,(Z_\nu)^{-1} = \sum_{a,b}(\pi\nu|U_{ba}|^2/\hbar)(\rho_{bb} - \rho_{aa})\,\delta(\nu + \omega_{ba}) \tag{5.16}$$


21 It should be noted that implicit in the use of first order perturbation theory to obtain 
Eq. (5.13) is the fact that for this relation for the response function to hold as a 
steady-state description of the linear system, the initial distribution must not be significantly 
disturbed by the application of the driving force. 

22 Expressions of this kind have been used by several authors to compute quantities such 
as the conductivity of materials. See, for instance, ref. 14. 
Rewriting $\phi(\nu)$ 

$$\phi(\nu) = \sum_{a,b}\pi|U_{ba}|^2(\rho_{bb} - \rho_{aa})[(\rho_{bb}/\rho_{aa}) - 1]^{-1}\,\delta(\nu + \omega_{ba}) \tag{5.17}$$


If the average initial state of the interaction system is one of temperature 
equilibrium, then $\rho_{aa} = e^{-\beta E_a}/\sum_n e^{-\beta E_n}$ and $\rho_{bb}/\rho_{aa} = e^{-\beta\hbar\omega_{ba}} = e^{\beta\hbar\nu}$. Taking 
advantage of the characteristics of $\delta(\nu + \omega_{ba})$ so that $\nu$ can replace $-\omega_{ba}$, from 
Eqs. (5.16) and (5.17) 


$$\phi(\nu) = \sum_{a,b}(\pi\nu/\hbar)|U_{ba}|^2(\rho_{bb} - \rho_{aa})\,\delta(\nu + \omega_{ba})\,\hbar[\nu(e^{\beta\hbar\nu} - 1)]^{-1} = \hbar\,\mathrm{Re}\,(Z_\nu)^{-1}[\nu(e^{\beta\hbar\nu} - 1)]^{-1} \tag{5.18}$$


Therefore, again we find that the power spectrum is related to the dissipative 
parts of the impedance when the initial state is one of temperature equilibrium.²³,²⁴ 
The power spectrum given in the form of Eq. (5.17) again illustrates the origin 
of thermal noise and identifies it as being just another aspect of spontaneous 
emission. Pound has also discussed this (15). The only contribution to the noise 
power spectrum is through the possible downward transitions of each possible 
state, $a$, weighted by the statistical factor, $\rho_{aa}$. 
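
The step from Eq. (5.15) to Eq. (5.18) rests on the Boltzmann ratio of the populations. As a minimal numerical sketch (the level spacing, temperature, and matrix element below are made-up values in units with $\hbar = 1$, and the function names are hypothetical), the coefficient of the delta function can be computed both ways for a single downward transition:

```python
import math

HBAR = 1.0  # natural units (an assumption made for this illustration)


def boltzmann_populations(energies, beta):
    """Diagonal density matrix elements exp(-beta*E_n) / sum_n exp(-beta*E_n)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def noise_weight_direct(rho_upper, u_sq):
    """Coefficient of delta(nu + omega_ba) in Eq. (5.15); a is the upper state."""
    return math.pi * rho_upper * u_sq


def noise_weight_from_impedance(rho_lower, rho_upper, u_sq, nu, beta):
    """Same coefficient obtained from Eq. (5.16) multiplied by the factor in Eq. (5.18)."""
    re_inv_z = (math.pi * nu * u_sq / HBAR) * (rho_lower - rho_upper)
    return HBAR * re_inv_z / (nu * math.expm1(beta * HBAR * nu))


energies = [0.0, 1.7]  # lower and upper level (illustrative)
beta = 0.9
u_sq = 0.3             # |U_ba|^2 (illustrative)
rho_lower, rho_upper = boltzmann_populations(energies, beta)
nu = (energies[1] - energies[0]) / HBAR  # nu = -omega_ba for the downward transition

w_direct = noise_weight_direct(rho_upper, u_sq)
w_impedance = noise_weight_from_impedance(rho_lower, rho_upper, u_sq, nu, beta)
print(w_direct, w_impedance)
```

The two weights coincide because $(\rho_{bb} - \rho_{aa})/(e^{\beta\hbar\nu} - 1) = \rho_{aa}$ when the populations are in thermal equilibrium, which is the content of Eq. (5.17).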


B. BEAM oF PARTICLES INTERACTING WITH A CAVITY 


In the cases just considered the interaction system provided a steady-state 
environment for the test system. In contrast, let us now examine the situation 
where the interaction system is made up of a large number of independent 
particles coupled to the test system at different times. As an example consider a 
beam of noninteracting, identical particles which interact weakly with a resonant 
cavity as might occur, for instance, in a gas maser. We assume that the beam is 
not necessarily in temperature equilibrium but that the initial state of the 
particles entering the cavity would be properly represented by a density matrix 
diagonal in the energy representation. Such a situation would occur if the beam 
were prepared by passing it through a beam separator whose function would be 
to eliminate certain particles from the beam depending on their energy levels. 
For the purposes of simplifying the analysis we assume the molecules to be two- 
level quantum systems and that before they enter the cavity all of them are in 
the lower state or all in the upper state. It is easy to extend these results to the 
case where the beam is mixed with a certain fraction in the upper state and a 
certain fraction in the lower state initially. Since the beam is assumed to be 
composed of noninteracting particles, we can consider the total beam as composed 
of two independent beams appropriate to the two possible initial states of the 
constituent particles. The influence phase for the complete beam is simply the 
sum of the influence phases for the two beams. In addition, we assume that the 
beam is characterized by a spatial density such that the number of particles 
passing a given point along the beam in a time $dt$ is $N\,dt$ and if $t_0$ is the time a 
molecule passes a reference point in the cavity, $\gamma(t - t_0)$ describes the coupling 
between the molecule and cavity. Thus, the beam is a univelocity beam. Again, 
in a real case where the beam is characterized by a distribution of velocities, the 
total beam may be split up into many univelocity beams. The total influence 
phase is simply a sum of those for each component beam. 

23 Notice that if $P$ were a two-level system initially in the lower state, then $\rho_{bb} = 0$ and 
$T_e = 0+$. In this case $\phi(\nu) = 0$ which agrees with the required result for $\nu > 0$ (Eq. (5.6) 
for $\omega_{ba} > 0$). If initially $P$ were in the upper state then $\rho_{aa} = 0$ and $T_e = 0-$ yielding 
$\phi(\nu) = -(\hbar/\nu)\,\mathrm{Re}\,(1/Z_\nu)$ agreeing with Eq. (5.6) for $\omega_{ba} < 0$. This is the power 
spectrum of the so-called spontaneous emission noise from an inverted two-level system. 

24 It may be disturbing that $\mathrm{Re}\,(1/Z_\nu)$ contains singular forms such as $\delta(\nu + \omega_{ba})$. 
However, the infinite sums over the distribution of states of which it is a coefficient can be 
replaced by integrals over densities of states in most practical situations and as part of an 
integrand $\delta(\nu + \omega_{ba})$ is not unrealistic. 

Let us call the coordinates of the cavity $Q$, representing the test system and the 
coordinates of a particle in the beam $P$. The interaction between the cavity and a 
particle is given by $L_I(Q, P) = \gamma(t - t_0)QP$. Under these circumstances the 
influence functional for the effect of the beam on the cavity can be written down 
immediately: 


$$i\Phi(Q, Q') = -N|P_{ab}/\hbar|^2\int_{-\infty}^{\infty}\int_{-\infty}^{t} ds\,dt\Big[\int_{-\infty}^{\infty}\gamma(t - t_0)\gamma(s - t_0)\,dt_0\Big](Q_t - Q'_t)(Q_se^{-i\omega_{ba}(t - s)} - Q'_se^{i\omega_{ba}(t - s)}) \tag{5.19}$$


The integral involving the coupling parameters can be used to define a new 
function $\Gamma$, 


$$\int_{-\infty}^{\infty}\gamma(t - t_0)\gamma(s - t_0)\,dt_0 = \int_{-\infty}^{\infty}\gamma(\xi)\gamma(\xi - t + s)\,d\xi = \Gamma(t - s) \tag{5.20}$$


Therefore, 

$$i\Phi(Q, Q') = -N|P_{ab}/\hbar|^2\int_{-\infty}^{\infty}\int_{-\infty}^{t}\Gamma(t - s)(Q_t - Q'_t)(Q_se^{-i\omega_{ba}(t - s)} - Q'_se^{i\omega_{ba}(t - s)})\,ds\,dt \tag{5.21}$$

From this we can identify the response function of the beam as 

$$B(t - s) = (2/\hbar)N|P_{ab}|^2\,\Gamma(t - s)\sin\omega_{ba}(t - s) \tag{5.22}$$


In transform notation the influence functional for the effect of the beam has the 
same form as has been previously derived with 


$$(i\nu Z_\nu)^{-1} = (i/\hbar)N|P_{ab}|^2(\Gamma_{\nu+\omega_{ba}} - \Gamma_{\nu-\omega_{ba}}) \tag{5.23}$$


where 


$$\Gamma_\nu = \int_{-\infty}^{\infty} 1(t)\,\Gamma(t)\,e^{-i\nu t}\,dt \tag{5.24}$$
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and with a power spectrum 
o(v) = 46N | Pa i Ree + | Per ew ae (5.25) 


Previously, we have found that for a linear system initially in the ground state
the power spectrum is zero over the range of positive $\nu$, thus indicating a zero
noise potential due to the linear system. However, $\phi(\nu)$ is not necessarily zero for
$\nu > 0$ in Eq. (5.25) in the case that the beam is initially in its lowest state
($\omega_{ba} > 0$). This is because no restrictions were placed on the time variation of
$\gamma(t - t_0)$. However, in practical situations the coupling between a cavity and a
particle in a beam passing through the cavity varies adiabatically so that for all
practical purposes $\phi(\nu)$ is really zero for $\nu > 0$. To point this out more clearly,
let us assume that $\gamma(t - t_0) = 1(t - t_0)\,1(t_0 + \tau_0 - t)$, that is, the coupling
is turned on at $t_0$ and off at $t_0 + \tau_0$, the time of transit in the cavity being $\tau_0$.
We find by evaluating Eq. (5.20) that

$$ \Gamma(t - s) = (s - t + \tau_0)\,1(s - t + \tau_0) $$

and from this we can find

$$ \Gamma_{\nu+\omega_{ba}} = \int_{-\infty}^{\infty} (\tau_0 - \tau)\,1(\tau_0 - \tau)\,1(\tau)\,\exp[-i(\nu + \omega_{ba})\tau]\,d\tau = \{1 - i\tau_0(\nu + \omega_{ba}) - \exp[-i(\nu + \omega_{ba})\tau_0]\}\,(\nu + \omega_{ba})^{-2} $$

Therefore, from Eq. (5.25),

$$ \phi(\nu) = \tfrac{1}{2}\,N\,|P_{ab}|^2\,\tau_0^2\,(\sin^2\theta/\theta^2) \tag{5.26} $$

where

$$ \theta = \tfrac{1}{2}(\nu + \omega_{ba})\,\tau_0 $$
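
The closed form for the transform of the triangular function and the step to the $\sin^2\theta/\theta^2$ shape can be checked numerically. The following is only a sketch: the function names and the sample values of $\nu + \omega_{ba}$ and $\tau_0$ are illustrative assumptions, not quantities taken from the paper.

```python
import cmath
import math


def gamma_closed_form(a, tau0):
    """Closed form quoted in the text: {1 - i*a*tau0 - exp(-i*a*tau0)} / a**2."""
    return (1 - 1j * a * tau0 - cmath.exp(-1j * a * tau0)) / a**2


def gamma_numeric(a, tau0, steps=20000):
    """Midpoint-rule estimate of the integral of (tau0 - tau)*exp(-i*a*tau) over 0..tau0."""
    h = tau0 / steps
    total = 0j
    for k in range(steps):
        tau = (k + 0.5) * h
        total += (tau0 - tau) * cmath.exp(-1j * a * tau)
    return total * h


def sinc_squared_form(a, tau0):
    """Real part written as (tau0**2 / 2) * sin(theta)**2 / theta**2 with theta = a*tau0/2."""
    theta = 0.5 * a * tau0
    return 0.5 * tau0**2 * (math.sin(theta) / theta) ** 2


# Arbitrary illustrative values standing in for (nu + omega_ba) and tau_0.
a, tau0 = 3.7, 2.0
exact = gamma_closed_form(a, tau0)
approx = gamma_numeric(a, tau0)
```

Since $\Gamma + \Gamma^* = 2\,\mathrm{Re}\,\Gamma$, the real part computed here is what carries Eq. (5.25) into the form of Eq. (5.26).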


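To find the effect of suddenly turning the coupling on and off we find the ratio
of the total noise power to the noise power for $\nu > 0$. This is given by

$$ \frac{\int_{-\infty}^{\infty} \phi(\nu)\,d\nu}{\int_{0}^{\infty} \phi(\nu)\,d\nu} = \frac{\int_{-\infty}^{\infty} (\sin^2\theta/\theta^2)\,d\theta}{\int_{\omega_{ba}\tau_0/2}^{\infty} (\sin^2\theta/\theta^2)\,d\theta} \approx \pi\,\omega_{ba}\,\tau_0 $$

For ammonia molecules at a temperature of T = 290°K, the average velocity
$v \approx 6 \times 10^4$ cm/sec. For a microwave cavity of 10 cm length $\tau_0 \approx 2 \times 10^{-4}$ sec.
For the 3-3 line of ammonia $\omega_{ba} \approx 1.5 \times 10^{11}$ rad/sec. For our case
then $(\pi\omega_{ba}\tau_0) \approx 10^8$. From this, we can conclude that even in this unfavorable
case of coupling time variation, the $\phi(\nu)$ is negligible for $\nu > 0$.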
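
The arithmetic behind this estimate can be reproduced in a few lines. This is a minimal sketch using the round numbers quoted above; the function names are illustrative, and the tail integral is checked at a modest lower limit (an assumed value of 50) rather than at the physical one, which is far too large to integrate directly.

```python
import math


def transit_time(cavity_length_cm, speed_cm_per_s):
    """Time a molecule spends in the cavity, tau_0 = L / v."""
    return cavity_length_cm / speed_cm_per_s


def total_to_positive_ratio(omega_ba, tau0):
    """Asymptotic ratio pi * omega_ba * tau_0 of total noise power to that at nu > 0."""
    return math.pi * omega_ba * tau0


def tail_integral(theta_min, periods=2000, steps_per_period=200):
    """Estimate the integral of sin^2(theta)/theta^2 from theta_min to infinity.

    The midpoint rule covers [theta_min, upper]; beyond `upper` the integrand
    is replaced by its average 1/(2*theta^2), which integrates to 1/(2*upper).
    """
    upper = theta_min + periods * math.pi
    n = periods * steps_per_period
    h = (upper - theta_min) / n
    total = 0.0
    for k in range(n):
        theta = theta_min + (k + 0.5) * h
        total += (math.sin(theta) / theta) ** 2
    return total * h + 1.0 / (2.0 * upper)


tau0 = transit_time(10.0, 6.0e4)                # about 1.7e-4 sec
ratio = total_to_positive_ratio(1.5e11, tau0)   # of order 10^8

# The denominator of the ratio behaves like 1/(2*theta_min) for large theta_min.
theta_min = 50.0
tail = tail_integral(theta_min)
```

With the numerator equal to $\pi$ and the denominator close to $1/(\omega_{ba}\tau_0)$, the ratio $\pi\omega_{ba}\tau_0$ follows.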

Examination of $\mathrm{Re}(1/Z_\nu)$, derived from Eq. (5.23), reveals terms of the
same form as those just discussed,

$$ \mathrm{Re}(1/Z_\nu) = \tfrac{1}{2}[(1/Z_\nu) + (1/Z_\nu^*)] = (\nu N/2\hbar)\,|P_{ab}|^2\,[\Gamma_{\nu-\omega_{ba}} + \Gamma^*_{\nu-\omega_{ba}} - \Gamma_{\nu+\omega_{ba}} - \Gamma^*_{\nu+\omega_{ba}}] \tag{5.27} $$

By the same type of argument as above, the terms in $\Gamma_{\nu-\omega_{ba}}$ can be neglected
when $\nu > 0$, $\omega_{ba} < 0$ and conversely, when $\nu > 0$, $\omega_{ba} > 0$ the terms in $\Gamma_{\nu+\omega_{ba}}$
can be neglected. Since this is the case, we can write

$$ \phi(\nu) \approx \hbar\,\mathrm{Re}(1/Z_\nu)\,[\nu(e^{\beta_b\hbar\omega_{ba}} - 1)]^{-1} \tag{5.28} $$

where $\beta_b = 1/kT_b$ describes the relative initial populations of the two states.
Therefore, we conclude that in most practical cases the power spectrum can be
written in the form of Eq. (5.28). In cases where the transients cannot be
neglected for very low $\omega_{ba}$ and very short transit times, however, $\phi(\nu)$ is not so
simply related to $\mathrm{Re}(1/Z_\nu)$ and must be written in the form given by Eq. (5.25).

It is to be noted that $\phi(\nu)$ of Eq. (5.28) is not precisely of the form for the
Nyquist relation because of the appearance of $e^{\beta\hbar\omega_{ba}}$ in the denominator rather
than $e^{\beta\hbar\nu}$. This is a consequence of the finite coupling time between each part of
the beam and the cavity which results in a nonequilibrium condition. If the
coupling times were infinite, the expression for $\mathrm{Re}(1/Z_\nu)$ would contain forms
such as $\delta(\nu + \omega_{ba})$, a situation discussed earlier, so that the Nyquist form then
results. However, when the coupling time is long as in masers,

$$ \Gamma_{\nu+\omega_{ba}} + \Gamma^*_{\nu+\omega_{ba}} \approx 2\pi\tau_0\,\delta(\nu + \omega_{ba}) $$

so that the true Nyquist relation may be used with negligible error.


VI. SOURCES OF NOISE IN MASERS 


Having developed the theory of linear systems in detail we are now in a posi- 
tion to discuss the sources of noise in linear quantum-mechanical devices such 
as maser amplifiers. The subject of maser noise has been explored by many 
authors (15-19). The details of the treatment given the subject differ, but the 
principles are essentially the same. The amplifiers are considered as operating at 
signal levels high enough (classical) that a signal entering a maser may be con- 
sidered as a group of photons whose number is large enough that the amplifica- 
tion process increases the signal in a continuous fashion. The sources of noise 
were found to be those derived from the thermal noise arising from the sources of 
dissipation, and those derived from spontaneous emission from the “active” 
quantum material. They go further and define an effective temperature of the 
active quantum system so that the noise it produces is related to the negative 
resistance of the active materials. Our analysis of linear systems has also shown 
that these same sources of noise exist. However, if the signal level entering the 
maser is very small such that its strength can be characterized by a few quanta 
per second, a serious question arises as to the nature of the signal out of the maser 
(here assumed to be at a classical level). An additional fluctuation of the output 
signal or “quantum” noise might be expected due solely to what might be termed
a “shot noise” effect created by each individual photon entering the cavity.
It has been shown by other authors (20, 21) that no such signal exists. We wish
to show how the same result follows from the theory developed here in an ex- 
tremely transparent way. We will show that the only fluctuations in output 
signal which are to be expected are those noise sources computed classically. 
No additional “shot noise” does, in fact, appear.

Let us suppose that we have a beam-type maser amplifier in which all partici- 
pating systems used meet the requirement of linearity. There may be one or 
more beams interacting with various electromagnetic resonators which can be 
coupled together in any way desired. The output of the maser is connected to a 
detector of some sort which perhaps consists of a resistor in which the current 
is to be measured. To the input of the maser system we now apply an incoming 
classical signal of large magnitude and of frequency w through an attenuator 
whose value of attenuation may be varied at will as shown in Fig. 12. In practice 
such a situation could arise if the classical wave originates from a distant antenna 
with a very large magnitude of output, so large that all quantum effects in the 
wave are effectively obscured. The long distance would then play the role of the 
attenuator. 

Now, if the classical wave were attenuated by a large amount so that only a 
few photons/sec were entering the maser, the only uncertainty in the signal in 
the output of the maser caused by the maser itself arises from those sources of 
noise which can be arrived at by a classical calculation of the characteristics of 
the maser. There is no extra quantum fluctuation introduced by the maser into 
the output signal due to the small number of quanta entering the maser. 

It is true that the amplitude of the signal output from the maser might itself 
be so small that it is still on a quantum level. In this case the detector output 
would be uncertain due to the inherently small magnitude of the signal from the 
maser. However, if this is the case, we may put as many amplifiers in series as 
necessary to bring the output signal back to a classical level. When this is done 
the signal applied to the detector consists of the original signal modified by the 
transfer characteristics of the maser system (and attenuator) and noise signals 
which arise from all the possible sources computed classically. The proof of this 
assertion is not difficult. We divide up the total system into a test system, here 
the detector, and an interaction system which consists of the maser, attenuator, 
and classical signal, $C(t)$.

[Figure: block diagram showing CLASSICAL WAVE, ATTENUATOR, MASER, and DETECTOR in sequence.]

Fig. 12. System in which a classical wave is attenuated to a very low level (a few photons/sec),
then amplified by a maser and detected.

Then, to find the effect of the interaction system on the
detector we need only to look at the influence functional. However, we already
know that it can be written as follows:


ty t - Q'(Q_. — Q’,) Q_.(Q, a Q',) 
MO ee (ey) f | (ivZ,) Tg) |e 


- (4) if (<) (O80 ave (=) [1a = edo i) 


where $\phi_i(\nu)$ represents the power spectrum of the noise from the $i$th source,
$z_\nu$ is the classical transfer characteristic of the complete interaction system, and
$Z_\nu$ is the impedance of the maser system as seen by the detector. All the terms
in the influence functional are familiar in view of the derivations which have 
been presented previously. The first term in the exponent of the influence func- 
tional is recognized as describing a linear system at zero temperature (Section 
IV), the linear system in this case being the maser. This term describes the 
spontaneous emission of the detector back into the maser (see Appendix I). Fur- 
thermore, it can be deduced that this spontaneous emission can be thought of as 
resulting from a noise generator created by the detector (test system) acting on 
the interaction system in the usual classical way, i.e., whose power spectrum was 
related to the dissipative part of the detector impedance and to the temperature 
by the Nyquist relation 


$$ \phi(\nu) = \hbar\nu R_\nu\,[\exp(\hbar\nu/kT) - 1]^{-1} $$


where $R_\nu$ is the detector resistance and $T$ its temperature. The second term in
the above composite influence functional is easily interpreted and is simply the 
effect of a classical voltage, related to the input voltage by the classical transfer 
characteristic of the maser, acting on the detector (see Section IV.B). The last 
term represents the effect on the detector of random noise voltages associated 
with the various classical noise sources in the maser (Sections IV.C and V). 
Both positive and negative resistances are such noise sources. In either case the 
power spectrum of the noise from a particular resistance is computed from the 
same relation as given above. 

If $R_\nu$ is negative, the effective temperature of $R_\nu$ will also be negative, always
giving a positive power spectrum. Therefore, if we were to compute the current
in the detector due to the interaction system (maser) using the influence func- 
tional we would find components of current due to: (1) the noise voltage gener- 
ated by the detector itself, the power spectrum of which is related to the resist- 
ance of the detector by the generalized Nyquist relation given above; (2) a 
classical voltage related to the input voltage C(t) by the classical transfer char- 
acteristic of the maser; and (3) random noise voltages associated with the 
various classical noise sources (resistances) in the maser. Therefore, the maser 
simply acts as a classical amplifier with sources of noise which can be predicted
from considerations of its classical characteristics. 
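
The statement that a negative resistance paired with a negative effective temperature still gives a positive power spectrum can be illustrated with the generalized Nyquist relation quoted above. This is a hedged sketch only: SI constants are used for convenience (an assumption about units, since the paper does not fix them here), and the resistance, temperature, and frequency values are illustrative.

```python
import math

HBAR = 1.054571817e-34   # J*s
K_B = 1.380649e-23       # J/K


def nyquist_spectrum(nu, resistance, temperature):
    """phi(nu) = hbar*nu*R / (exp(hbar*nu/(k*T)) - 1), the relation quoted in the text."""
    x = HBAR * nu / (K_B * temperature)
    return HBAR * nu * resistance / math.expm1(x)


nu = 1.5e11  # rad/sec, illustrative value
phi_pos = nyquist_spectrum(nu, 50.0, 290.0)     # ordinary resistor at positive temperature
phi_neg = nyquist_spectrum(nu, -50.0, -290.0)   # negative resistance at negative temperature
```

Both values come out positive. Reversing the signs of $R_\nu$ and $T$ together multiplies the spectrum by $\exp(\hbar\nu/k|T|)$, so in this example the inverted medium is slightly noisier than the passive resistor of the same magnitude.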


APPENDIX I 


In Section II.E the problem of making perturbation calculations using influence
functionals was outlined. Here we will calculate in detail the probability,
to second order in the coupling potentials, that a test system which is in a
definite state at $t = \tau$ finds itself in another state at $t = T$ when interacting
with a zero temperature interaction system. Let us call the initial and final states
$\phi_n(Q_\tau)$ and $\phi_m(Q_T)$ respectively and, for simplicity, assume that these are
eigenstates. The formal expression for the transition probability is given by Eq.
(4.24) and for this case,

$$ F(Q,Q') = \exp\left\{-(2\hbar)^{-1} \int_\tau^T\!\int_\tau^t \gamma_t\gamma_s\,(Q_t - Q'_t)\,[Q_s F^*(t - s) - Q'_s F(t - s)]\,ds\,dt\right\} \tag{I.1} $$

Then to second order

$$ P_{n\to m} \approx \int \phi_m^*(Q_T)\,\phi_m(Q'_T)\,\exp\{(i/\hbar)[S_0(Q) - S_0(Q')]\} \left\{1 - (2\hbar)^{-1}\int_\tau^T\!\int_\tau^t \gamma_t\gamma_s\,(Q_t - Q'_t)\,[Q_s F^*(t - s) - Q'_s F(t - s)]\,ds\,dt\right\} \phi_n^*(Q'_\tau)\,\phi_n(Q_\tau)\,dQ_\tau \cdots \mathcal{D}Q'(t) \tag{I.2} $$

Making use of the fact that

$$ \int \exp[(i/\hbar)S_0(Q)]\,\mathcal{D}Q(t) = K(Q_T, Q_\tau) = \int K(Q_T, Q_t)\,K(Q_t, Q_\tau)\,dQ_t \quad (\text{where } \tau < t < T) $$

and writing (I.2) as a sum of integrals

$$ P_{n\to m} \approx \int \phi_m^*(Q_T)\,\phi_m(Q'_T)\,\exp\{(i/\hbar)[S_0(Q) - S_0(Q')]\}\,\phi_n^*(Q'_\tau)\,\phi_n(Q_\tau)\,dQ_\tau \cdots \mathcal{D}Q'(t) $$
$$ \quad - (2\hbar)^{-1}\int_\tau^T\!\int_\tau^t ds\,dt\,\gamma_s\gamma_t \int \phi_m^*(Q_T)\,\phi_m(Q'_T)\,\exp\{(i/\hbar)[S_0(Q) - S_0(Q')]\}\,(Q_t - Q'_t)\,[Q_s F^*(t - s) - Q'_s F(t - s)]\,\phi_n^*(Q'_\tau)\,\phi_n(Q_\tau)\,dQ_\tau \cdots \mathcal{D}Q'(t) \tag{I.3} $$

Replacing $K(Q_T, Q_t)$ by $\sum_k \phi_k(Q_T)\,\phi_k^*(Q_t)\,\exp[-iE_k(T - t)/\hbar]$ and taking
matrix elements we have

$$ P_{n\to m} = \delta_{nm}\left[1 - \sum_k (2\hbar)^{-1}\,|Q_{nk}|^2\,f(\nu_{nk})\right] + (2\hbar)^{-1}\,|Q_{nm}|^2\,f(\nu_{nm}) \tag{I.4} $$
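where

$$ f(\nu_{nm}) = \int_\tau^T\!\int_\tau^t \gamma_t\gamma_s\,[F^*(t - s)\,e^{i\nu_{nm}(t-s)} + F(t - s)\,e^{-i\nu_{nm}(t-s)}]\,ds\,dt \tag{I.5} $$

$f(\nu)$ can be simplified by restating the integral over $t$ and $s$ in terms of frequency.
To do this we replace the upper limit $t$ by $+\infty$ and multiply the integrand by
a step function $1(t - s)$. Then utilizing the convolution theorem in the form

$$ \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} M(t)\,N(s)\,R(t - s)\,ds\,dt = (2\pi)^{-1}\int_{-\infty}^{\infty} M_{-\nu}\,N_\nu\,R_\nu\,d\nu $$

where

$$ M_\nu = \int_{-\infty}^{\infty} M(t)\,e^{-i\nu t}\,dt $$

$f(\nu)$ becomes

$$ f(\nu) = (2\pi)^{-1}\int_{-\infty}^{\infty} |\gamma_{\eta-\nu}|^2\,(F_\eta + F_\eta^*)\,d\eta $$

where

$$ F_\nu = \int_{-\infty}^{\infty} 1(t)\,F(t)\,e^{-i\nu t}\,dt $$

From Eqs. (4.5a) and (4.5b) we have that

$$ F(t) = -(2/\pi)\int_0^\infty \mathrm{Im}(1/i\eta Z_\eta)\,e^{i\eta t}\,d\eta $$

Therefore,

$$ F_\nu + F_\nu^* = -(2/\pi)\int_0^\infty d\eta\,\mathrm{Im}(1/i\eta Z_\eta)\int_0^\infty dt\,(e^{i(\eta-\nu)t} + e^{-i(\eta-\nu)t}) = -4\,\mathrm{Im}(1/i\nu Z_\nu) = (4/\nu)\,\mathrm{Re}(1/Z_\nu) $$

and

$$ f(\nu) = (2/\pi)\int_0^\infty |\gamma_{\eta-\nu}|^2\,\eta^{-1}\,\mathrm{Re}(1/Z_\eta)\,d\eta \tag{I.6} $$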
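Thus, to second order

$$ P_{n\to m} = \delta_{nm}\left\{1 - \sum_k (\pi\hbar)^{-1}\,|Q_{nk}|^2 \int_0^\infty \nu^{-1}\,|\gamma_{\nu-\nu_{nk}}|^2\,\mathrm{Re}(Z_\nu)^{-1}\,d\nu\right\} + (\pi\hbar)^{-1}\,|Q_{nm}|^2 \int_0^\infty \nu^{-1}\,|\gamma_{\nu-\nu_{nm}}|^2\,\mathrm{Re}(Z_\nu)^{-1}\,d\nu \tag{I.7} $$

For the special case that the coupling $\gamma$ is

$\gamma = 0$ for $t > T/2$ and $t < -T/2$

$\gamma = 1$ for $-T/2 < t < T/2$

$$ |\gamma_{\nu-\nu_{nk}}|^2 = \left|\int_{-T/2}^{T/2} e^{-i(\nu-\nu_{nk})t}\,dt\right|^2 = 4(\nu - \nu_{nk})^{-2}\,\sin^2[(\nu - \nu_{nk})T/2] \to 2\pi T\,\delta(\nu - \nu_{nk}) \quad \text{for large } T $$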
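
The replacement of $|\gamma|^2$ by $2\pi T\,\delta(\nu - \nu_{nk})$ rests on two facts: the function has peak height $T^2$ with width of order $1/T$, and its total area is $2\pi T$. A short numerical check, sketched here with illustrative names and an arbitrary value of $T$, confirms the area:

```python
import math


def gamma_sq(x, T):
    """|gamma|^2 = 4 sin^2(x T / 2) / x^2 for a coupling switched on for a time T."""
    return 4.0 * math.sin(0.5 * x * T) ** 2 / x**2


def area_under_gamma_sq(T, half_width=200.0, steps=400_000):
    """Midpoint-rule integral over [-half_width, half_width] plus averaged tails.

    Outside the window sin^2 is replaced by its mean value 1/2, so each tail
    contributes 2/half_width.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h   # midpoints never land on x = 0
        total += gamma_sq(x, T)
    return total * h + 4.0 / half_width


T = 10.0
area = area_under_gamma_sq(T)   # should be close to 2*pi*T
peak = gamma_sq(1.0e-6, T)      # should be close to T**2
```

Here `x` stands for $\nu - \nu_{nk}$. Because the area grows like $T$ while the width shrinks like $1/T$, any factor that varies slowly over that width can be evaluated at $\nu = \nu_{nk}$, which is how the result that follows is obtained.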
Then,

$$ P_{n\to m} = \delta_{nm}\left[1 - \sum_{k,\ \nu_{nk}>0} 2T\,(\hbar\nu_{nk})^{-1}\,|Q_{nk}|^2\,\mathrm{Re}(Z_{\nu_{nk}})^{-1}\right] + 2T\,(\hbar\nu_{nm})^{-1}\,|Q_{nm}|^2\,\mathrm{Re}(Z_{\nu_{nm}})^{-1} \quad \text{for } \nu_{nm} > 0 \tag{I.8} $$
$$ = 0 \quad \text{for } \nu_{nm} < 0 $$

Thus, it is seen that $P_{n\to m}$ is proportional to the matrix element and to
the dissipative part of the impedance $Z_{\nu_{nm}}$. Appropriately enough, no transition
is possible to energy states such that $\nu_{nm} < 0$ since the interaction system, being
at zero temperature initially, can give up no energy to the test system.


APPENDIX II 
A. INFLUENCE PHASE FOR EFFECT OF FREE SPACE ON AN ATOM 


As an illustration, the influence phase for the effect of free space on an atom
will be calculated. This problem is more complicated than the idealized systems
considered in deriving the formalism since the interaction here is of the form
$\mathbf{Q}\cdot\mathbf{X}$, $\mathbf{Q}$ and $\mathbf{X}$ being vectors, rather than $QX$ where $Q$ and $X$ are scalars. This
difficulty could be overcome by writing the influence phase in tensor notation
or by recasting the problem so that the interaction is of the form $QX$. The latter
will be done to adhere more closely to the point of view of the derivations. Since
a linear system is being dealt with, it is only necessary to determine a suitable
coordinate for the atom and find the impedance function $Z_\nu$ for the effect of free
space. It is assumed that the atom is made up of a system of particles of mass
$m_n$, charge $e_n$, and position $\mathbf{r}_a + \mathbf{x}_n$ where $\mathbf{r}_a$ is the position of the center of
charge of the atom. If the transverse part of the radiation field in the box is
expanded into a series of plane waves each representing independent harmonic
oscillations (22), then the nonrelativistic Lagrangian for the complete system
consisting of the atom and the field in the box can be written (2)

$$ L(\dot{X}, X, \dot{q}_k, q_k, t) = L_A + \sum_n e_n\,\dot{\mathbf{x}}_n\cdot\mathbf{A}^{tr}(\mathbf{r}_a + \mathbf{x}_n) + \tfrac{1}{2}\sum_k\sum_{r=1}^{4}\left[(\dot{q}_k^{(r)})^2 - \omega^2 (q_k^{(r)})^2\right] \tag{II.1} $$

where $L_A$ is the Lagrangian of the atom unperturbed by outside forces and

$$ \mathbf{A}^{tr}(\mathbf{X}) = (8\pi)^{1/2}\sum_k\left[\mathbf{e}_k\left(q_k^{(1)}\cos(\mathbf{k}\cdot\mathbf{X}) + q_k^{(3)}\sin(\mathbf{k}\cdot\mathbf{X})\right) + \mathbf{e}_l\left(q_k^{(2)}\cos(\mathbf{k}\cdot\mathbf{X}) + q_k^{(4)}\sin(\mathbf{k}\cdot\mathbf{X})\right)\right] \tag{II.2} $$

Here $\mathbf{e}_k$ and $\mathbf{e}_l$ are two mutually orthogonal polarization vectors, each orthogonal
to the propagation vector $\mathbf{k}$. Now we assume that the radiation field of the box
is constant over the particle, i.e., that $\mathbf{A}(\mathbf{r}_a + \mathbf{x}_n) \approx \mathbf{A}(\mathbf{r}_a)$, the dipole
approximation.²⁵ This permits one to replace $\sum_n e_n\dot{\mathbf{x}}_n$ by $\mathbf{j}$, the current operator for
the atom. In addition, even though $\mathbf{e}_k\cdot\mathbf{e}_l = 0$, which fixes their relative orientations,
their absolute directions in a plane perpendicular to $\mathbf{k}$ are still arbitrary.
Choosing $\mathbf{e}_l$ so that

$$ \mathbf{e}_l\cdot\mathbf{j} = 0 \tag{II.3} $$

25 This is equivalent to taking $\mathbf{A}(\mathbf{r}_a + \mathbf{x}_n)$, expanding it in a series of $\mathbf{k}\cdot\mathbf{x}_n$ since this is
assumed small, and keeping only those terms which keep the interaction term of the
Lagrangian linear. Since the interaction is of the form $e_n\dot{\mathbf{x}}_n\cdot\mathbf{A}(\mathbf{r}_a + \mathbf{x}_n)$ for the $n$th particle,
then $\mathbf{A}$ can only contain constant terms.

We assume the box to be very large so that $\sum_k \to (16\pi^3)^{-1}\int d^3k$, because the
mode corresponding to $\mathbf{k}$ and $-\mathbf{k}$ is the same. Combining (II.1), (II.2), and
(II.3) the following total Lagrangian is obtained.


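$$ L = L_A + \sum_{r=1,3}\int\frac{d^3k}{16\pi^3}\,\tfrac{1}{2}\left[(\dot{q}_k^{(r)})^2 - k^2c^2 (q_k^{(r)})^2\right] + (8\pi)^{1/2}\int\frac{d^3k}{16\pi^3}\,(\mathbf{j}\cdot\mathbf{e}_k)\,q_k^{(1)}\cos(\mathbf{k}\cdot\mathbf{r}_a) + (8\pi)^{1/2}\int\frac{d^3k}{16\pi^3}\,(\mathbf{j}\cdot\mathbf{e}_k)\,q_k^{(3)}\sin(\mathbf{k}\cdot\mathbf{r}_a) \tag{II.4} $$

Thus, the number of each of the two sets of oscillators ($q_k^{(1)}$ and $q_k^{(3)}$) in a volume
of $k$ space $d^3k$ is $d^3k/16\pi^3$. The coupling strength of the $q_k^{(1)}$ and $q_k^{(3)}$ oscillators
with the atom is $(8\pi)^{1/2}(\mathbf{j}\cdot\mathbf{e}_k)\,q_k^{(1)}\cos(\mathbf{k}\cdot\mathbf{r}_a)$ and $(8\pi)^{1/2}(\mathbf{j}\cdot\mathbf{e}_k)\,q_k^{(3)}\sin(\mathbf{k}\cdot\mathbf{r}_a)$
respectively. If $\mathbf{j}$ is oriented along the $\theta = 0$ axis in polar coordinate representation,
$\mathbf{j}\cdot\mathbf{e}_k = j\sin\theta$. Then choosing $j$ as the atom coordinate, the impedance for
the two oscillators of frequency $kc$ can be found from the rule found in Section
IV to be

$$ (i\nu Z_\nu(\mathbf{k}))^{-1} = (1/j_\nu)\left[(8\pi)^{1/2}\sin\theta\cos(\mathbf{k}\cdot\mathbf{r}_a)\,q_\nu^{(1)} + (8\pi)^{1/2}\sin\theta\sin(\mathbf{k}\cdot\mathbf{r}_a)\,q_\nu^{(3)}\right] $$
$$ = -\frac{8\pi\sin^2\theta\cos^2(\mathbf{k}\cdot\mathbf{r}_a)}{(\nu - i\epsilon)^2 - k^2c^2} - \frac{8\pi\sin^2\theta\sin^2(\mathbf{k}\cdot\mathbf{r}_a)}{(\nu - i\epsilon)^2 - k^2c^2} = -8\pi\sin^2\theta\,[(\nu - i\epsilon)^2 - k^2c^2]^{-1}\big|_{\epsilon\to 0} $$

The total effect of all the oscillators is

$$ (Z_\nu)^{-1} = (16\pi^3)^{-1}\int Z_\nu^{-1}(\mathbf{k})\,d^3k = (16\pi^3)^{-1}\int k^2\sin\theta\,d\theta\,d\varphi\,dk\,Z_\nu^{-1}(\mathbf{k}) $$
$$ = -i4\nu\,(3\pi c^3)^{-1}\int_0^\infty \Omega^2\,d\Omega\,[(\nu - i\epsilon)^2 - \Omega^2]^{-1}\big|_{\epsilon\to 0} = (2\nu^2/3c^3) - i(4\nu/3\pi c^3)\int_0^\infty \Omega^2\,d\Omega\,(\nu^2 - \Omega^2)^{-1} \tag{II.6} $$

where the substitution $\Omega = kc$ has been made. Thus, the effect of free space is
characterized by $(i\nu Z_\nu)^{-1}$. The equivalent distribution of oscillators coupled to
$j_\nu$ is

$$ G(\Omega) = (4\Omega^2/3\pi c^3) \tag{II.7} $$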




B. SPONTANEOUS EMISSION PROBABILITY OF AN ATOM IN FREE SPACE

To compute the transition probability for this atom, we use second order
perturbation theory developed in Appendix I for a system initially in state
$\phi_n(X_\tau)$ and finally in state $\phi_m(X_T)$ when acted on by an influence functional for a
linear system at zero temperature. The expression is

$$ P_{n\to m} = 2T\,|j_{nm}|^2\,(\hbar\nu_{nm})^{-1}\,\mathrm{Re}(Z_{\nu_{nm}})^{-1}, \quad \nu_{nm} > 0,\ m \neq n \tag{II.8} $$
$$ = 0, \quad \nu_{nm} < 0 $$

From Eq. (II.7) we find

$$ \mathrm{Re}(Z_{\nu_{nm}})^{-1} = (\pi/2)\,G(\nu_{nm}) = 2\nu_{nm}^2/3c^3 \tag{II.9} $$

Using this in Eq. (II.8)

$$ P_{n\to m} = 4\nu_{nm}\,|j_{nm}|^2\,T/3\hbar c^3 = 4\nu_{nm}^3 e^2\,|X_{mn}|^2\,T/3\hbar c^3 \tag{II.10} $$

where to obtain the last, more familiar form, the substitution $j_{nm} = e\nu_{nm}X_{mn}$
has been made. This is the first order spontaneous emission probability for an
atom in free space.
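
As a rough numerical check on Eq. (II.10), the rate $P_{n\to m}/T$ can be evaluated for a familiar transition. The sketch below assumes the hydrogen 2p (m = 0) to 1s line, which is not discussed in the paper: the wavelength of 121.567 nm and the dipole matrix element $(128\sqrt{2}/243)$ Bohr radii are standard textbook values brought in for illustration, and Gaussian units are assumed throughout.

```python
import math

# Gaussian (cgs) constants
E_CHARGE = 4.80320e-10   # esu
HBAR = 1.05457e-27       # erg*sec
C = 2.99792e10           # cm/sec
BOHR = 5.29177e-9        # cm


def emission_rate(nu, x_mn):
    """P/T = 4 nu^3 e^2 |X_mn|^2 / (3 hbar c^3), the last form of Eq. (II.10)."""
    return 4.0 * nu**3 * E_CHARGE**2 * x_mn**2 / (3.0 * HBAR * C**3)


# Assumed example: hydrogen 2p (m = 0) -> 1s.
nu = 2.0 * math.pi * C / 121.567e-7                 # angular frequency, rad/sec
x_mn = (128.0 * math.sqrt(2.0) / 243.0) * BOHR      # dipole matrix element, cm
rate = emission_rate(nu, x_mn)                      # roughly 6e8 per second
```

The result is close to the accepted lifetime of the hydrogen 2p level (about 1.6 nanoseconds), which suggests the factors of $\nu$, $e$, and $c$ in Eq. (II.10) are arranged as in the standard dipole emission rate.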


Now we can form an expression for the intensity of radiation per unit time.
The power radiated from the dipole is

$$ \hbar\nu_{nm}P_{n\to m}/T = 2\,|j_{nm}|^2\,\mathrm{Re}(Z_{\nu_{nm}})^{-1} = 4e^2\,|X_{mn}|^2\,\nu_{nm}^4/3c^3 \tag{II.11} $$

an expression which is almost the same as that for power radiated from a classical
dipole. The expression becomes exactly the same if we apply the correspondence
principle by replacing the matrix element of the time average of the coordinate
of the oscillator by its corresponding classical quantity. Thus, if $X$ is the
coordinate of the corresponding classical oscillator ($X_0$ is its maximum value) then²⁶

$$ 2\,|X_{mn}|^2 \to \langle X^2\rangle = \tfrac{1}{2}X_0^2 $$

26 That this correspondence is true can be seen easily as follows. Consider the dipole, a
harmonic oscillator as above, to be in a high quantum state, $\phi_n$. Classically the motion of
the dipole can be described as $X = X_0\sin\omega t$. We wish to relate the classical value of $\langle X^2\rangle$
to its matrix element. In a quantum mechanical sense,

$$ \langle X^2\rangle = \int \phi_n^*(X)\,X^2\,\phi_n(X)\,dX = \sum_k \int\!\!\int \phi_n^*(X)\,X\,\phi_k(X)\,\phi_k^*(X')\,X'\,\phi_n(X')\,dX\,dX' $$

Since matrix elements exist, in the case of a harmonic oscillator, only for $k = n - 1$, $k = n + 1$, we have

$$ \langle X^2\rangle = |X_{n,n-1}|^2 + |X_{n,n+1}|^2 $$

For very high quantum numbers these two terms become nearly equal since

$$ |X_{n,n-1}|^2 = n\hbar/2m\omega, \qquad |X_{n,n+1}|^2 = (n + 1)\hbar/2m\omega $$

Thus, as $n \to \infty$, $\langle X^2\rangle = 2\,|X_{n,n-1}|^2$. But in the classical case, $\langle X^2\rangle = \tfrac{1}{2}X_0^2$. Therefore,
$|X_{n,n-1}|^2 = \tfrac{1}{4}X_0^2$.

If $|X_{mn}|^2$ is replaced by $X_0^2/4$ then Eq. (II.11) becomes the expression for the
power radiated from a classical dipole. Our purpose, however, in doing this example,
was to show for a specific problem that the effect of a distribution of oscillators
interacting on a system is the same as the effect of loss on the system.²⁷
This has been done by relating the energy lost from the radiating dipole to the 
distribution. It is not surprising that a sea of oscillators should give this effect. 
If the dimensions of the box are allowed to be finite then energy emitted from 
the system under observation is reflected from the walls and eventually finds its 
way back to be absorbed again. This is equivalent to saying that the number of 
oscillators comprising the electromagnetic field in the box is infinite with a finite 
frequency spacing between the modes. Since the oscillators are independent there 
is no coupling between them and energy coupled into one of the oscillators from 
the test system must eventually return to it. If the dimensions of the box are 
allowed to get infinitely large, energy emitted from the test system never gets 
reflected and thus never returns. In oscillator language this means that the 
frequency spacing between oscillators has become infinitesimal, so close that a 
little of the energy absorbed by each one gradually leaks into nearby modes and 
eventually is completely gone. 


APPENDIX III. SPONTANEOUS EMISSION OF AN ATOM IN A CAVITY 


In this calculation as in the free space calculation the dipole approximation
will be used in computing the spontaneous emission probability. The linear
coordinates inside the cavity will be represented by the vector $\mathbf{Q}$ while the time
varying coordinates of the single cavity mode being considered will be $X(t)$.
The Lagrangian of the system may be written

$$ L_T = L(\dot{\mathbf{Q}}_n, \mathbf{Q}_n, t) + c^{-1}\sum_n e_n\,\dot{\mathbf{Q}}_n\cdot\mathbf{A}(\mathbf{Q}_p + \mathbf{Q}_n, t) + L_{\text{cavity}} \tag{III.1} $$

where $\mathbf{Q}_p$ is the atom coordinate, $\mathbf{Q}_p + \mathbf{Q}_n$ is the particle coordinate in the atom,
and $\mathbf{A}$ is the vector potential of the cavity field. The interaction term is the one
of interest, since from it we find the terms that we wish to solve for classically.
This term will be put into more convenient form. Let us write

$$ \mathbf{A}(\mathbf{Q}, t) = \mathbf{a}(\mathbf{Q})\,X(t) \tag{III.2} $$


27 If the power radiated from the oscillator is related to the classical expression $\tfrac{1}{2}I^2R$
then from Eq. (II.11) it can be seen that $R$ is proportional to $\mathrm{Re}[1/Z(\nu)]$ which in turn is
related to the distribution of oscillators. One might expect $\mathrm{Im}[1/Z(\nu)]$ to be related to
the reactance seen by an oscillating dipole, a quantity which is known to be infinite
classically. From Eq. (II.7),

$$ \mathrm{Im}(1/Z_\nu) = (4\nu/3\pi c^3)\int_0^\infty \Omega^2(\Omega^2 - \nu^2)^{-1}\,d\Omega = 4\nu\Omega/3\pi c^3\,\big|_{\Omega\to\infty} $$

The integral is linearly divergent. This factor is also related to the infinite self energy of a
point charge which occurs both classically and in quantum electrodynamics. Here this
divergence does not bother us since it never enters into the calculation.
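where

$$ \int \mathbf{a}(\mathbf{Q})\cdot\mathbf{a}(\mathbf{Q})\,d^3Q = 4\pi c^2 \tag{III.3} $$

If $\mathbf{A}$ does not vary much over the atom, then $\mathbf{A}(\mathbf{Q}_p) \approx \mathbf{A}(\mathbf{Q}_p + \mathbf{Q}_n)$ and the
interaction term is written

$$ c^{-1}\sum_n e_n\,\dot{\mathbf{Q}}_n\cdot\mathbf{a}(\mathbf{Q}_p)\,X(t) = (j/c)\,|\mathbf{a}(\mathbf{Q}_p)|\,X(t) \tag{III.4} $$

where $j = (\sum_n e_n\dot{\mathbf{Q}}_n)\cdot\mathbf{a}(\mathbf{Q}_p)/|\mathbf{a}(\mathbf{Q}_p)|$, the component of the atom current in
the direction of the cavity field.
Let us now determine the ratio

$$ \frac{|\mathbf{a}(\mathbf{Q}_p)|\,X_\nu}{c\,j_\nu} = \frac{1}{i\nu Z_\nu} \tag{III.5} $$

classically. The wave equation appropriate for this calculation (high $Q$)²⁸ is

$$ \nabla^2\mathbf{A} - (1/c^2)\ddot{\mathbf{A}} + (\omega/c^2Q)\dot{\mathbf{A}} - (4\pi/c)\mathbf{P} = 0 \tag{III.6} $$

The atom is located at $\mathbf{Q}_p$ and, since the dipole moment is induced, its direction
on the average is the same as that of the field $\mathbf{A}$ in the cavity. We have then

$$ \mathbf{P} = j\,\delta(\mathbf{Q} - \mathbf{Q}_p)\,\mathbf{a}(\mathbf{Q}_p)/|\mathbf{a}(\mathbf{Q}_p)| \tag{III.7} $$

Substituting (III.7) and (III.2) into (III.6) we obtain

$$ [-\omega^2X + (\omega/Q)\dot{X} - \ddot{X}]\,c^{-2}\,\mathbf{a}(\mathbf{Q}) - (4\pi/c)\,j\,\delta(\mathbf{Q} - \mathbf{Q}_p)\,\mathbf{a}(\mathbf{Q}_p)/|\mathbf{a}(\mathbf{Q}_p)| = 0 \tag{III.8} $$

where $\omega$ is the resonant frequency of the cavity. Multiplying by $\mathbf{a}(\mathbf{Q})$, integrating
over $\mathbf{Q}$, and taking Fourier transforms, (III.8) becomes

$$ (\nu^2 - \omega^2 + i\nu\omega/Q)\,X_\nu - c^{-1}\,|\mathbf{a}(\mathbf{Q}_p)|\,j_\nu = 0 \tag{III.9} $$

We find from the ratio (III.5) that

$$ i\nu Z_\nu = [\nu^2 - \omega^2 + (i\nu\omega/Q)]\,c^2/|\mathbf{a}(\mathbf{Q}_p)|^2 $$

The influence phase for this is (although it is unnecessary to write it)

$$ \Phi(j, j') = \frac{1}{2\pi\hbar}\int_0^\infty\left[\frac{j'_\nu\,(j_{-\nu} - j'_{-\nu})}{i\nu Z_\nu} + \frac{j_{-\nu}\,(j_\nu - j'_\nu)}{-i\nu Z_{-\nu}}\right]d\nu $$

From second order perturbation theory we know

$$ P_{n\to m} = 2T\,|j_{nm}|^2\,(\hbar\nu_{nm})^{-1}\,\mathrm{Re}(Z_{\nu_{nm}})^{-1} \quad \text{for } \nu_{nm} > 0 $$

28 $Q$ is used here as the dissipation factor of the cavity, $\omega L/R$, while $\mathbf{Q}$ is a vector
representing the linear coordinates inside the cavity.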
Noting that

$$ \mathrm{Re}(Z_{\nu_{nm}})^{-1} = \omega\,\nu_{nm}^2\,|\mathbf{a}(\mathbf{Q}_p)|^2\,Q^{-1}c^{-2}\,[(\nu_{nm}^2 - \omega^2)^2 + \omega^2\nu_{nm}^2Q^{-2}]^{-1} $$

and defining a cavity form factor $f^2 = V\,|\mathbf{a}(\mathbf{Q}_p)|^2/4\pi c^2$,

$$ P_{n\to m} = (8\pi\,|j_{nm}|^2 f^2\,\omega\,\nu_{nm}\,T/\hbar)\,(VQ)^{-1}\,[(\nu_{nm}^2 - \omega^2)^2 + \omega^2\nu_{nm}^2Q^{-2}]^{-1} \tag{III.10} $$

At resonance this expression reduces to

$$ P_{n\to m} = 8\pi\,|j_{nm}|^2 f^2\,QT/\hbar V\nu_{nm}^2 \tag{III.11} $$

The quantity usually computed is the ratio of the transition probability in the
cavity at resonance to that in free space. This ratio is

$$ \frac{P_{n\to m}\,(\text{cavity})}{P_{n\to m}\,(\text{free space})} = \frac{[8\pi\,|j_{nm}|^2\,Qf^2/\nu_{nm}^2V\hbar]\,T}{[4\,|j_{nm}|^2\,\nu_{nm}/3\hbar c^3]\,T} = \frac{6\pi\,Qf^2c^3}{V\nu_{nm}^3} $$

At resonance, the ratio increases with respect to $Q$ as one might expect and
decreases with respect to the cavity volume $V$ and $\nu_{nm}^3$. This expression agrees
with the one given by E. M. Purcell (23) although the form factor in his calculation
was left out. This does not matter since for a particle located near the
maximum field point in a cavity the magnitude of $f$ is of the order of unity.
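
The ratio can be rewritten in terms of the free-space wavelength $\lambda = 2\pi c/\nu$, giving $3Qf^2\lambda^3/4\pi^2V$, which is the form in which Purcell's enhancement factor is usually quoted. The sketch below checks that the two forms agree; the cavity parameters are hypothetical values chosen only for illustration.

```python
import math

C = 2.99792e10  # cm/sec


def cavity_to_free_space_ratio(q_factor, form_factor, volume_cm3, nu):
    """Ratio 6*pi*Q*f^2*c^3 / (V*nu^3) of cavity to free-space emission at resonance."""
    return 6.0 * math.pi * q_factor * form_factor**2 * C**3 / (volume_cm3 * nu**3)


def ratio_in_wavelength_form(q_factor, form_factor, volume_cm3, wavelength_cm):
    """Same ratio written as 3*Q*f^2*lambda^3 / (4*pi^2*V), with lambda = 2*pi*c/nu."""
    return (3.0 * q_factor * form_factor**2 * wavelength_cm**3
            / (4.0 * math.pi**2 * volume_cm3))


# Hypothetical cavity: Q = 1e4, f = 1, V = 10 cm^3, resonant at 1.5e11 rad/sec.
nu = 1.5e11
wavelength = 2.0 * math.pi * C / nu
r1 = cavity_to_free_space_ratio(1.0e4, 1.0, 10.0, nu)
r2 = ratio_in_wavelength_form(1.0e4, 1.0, 10.0, wavelength)
```

For these assumed numbers the enhancement is of order one hundred, showing how a modest cavity at microwave frequencies can raise the emission rate well above its free-space value.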


APPENDIX IV 
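It is to be demonstrated that

$$ G(X) = \lim_{N\to\infty}\prod_{k=1}^{N}(1 + X_k) \to \exp\left[\sum_k X_k\right] \tag{IV.1} $$

where the $X_k$ are small but not necessarily equal to each other, and where the
total sum $\sum_k X_k$ is finite. Rewriting the expression for $G(X)$ we have (where
the summations on all indices go from 1 to $N$)

$$ G(X) = 1 + \sum_k X_k + \tfrac{1}{2}\sum_{j\neq k} X_jX_k + \tfrac{1}{6}\sum_{j\neq k\neq l} X_jX_kX_l + \cdots $$
$$ = 1 + \sum_k X_k + \tfrac{1}{2}\sum_{j,k} X_jX_k(1 - \delta_{jk}) + \tfrac{1}{3!}\sum_{j,k,l} X_jX_kX_l\,(1 - \delta_{jk} - \delta_{kl} - \delta_{jl} + 2\delta_{jk}\delta_{kl}) + \cdots \tag{IV.2} $$

As $N$ is allowed to get very large the contribution of the terms involving quantities
such as $\delta_{jk}$ becomes less significant. For instance, in the third term

$$ \sum_{j,k}(X_jX_k)(1 - \delta_{jk}) \sim (NX)^2 - (NX)^2/N \tag{IV.3} $$

and for very large $N$ only the leading term in this sum is important. Thus, we
have the result that

$$ G(X)\big|_{N\to\infty} = 1 + \sum_k X_k + \tfrac{1}{2}\left(\sum_k X_k\right)^2 + \tfrac{1}{3!}\left(\sum_k X_k\right)^3 + \cdots = \exp\left[\sum_k X_k\right] \tag{IV.4} $$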
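
The limit can be watched numerically by holding the total $\sum_k X_k$ fixed while the number of unequal terms grows. The following is a small sketch; the particular choice of unequal weights is an arbitrary assumption made only so that the $X_k$ differ from one another.

```python
import math


def product_form(xs):
    """G = prod_k (1 + X_k)."""
    g = 1.0
    for x in xs:
        g *= 1.0 + x
    return g


def small_terms(total, n):
    """n unequal positive terms X_k whose sum equals `total` (up to rounding)."""
    weights = [1.0 + 0.5 * math.sin(k) for k in range(n)]
    scale = total / sum(weights)
    return [scale * w for w in weights]


def relative_error(total, n):
    """Relative difference between the product and exp(sum of X_k)."""
    xs = small_terms(total, n)
    return abs(product_form(xs) - math.exp(total)) / math.exp(total)


err_coarse = relative_error(1.0, 10)        # ten fairly large terms
err_fine = relative_error(1.0, 100_000)     # many very small terms
```

The error shrinks roughly like $\tfrac{1}{2}\sum_k X_k^2$, which is of order $1/N$ at fixed total, in line with the $(NX)^2/N$ correction identified in Eq. (IV.3).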


RECEIVED: April 5, 1963
IV. Liquid Helium


This commentary is supplied by: Alexander L. Fetter, Physics Department, Stanford University.


Between 1953 and 1958, Feynman published a seminal series of papers on the atomic theory 
of superfluid helium. Superfluidity in liquid helium had been discovered in the 1930’s, and the 
early understanding of this phenomenon relied on Landau’s phenomenological theory (1941, 
1947) of phonons and rotons as elementary excitations (“quasiparticles”). In this context, the 
Bose-Einstein statistics of helium atoms (and the existence of a Bose-Einstein condensate) 
played essentially no role. [For a summary of the early history see, for example, A. Griffin, 
in Bose-Einstein Condensation in Atomic Gases, edited by M. Inguscio, S. Stringari, and 
C.E. Wieman (Italian Physical Society, 1999).] A significant part of Feynman’s central 
contribution was the demonstration that these phenomenological concepts arose directly 
from the fundamental quantum mechanics of interacting bosonic atoms with strong repulsive 
cores. 

One of his earliest helium papers [21] showed in detail how the symmetric character
of the many-body wave function severely restricts the allowed class of low-lying excited
states. Specifically, all such states correspond to density fluctuations (phonons), and he
then proposed [24] the explicit model wave function $\psi_{\mathbf{k}} = \rho_{\mathbf{k}}^\dagger\Phi_0$, where $\Phi_0(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$ is
the symmetric exact ground state wave function of the $N$ helium atoms with energy $E_0$, and
$\rho_{\mathbf{k}}^\dagger = \sum_{j=1}^{N}\exp(i\mathbf{r}_j\cdot\mathbf{k})$ is the symmetric operator that creates a density fluctuation. Here, $\Phi_0$
is real, nodeless, and vanishes whenever any two atoms approach each other closer than an
effective hard core diameter, ensuring that $\psi_{\mathbf{k}}$ indeed incorporates the proper many-body
correlations arising from the hard cores and associated excluded volume. In addition, $\psi_{\mathbf{k}}$
is an eigenstate of the total momentum operator with the eigenvalue $\hbar\mathbf{k}$, so that states
with different eigenvalues are orthogonal. Hence $\psi_{\mathbf{k}}$ serves as a suitable variational trial
function that yields an upper bound to the excitation energy $\epsilon_k = E_k - E_0$ as a function of
the continuous variable $k$.

In this way, Feynman obtained the elegant and simple expression $\epsilon_k \approx \hbar^2k^2/2mS(k)$,
where $S(k)$ is the static structure function that had been measured independently (for
example, with X-ray scattering). At long wavelengths ($k \to 0$), $S(k) \approx \hbar k/2ms$, where $s$ is
the speed of compressional sound waves, reproducing Landau's linear quasiparticle spectrum
$\epsilon_k \approx \hbar sk$ as $k \to 0$. For large $k$, in contrast, $S(k) \to 1$, but it has a peak at $k \approx 2$ Å$^{-1}$
associated with the nearest neighbor separation; this latter feature produces a dip in the
quasiparticle spectrum, qualitatively similar to Landau's conjectured roton minimum. In
practice, the calculated position $\Delta \approx 19.1$ K of the minimum was roughly twice the value
(9.6 K) that fit the heat capacity. Subsequently, Feynman and Cohen [32] improved the
variational trial function by including the hydrodynamic "backflow" associated with the rapid
motion of a single atom through the fluid, and they found the much lower value $\Delta \approx 11.5$ K.
Finally, Cohen and Feynman [38] used their trial function to study the inelastic neutron
scattering from superfluid helium, predicting that the scattering would be dominated by the
excitation of a single quasiparticle, allowing a direct measurement of the energy spectrum $\epsilon_k$
(subsequent measurements fully confirmed this behavior).

Separately, Feynman [27] considered states with macroscopic superfluid flow, arguing
that they must have the simple form $\psi_{\text{flow}} = \exp[i\sum_{j=1}^{N}s(\mathbf{r}_j)]\,\Phi_0$, where $s(\mathbf{r})$ is a function
that varies only slowly over an interatomic distance. The resulting single-particle current
corresponds to a velocity field $\mathbf{v}_s = \hbar\nabla s/m$, which agrees with Landau's assertion that the
superfluid velocity is necessarily irrotational. Feynman recognized that the single-valuedness
of the many-body wave function implies the quantization of circulation $\oint \mathbf{v}_s\cdot d\mathbf{s}$ in units of
$2\pi\hbar/m = h/m$, as suggested by Onsager (1949). He then considered the detailed behavior
of a single quantized vortex line, where the centrifugal barrier near the center creates a
hole of radius $a$ (a few atomic diameters); as a result, he proposed that bulk superfluid
helium rotating at an angular speed $\omega$ is filled by an array of such linear vortices with
an areal density $2\omega m/h$. Finally, he suggested that the creation of vortices can account
for the observed low values of the critical velocity for the onset of dissipation. Landau's
original analysis of the critical velocity associated with the creation of a roton suggested a
value of order 60 m/s, whereas the measured values were at least 100 times smaller and
depended on the geometrical configuration of the fluid. For flow from an aperture of lateral
width $d$, Feynman's vortex model yielded $v_c \approx (\hbar/md)\ln(d/a)$, in reasonable agreement with
experiments.
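
To give a feeling for the magnitudes involved, the two vortex formulas above can be evaluated directly. This is only an illustrative sketch in SI units: the aperture width and core radius are hypothetical values, and the logarithmic formula is an order-of-magnitude estimate rather than a precise prediction.

```python
import math

H_PLANCK = 6.62607015e-34            # J*s
HBAR = H_PLANCK / (2.0 * math.pi)    # J*s
M_HE4 = 6.6464731e-27                # kg, mass of a helium-4 atom


def vortex_areal_density(omega):
    """Vortex lines per unit area in uniformly rotating superfluid: 2*omega*m/h."""
    return 2.0 * omega * M_HE4 / H_PLANCK


def critical_velocity(d, a):
    """Order-of-magnitude estimate v_c ~ (hbar/(m*d)) * ln(d/a)."""
    return HBAR / (M_HE4 * d) * math.log(d / a)


n_v = vortex_areal_density(1.0)             # per m^2 at 1 rad/sec, about 2e7 (2000 per cm^2)
v_c = critical_velocity(1.0e-6, 1.0e-10)    # m/sec, for assumed d = 1 micron and a = 1 angstrom
```

For these assumed dimensions the estimated critical velocity is a fraction of a meter per second, far below the 60 m/s roton value, which is the qualitative point made in the commentary.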

In fact, Feynman's first substantial paper [20] on helium dealt with the $\lambda$ transition at
$T_\lambda = 2.17$ K, which signals the formation of the new phase He II (and the onset of
superfluidity); this paper is necessarily quite different from those focusing on the low temperature
behavior discussed above. Feynman expressed the exact quantum-mechanical partition
function as a path integral and then mapped it rigorously onto a classical polymer problem. The
resulting picture of superfluidity near the $\lambda$ transition relies on the appearance of macroscopic
ring exchanges that depend critically on the Bose-Einstein statistics. He showed that the
strong interactions do not qualitatively change the transition temperature $T_\lambda$. This paper
was far ahead of its time, as the detailed implementation of its theoretical program required
the development of high speed computers. Modern path integral Monte Carlo techniques (in
effect, following Feynman's formulation) have yielded a quantitative theory of bulk liquid
helium. [For a review, see, for example, D.M. Ceperley, Rev. Mod. Phys. 67, 279 (1995).]
It is shown from first principles that, in spite of the large interatomic forces, liquid He⁴ should exhibit a
transition analogous to the transition in an ideal Bose-Einstein gas. The exact partition function is written
as an integral over trajectories, using the space-time approach to quantum mechanics. It is next argued 
that the motion of one atom through the others is not opposed by a potential barrier because the others 
may move out of the way. This just increases the effective inertia of the moving atom. This permits a 
simpler form to be written for the partition function. A rough analysis of this form shows the existence of a 
transition, but of the third order. It is possible that a more complete analysis would show that the transition 
implied by the simplified partition function is actually like the experimental one. 


INTRODUCTION 


THE behavior of liquid helium, especially below the $\lambda$ transition, 
is very curious.¹ The most successful theoretical interpretations,² 
so far, have been largely phenomenological. In this paper and one or 
two to follow, the problem will be studied entirely from first 
principles. We study the quantum-mechanical behavior of strongly 
interacting atoms of He⁴. We shall try to show that the main features 
of these curious phenomena can, in fact, be understood from this point 
of view. Because of the enormous geometrical complexity involved, we 
shall not attempt to obtain useful quantitative results. The quantum 
mechanics will not supplant the phenomenological theories. It turns 
out to support them. 

In this paper we begin the study of the statistical mechanics of the 
liquid.³ 

London⁴ has proposed that the transition between liquid He I and 
liquid He II is a result of the same process which causes the 
condensation of an ideal Bose-Einstein gas. This idea could be 
criticized on the grounds that the strong forces of interaction between 
the He atoms might make the ideal gas approximation (in which these 
forces are neglected) even qualitatively incorrect. We shall argue that 
London’s view is essentially correct. The inclusion of large 
interatomic forces will not alter the central features of Bose 
condensation. 

¹ W. H. Keesom, Helium (Elsevier Publishing Company, Inc., Amsterdam, 1942). 
² An excellent summary of the theories of helium II is to be found in R. B. Dingle, Advances in Phys. 1, 112 (1952). 
³ A preliminary report on this work has been published. R. P. Feynman, Phys. Rev. 90, 1116 (1953). 
⁴ F. London, Phys. Rev. 54, 947 (1938). 

The principal point is an argument which shows that 
in a liquid-like quantum-mechanical system the strong 
interactions between particles do not prevent these 
particles from behaving very much as though they move 
freely among each other. 

The exact partition function is first written down as 
an integral over trajectories, by using the space-time 
approach to quantum mechanics. The observation 
that the atoms move very freely among each other is 
then made. This permits one to write a simpler form 
[Eq. (7)] for the partition function. This form should 
be fairly accurate, at least qualitatively. It becomes 
clear that a transition is to be expected, and that it 
involves the symmetrical statistics in an essential way. 

On the other hand, the geometrical complexity of the 
problem still prevents us, so far, from giving a very good 
estimate of the free energy behavior near the transition 
point and below. A relatively crude approach gives a 
transition like that of the ideal gas. That is, the specific 
heat is continuous, contrary to the experimental ob- 
servation that it appears to be discontinuous. Some of 
the geometrical problems which might have to be 
solved to obtain a more satisfactory solution are dis- 
cussed in an appendix (see also reference 3). 

The crude approach should, however, be quite satisfactory a little 
above the transition point. So there is no doubt that at least the 
existence of the rise of specific heat¹ of He I on cooling toward the 
$\lambda$ point can be understood from first principles. 

At the opposite extreme of very low temperatures (say below 0.5°K), 
the situation again can be partially analyzed. This is done in the 
next paper.⁶ 

⁵ R. P. Feynman, Revs. Modern Phys. 20, 367 (1948). 


EXACT EXPRESSION FOR THE PARTITION FUNCTION 


To study the thermodynamic properties we must calculate the partition 
function 

$$Q=\sum_i \exp(-\beta E_i), \tag{1}$$

where $\beta = 1/kT$ and $E_i$ are the energy levels of the system. In 
this form the calculation appears hopelessly difficult because the 
energies $E_i$ are eigenvalues of such a complex Hamiltonian $H$. The 
expression for $Q$ is equivalent to the trace of the operator 
$\exp(-\beta H)$. In Eq. (1) the trace is written in a representation 
in which $H$ is diagonal. We shall prefer to use the coordinate 
representation to describe the trace. 

To illustrate how this is done, we take the example of a 
one-dimensional system, of coordinate $x$ and Hamiltonian 
$p^2/2m + V(x) = H$. The trace of $\exp(-\beta H)$ is then 
$Q = \int dz\, \langle z|e^{-\beta H}|z\rangle$. The matrix element 
$\langle z|e^{-\beta H}|z\rangle$ is similar in form to the matrix 
element $\langle z|\exp(-itH/\hbar)|z\rangle$ which represents the 
amplitude that the system initially at $x=z$ is at time $t$ also at the 
point $x=z$. This latter is⁵ the sum over all paths [signified by 
$\int\cdots\mathcal{D}x(t)$] which go from $z$ to $z$ of 
$\exp(iS/\hbar)$, where $S$ is the action 
$\int_0^t [\tfrac{1}{2}m\dot{x}^2 - V(x(t))]\,dt$. If we replace 
$it/\hbar$ by $\beta$, we are led to expect 

$$\langle z|e^{-\beta H}|z\rangle = \int_{tr} \exp\left\{-\int_0^\beta\left[\frac{m}{2\hbar^2}\left(\frac{dx}{du}\right)^2 + V(x(u))\right]du\right\}\mathcal{D}x(u), \tag{2}$$

the variable $u = it/\hbar$ replacing $t$, and the various signs 
adjusted accordingly. The integral $\int_{tr}$ is to be taken on all 
trajectories such that $x(0)=z$ and $x(\beta)=z$. It is easily verified 
that Eq. (2) is exactly correct. The normalization of the path integral 
is to be such that 

$$\int_{tr'} \exp\left[-\int_0^\beta \frac{m}{2\hbar^2}\left(\frac{dx}{du}\right)^2 du\right]\mathcal{D}x(u) = \left(\frac{m}{2\pi\beta\hbar^2}\right)^{1/2} \exp\left[-\frac{m}{2\beta\hbar^2}(z - z')^2\right], \tag{3}$$

where the trajectory $tr'$ runs from $x(0)=z$ to $x(\beta)=z'$. The 
integral of (2) with respect to $z$ then gives the partition function. 
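The free-particle normalization in Eq. (3) can be checked numerically. The sketch below is an illustration only, not part of the original paper: it assumes units with $m = \hbar = 1$, and the function names `free_kernel` and `compose` are hypothetical. It verifies that two kernels of the form (3), joined by an integral over the intermediate point, reproduce a single kernel with $\beta = \beta_1 + \beta_2$, which is the matrix-multiplication property of $\exp(-\beta H)$ used later in the paper.

```python
import math

def free_kernel(z, z_prime, beta, m=1.0, hbar=1.0):
    """Right-hand side of Eq. (3): free-particle element of exp(-beta*H)."""
    pref = math.sqrt(m / (2.0 * math.pi * beta * hbar**2))
    return pref * math.exp(-m * (z - z_prime) ** 2 / (2.0 * beta * hbar**2))

def compose(z, z_prime, beta1, beta2, half_width=30.0, steps=6000):
    """Trapezoidal estimate of the integral over the intermediate point y
    of free_kernel(z, y, beta1) * free_kernel(y, z_prime, beta2)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * free_kernel(z, y, beta1) * free_kernel(y, z_prime, beta2)
    return total * h

beta1, beta2 = 0.7, 1.3            # arbitrary illustrative values
lhs = compose(0.2, -0.5, beta1, beta2)
rhs = free_kernel(0.2, -0.5, beta1 + beta2)
print(lhs, rhs)                    # the two numbers should agree closely
```

The agreement depends on the $\beta^{-1/2}$ prefactor; with any other power of $\beta$ the two sides would differ by a $\beta$-dependent factor.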

To apply this to liquid helium two modifications are necessary. First, 
instead of one variable, we have $3N$ variables which we take as the 
$N$ three-space coordinate vectors $x_i$ of each of the $N$ atoms 
($i = 1$ to $N$). We designate the entire set of coordinates by $x^N$ 
and the integral over them all by $d^N x_i = dx_1\,dx_2\cdots dx_N$. 
The initial and final values of these we call $z_i$. Secondly, He⁴ 
atoms obey symmetrical statistics. The trace of $\exp(-\beta H)$ is to 
be taken only over symmetrical wave functions. This means that if the 
initial coordinates are $x_i(0)=z_i$, the final coordinates need not be 
the same, but may be some permutation of these (signified by $Pz_i$). 
That is, 

$$Q = \frac{1}{N!}\sum_P \int d^N z_i \int_{trP} \exp\left\{-\int_0^\beta\left[\frac{m}{2\hbar^2}\sum_i\left(\frac{dx_i}{du}\right)^2 + \sum_{i<j} V(x_i - x_j)\right]du\right\}\mathcal{D}^N x_i(u), \tag{5}$$

where the integral $\int_{trP}$ is taken over all trajectories $x_i(u)$ 
of all the particles such that $x_i(0) = z_i$, $x_i(\beta) = Pz_i$. 
That is, the final coordinates $x_i(\beta)$ may now be some permutation 
$P$ of the initial coordinates $z_i$. The sum is taken over all 
permutations $P$ and the integral over all configurations $z_i$. 

⁶ R. P. Feynman, Phys. Rev. 91, 1301 (1953). 

In Eq. (5), $m$ is the mass of a helium atom, and $V(R)$ is the mutual 
potential of a pair of He atoms separated by $R$. The forces between He 
atoms are, very likely, fairly accurately two-body forces. This 
potential is given by Slater and Kirkwood.⁷ There is a weak attraction 
of maximum depth (of energy equivalent to $kT$ at $T=7$°K) at radius 
about 3.0 Å. The atomic volume at the transition is (3.6 Å)³. At 3.6 Å, 
$V(R)$ is about equivalent to $kT$ for $T=3$°K. There is, therefore, a 
weak attraction at the average atomic distance. There is a violent 
repulsion if the atoms approach more closely than 2.6 Å ($V=0$ at 
2.6 Å). 

The expression (5) is an exact⁸ quantum-mechanical 
expression for the partition function (even though no 
imaginary unit i appears). We shall use it to develop 
a qualitative understanding of this function for liquid 
helium. 

The quantity $u$ is of course not the time. However, we shall obtain a 
vivid representation of (5) by imagining that it is the time. We can 
say that at one time 0 the coordinates $x_i(0)$ of all the atoms form 
an initial configuration $z_i$, and that as time $u$ proceeds the 
particles move about [$x_i(u)$] in such a manner that at the time 
$\beta$ the configuration of atoms appears to be the same (although in 
fact some of the atoms may have been interchanged). Each mode of motion 
is weighed by a negative exponential of the time integral of the energy 
required for the motion, and the sum is taken for all such motions. 
Finally an average (or rather integral) is taken over all possible 
initial configurations $z_i$. 

We can see immediately that motions which involve too large a 
displacement in the time $\beta$ have little weight because of the high 
kinetic energy required. Likewise, motions in which the atoms come so 
close as to appreciably penetrate the radius of their repulsion are of 
small importance because of the large potential energy which would 
result. For this reason, also, initial configurations for which the 
atoms overlap, that is, have centers so close that they would repel, 
contribute only a small amount. The method of approximation which we 
shall apply to (5) is to neglect the contributions from motions 
$x_i(u)$ and configurations $z_i$ which give small contributions, and 
to study more carefully only those motions which give the larger 
contributions to the total in (5). That is, we shall have motions in 
which the atoms do not move too fast or far in the time $\beta$ and in 
which the atoms never overlap. 

⁷ J. C. Slater and J. G. Kirkwood, Phys. Rev. 37, 682 (1931). 
⁸ In so far as the forces can be represented as two-body. 

We emphasize again that these “motions” must not 
be construed as a real description of what the atoms are 
doing. It is simply a formal description of the expression 
for the partition function. An expression “the atom does 
not move too far in the time $\beta$” does not refer to a real 
motion as $u$ is not time, but is $it/\hbar$. The true behavior 
of the atoms may have some analogy to the description 
of the formula (5), but such an analogy need not 
concern us here. Our reason to continue to call $u$ and $\beta$ 
“time” is to help to make our arguments as vivid as 
possible so that intuition will be most effective. 


THE CHARACTER OF THE IMPORTANT 
TRAJECTORIES 


Consider a particular motion in (5) in which some atom $i$ moves to the 
site initially occupied by atom $j$. Call the displacement 
$\mathbf{a} = z_j - z_i$. The atom $j$ must, of course, move to some 
other site to leave room for $i$. The effect of the motion of $j$ we 
will associate with atom $j$. We study here the contribution to be 
expected just from the displacement of the single atom $i$ by a 
distance $a$. 

Near the transition temperature displacements larger than about $d$, 
the atomic spacing (cube root of atomic volume, 3.6 Å), are not very 
important [$\exp(-md^2/2\beta\hbar^2) = 0.3$ at 2.2°K]. Nevertheless we 
will try to get an idea of the behavior for larger displacements. These 
will be useful at lower temperatures. Actually our considerations apply 
to displacements of any size. 
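The quoted weight of about 0.3 for a displacement of one atomic spacing can be reproduced directly. The sketch below is an illustration rather than part of the original text; it assumes the bare He⁴ atomic mass (not the effective mass $m'$) and modern SI values of the constants, and the function name `displacement_weight` is hypothetical.

```python
import math

HBAR = 1.054571817e-34                  # reduced Planck constant, J s
K_B = 1.380649e-23                      # Boltzmann constant, J / K
M_HE4 = 4.002602 * 1.66053906660e-27    # mass of one He-4 atom, kg (assumed bare mass)

def displacement_weight(distance_m, temperature_k, mass_kg=M_HE4):
    """Gaussian factor exp(-m d^2 / (2 beta hbar^2)) with beta = 1/(k T)."""
    beta = 1.0 / (K_B * temperature_k)
    return math.exp(-mass_kg * distance_m**2 / (2.0 * beta * HBAR**2))

# One atomic spacing (3.6 angstrom) at 2.2 K.
weight = displacement_weight(3.6e-10, 2.2)
print(weight)                           # roughly 0.3
```

Because the exponent scales as $d^2 T$, doubling the displacement at the same temperature reduces the weight to roughly the fourth power of this value, which is why long single-atom displacements matter only at lower temperatures.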

Suppose, then, atom $i$ must make a translation $\mathbf{a}$ of length 
$a$. We make this, for example, to be nearly along a straight line. Our 
arguments will apply for any other route. 

The central problem is, what is the effect of the potentials of 
interaction on this translation? As a simple model which retains the 
essential features imagine the atoms as hard impenetrable spheres. We 
are, during a time $\beta$, to move atom $i$ from $z_i$ to 
$z_j = z_i + \mathbf{a}$ and at the end to leave all the other atoms in 
their original positions. The atoms may not overlap at any time. 

There may be atoms in the direct line from $z_i$ to $z_j$. 
Nevertheless, a moment’s reflection shows that they will not offer a 
real potential barrier to the translation of atom $i$. 

It is evident that it is possible to place atom $i$ at any position on 
the route from $z_i$ to $z_j$, provided we readjust the positions of 
the other atoms to make room for it. In the readjusted positions the 
total potential energy of all the atoms can be made to be very nearly 
equal to the potential energy of the original configuration. Therefore, 
atom $i$ can be moved to any intermediate position without violating 
any repulsive potential, in fact, without any appreciable modifications 
of potential energy at all. It is only necessary to move the other 
atoms around out of the way as atom $i$ moves along. When $i$ reaches 
the final destination $z_j$, the other atoms (except $j$ of course, as 
noted above) may all come back to their original positions (or to some 
permutation thereof⁹). 

The readjustment of the other atoms means that their coordinates 
$x_k(u)$ change with time. They contribute just kinetic energy in the 
exponent in $Q$. Beside the kinetic energy $\tfrac{1}{2}m(a/\beta)^2$ 
needed to move atom $i$ a distance $a$ in time $\beta$, we have also to 
add the kinetic energy of the readjusting atoms. This we can expect 
will also vary directly as the square of the velocity of atom $i$. The 
net effect is an energy of the form $\tfrac{1}{2}m'(a/\beta)^2$, where 
$m'$ is an effective mass, somewhat larger than the mass of a single 
atom $m$. The difference represents the effective inertia of the atoms 
which are readjusting. 

The time integral of the energy needed for readjustment varies with $a$ 
and $\beta$ as $a^2/\beta$. This is clear for small displacements of 
$i$, for then only a few atoms shift, and they do this with a velocity 
proportional to that of $i$. For large displacements ($a \gg d$) the 
same form, of course, results. To verify this, imagine that as atom $i$ 
moves at velocity $v\,(=a/\beta)$ the time it passes near a particular 
atom is of order $d/v$. This atom must adjust through distances of 
order $d$ in this time, or move at speed about $v$. The time integral 
of energy needed for this passage is about $mv^2(d/v) = mdv$. The 
number of such atoms which must be jostled to move a total distance $a$ 
is of order $a/d$. Thus the total time integral of energy of 
readjustment for all these $a/d$ atoms varies as 
$mdv \cdot a/d = mva = ma^2/\beta$. It varies with $a$ and $\beta$ in 
the required manner. 

The effect of the other atoms is not to offer a potential barrier (time 
integral varying as $a\beta$) but a kind of kinetic energy barrier 
(time integral varying as $a^2/\beta$). The effect of the interactions 
is taken into account by changing the effective mass of a given moving 
atom. 

⁹ The effect of a moving atom in permuting other atoms might have to be 
considered in more detail if we were to apply these ideas to the case 
of Fermi-Dirac statistics. It may mean that the $m'$ (discussed further 
on) is somewhat larger in that case. 

To get some idea of the order of $m'$, we recall that the effective 
mass of a sphere moving through an ideal fluid of the same density is 
classically $\tfrac{3}{2}m$, the extra $\tfrac{1}{2}$ being the energy 
of motion of the fluid making way for the sphere. The effect of the 
attractive forces may increase this somewhat. We may expect $m'$ not to 
be very much greater than $m$, perhaps not more than 2 or 3 $m$. 
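The classical figure of $\tfrac{3}{2}m$ can be recovered from the standard potential-flow result for a sphere. The following is a short worked sketch, not taken from the original paper; the symbols $\rho_f$ (fluid mass density), $R$ (sphere radius), $v$ (sphere speed) and $M_{\mathrm{disp}}$ (mass of displaced fluid) are introduced here for illustration, and the fluid is assumed incompressible, inviscid and irrotational.

```latex
% Kinetic energy carried by the fluid around a sphere moving at speed v
% (standard added-mass result for potential flow):
T_{\mathrm{fluid}} = \tfrac{1}{2}\left(\tfrac{1}{2} M_{\mathrm{disp}}\right) v^{2},
\qquad
M_{\mathrm{disp}} = \rho_f \cdot \tfrac{4}{3}\pi R^{3}.

% If the sphere has the same density as the fluid, M_disp = m, so
T_{\mathrm{total}} = \tfrac{1}{2} m v^{2} + \tfrac{1}{4} m v^{2}
                   = \tfrac{1}{2}\left(\tfrac{3}{2} m\right) v^{2},
\qquad\text{i.e.}\qquad m' = \tfrac{3}{2}\, m .
```

The extra $\tfrac{1}{2}m$ is purely kinetic, which is the point of the argument: making way for the moving atom costs inertia, not potential energy.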

The effect of the relatively weak attractive potentials may be to alter 
the motions a bit, in that adjacent atoms tend to stick together a 
little. Thus the atom $i$ may have a tendency to drag some atoms with 
it from time to time, possibly increasing $m'$ a little more. On the 
other hand, in He the zero-point energy¹⁰ is high enough to shake these 
others loose readily. If the potential were much stronger the group 
attraction might become accumulative, raising $m'$ very much. It is 
possible that this would result in solidification. 

For short displacements $a$ of order $d$, proportionately less 
adjustment need be made, so that it is likely that $m'$ may be somewhat 
less. For high velocities, it may represent less energy to violate the 
real potential restrictions a little. Thus the readjustments need not 
be complete, so that $m'$ again may decrease a little with velocity 
$a/\beta$, approaching $m$ as $a/\beta \to \infty$. 

As this is meant to be a first approach to the prob- 
lem, we shall not attempt to calculate the m’. The geo- 
metrical complexity is very great. Further, we shall 
neglect the variation of m’ with a and velocity. It is to 
be expected that this neglect may not alter our con- 
clusions qualitatively. It is always possible, later, to 
include such finer details. Nor shall we discuss the 
variation (expected rise) of m’ with increasing density 
of the fluid. 

For every trajectory the atom acts like a free particle of effective 
mass $m'$. Hence we may take the integral over all paths $x_i$ for atom 
$i$ to go a distance $a$ to be proportional to 

$$\left(\frac{m'}{2\pi\beta\hbar^2}\right)^{3/2} \exp\left(-\frac{m'a^2}{2\beta\hbar^2}\right). \tag{6}$$

The normalization factor has been written as 
$(m'/2\pi\beta\hbar^2)^{3/2}$ for convenience. That it varies as 
$\beta^{-3/2}$ may be shown by dimensional analysis [compare Eq. (3)]. 
Actually a change in this factor will just change the partition 
function by a factor. It will be easiest to discuss the normalization 
of the entire partition function. 

Therefore, we can approximate $Q$ by 

$$Q = K_\beta \left(\frac{m'}{2\pi\beta\hbar^2}\right)^{3N/2} \frac{1}{N!}\sum_P \int \exp\left[-\frac{m'}{2\beta\hbar^2}\sum_i (z_i - Pz_i)^2\right]\rho(z_1, z_2, \cdots, z_N)\, d^N z_i. \tag{7}$$

The factor $K_\beta$ we shall estimate later. The function 
$\rho(z_1, \cdots, z_N) = \rho(z^N)$ represents a density associated 
with each configuration. It is discussed in the next section. This 
expression (7) for $Q$ may be rewritten, using Fourier transforms, as 

$$Q = \int F(k^N) \exp\left[-\frac{\beta\hbar^2}{2m'}\sum_i k_i^2\right] d^N k_i\,(2\pi)^{-3N}, \tag{7a}$$

where 

$$F(k^N) = K_\beta (N!)^{-1} \int \sum_P \exp\left[i\sum_i k_i\cdot(Pz_i - z_i)\right]\rho(z^N)\, d^N z_i.$$

This form is especially useful near absolute zero, but we will not need 
it in this paper. 

¹⁰ The zero-point energy referred to appears in the integral for $Q$ in 
the following guise. Suppose we restrict the motion of a certain atom 
$k$ so that, for example, it tries to move along close to $i$ to take 
advantage of some extra attractive potential between $k$ and $i$. Then 
the trajectories of $k$ are restricted, and we lose a great deal in the 
integral over possible paths of $k$ because we are not adding 
contributions from very many paths. We lose “volume in path-space.” 
This will suffice to offset the attraction if, in the conventional 
language, the zero-point motion is sufficiently large. 

It should be emphasized that the argument which 
leads to the free particle approximation for the motion 
of an atom is of greater generality. The argument re- 
sults simply from a consideration of the limitations to 
the true trajectories which result from the interatomic 
potentials. It therefore applies as well to the nondiagonal 
element of $\exp(-\beta H)$ as to the diagonal element 
which appears in $Q$. Likewise, it is applicable 
to the true quantum-mechanical kernel, which is the 
nondiagonal element of $\exp(-itH/\hbar)$. The imaginary 
weights in this case also restrict the atoms not to overlap 
at low energy, etc. 

This principle may have uses in other branches of 
physics, for example, in nuclear physics. Here there is 
the puzzling fact that single nucleons often act like 
independent particles in spite of strong interactions. 
The arguments we have made for helium may apply to 
this case also. 


THE CONFIGURATIONS OF IMPORTANCE 


Not all configurations $z^N$ are to be weighed equally. If $z_i$ and 
$z_j$ are closer than the distance at which strong repulsion sets in 
(2.6 Å) the configuration should be given very little weight [i.e., 
$\rho$ in (7) is nearly zero if atoms overlap]. We shall discuss this 
effect first for the case of low temperature ($\beta$ large). 

Suppose initially in (5) two atoms overlap, say by a distance $x$, and 
suppose that this results in an extremely high (relative to $1/\beta$) 
potential $V$. If they move a distance $x$ further apart, suppose $V$ 
goes to 0, and take $V$ independent of $x$ for simplicity. During the 
interval $u=0$ to $\beta$, if the atoms remain overlapped for a time 
$\tau$, the contribution is the negative exponential of $V\tau$. This 
contribution is extremely small unless $\tau$ is very short (if 
$V\beta \gg 1$ then $\tau \ll \beta$). The most important trajectories 
are then those that release $V$ as quickly as possible. This can only 
be done with a high kinetic energy (velocity $x/\tau$). Thus the 
integral of energy has the value $m'x^2/2\tau\hbar^2 + V\tau$, which is 
least if $\tau = (m'x^2/2V\hbar^2)^{1/2}$, in which case it has the 
value $(2m'V)^{1/2}x/\hbar$. The contribution varies as 
$\exp[-(2m'V)^{1/2}x/\hbar]$. This is just the quantum-mechanical 
penetration factor. In $\rho$ it appears twice, for again we must get 
into the overlapped condition at the end of the interval $\beta$. This 
argument fails if $\beta$ does not exceed $\tau$. In that case a larger 
penetration results. It is due, of course, to the high kinetic energy 
that such a small $\beta$ implies. For the large repulsions $V$ at low 
temperatures involved here this penetration is very small. 
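The minimization behind the penetration factor is easy to confirm numerically. The sketch below is illustrative only: it assumes units with $\hbar = 1$, the sample values of $m'$, $x$ and $V$ are arbitrary, and the helper names `action` and `golden_min` are hypothetical. It minimizes $m'x^2/2\tau\hbar^2 + V\tau$ over $\tau$ by a golden-section search and compares the result with the closed forms $\tau = (m'x^2/2V\hbar^2)^{1/2}$ and $(2m'V)^{1/2}x/\hbar$.

```python
import math

def action(tau, m_eff, x, V, hbar=1.0):
    """Time integral of energy: kinetic cost of escaping in time tau plus V*tau."""
    return m_eff * x**2 / (2.0 * tau * hbar**2) + V * tau

def golden_min(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d       # minimum lies in [a, d]
        else:
            a = c       # minimum lies in [c, b]
    return 0.5 * (a + b)

m_eff, x, V = 1.5, 0.4, 50.0    # arbitrary illustrative values, hbar = 1

tau_num = golden_min(lambda t: action(t, m_eff, x, V), 1e-4, 1.0)
tau_exact = math.sqrt(m_eff * x**2 / (2.0 * V))
s_exact = math.sqrt(2.0 * m_eff * V) * x

print(tau_num, tau_exact)
print(action(tau_num, m_eff, x, V), s_exact)
```

The minimum value is linear in the overlap distance $x$, which is what makes the resulting weight an exponential penetration factor rather than a Gaussian.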

In addition, $\rho$ would not be quite uniform even if there is no 
overlap of the atoms, and even if they are considered as impenetrable 
spheres. In fact, $\rho$ would be larger if atoms are well spaced than 
if they are nearly adjacent (for large $\beta$). If two atoms are 
adjacent initially, the available paths are limited to those which move 
them apart, for they must not come to overlap. This decreases the 
effective volume of path space for a short time. Actually this effect 
is offset partly by the actual attractive potential which results upon 
the closer approach of the two atoms. Thus $\rho$ represents the 
effects of short-time adjustments (times $\ll \beta$) while the 
longer-time effects¹¹ are contained in the exponential factor in (7) 
[that is, expression (12) below]. For large $\beta$, low $T$, $\rho$ 
can be taken as nearly temperature independent, and the main 
temperature dependence comes from the other factor (12). This $\rho$ as 
$\beta \to \infty$ is the density corresponding to the ground-state 
wave function. For approximate purposes we can take it to be simply the 
density function $\rho$ for a classical gas of impenetrable atoms of 
diameter $b$. That is, $\rho = 0$ if any two $z_i$’s are closer to each 
other than $b$, and is 1 otherwise. This neglects the variations with 
distance due to the quantum-mechanical effect discussed above of 
restricted path-space volume, and due to the attractive part of the 
potential. 

For high temperatures, the exponential terms in (7), representing 
diffusion, are unimportant, and $\rho$ should approach the classical 
distribution function. Now the attractive forces are weak and 
unimportant so $\rho$ again can be roughly represented by an 
impenetrable sphere model. The radius $b$ should be somewhat smaller. 
We shall neglect this variation of $b$ with temperature. 

To summarize, $\rho$ is qualitatively similar to the density 
distribution of a classical gas. It changes somewhat with temperature. 


PROPERTIES OF THE PARTITION FUNCTION 


A partition function has several formal properties, and we may test our 
approximate expression (7) to see how well it satisfies these 
conditions. Another important function is the nondiagonal element of 
$\exp(-\beta H)$. That is, 
$G_\beta(z', z) = \langle z'|\exp(-\beta H)|z\rangle$, or 

$$G_\beta(z^{\prime N}, z^N) = (N!)^{-1}\sum_P \int_{trP'} \exp\left\{-\int_0^\beta\left[\frac{m}{2\hbar^2}\sum_i\left(\frac{dx_i}{du}\right)^2 + \sum_{i<j} V(x_i - x_j)\right]du\right\}\mathcal{D}^N x_i(u), \tag{8}$$

the integral being taken over all trajectories $trP'$ such that 
$x_i(0)=z_i$, $x_i(\beta)=Pz_i'$. The final configuration $z_i'$ may 
differ from the initial configuration. Of course, 
$Q = \int G_\beta(z^N, z^N)\, d^N z_i$. 

¹¹ There is a kind of distortion that takes a long time $\tau$ to 
release, namely, a general increase in density over a large area. This 
restricts the path-space volume for each atom in the area and results 
in a factor $e^{-\beta E}$ for each particle, where $E$ is the excess 
energy per particle induced by compression. The energy $E$ can only be 
released by moving many particles, distributed over the area. These 
density fluctuations are sound waves. If the wavelength of the 
fluctuation is $\lambda = 2\pi/K$, the time needed to release it is 
$\tau = 1/\hbar\omega = 1/\hbar cK$, where $c$ is the speed of sound. 
This exceeds $\beta$ if $\lambda > 2\pi\hbar c/kT$ or 
$\lambda > 2\pi \cdot 8$ Å for 2.2°K. (This exceeds the diffusion 
distance $d \approx 3.6$ Å even at 2.2°K, so will not be of concern to 
us near the transition.) Thus (7) is incomplete in that it does not 
correctly describe long-wave sound fluctuations. This matter is 
discussed in a subsequent paper. 

For large $\beta$ we have given an argument for the behavior of the 
function $\rho(z^N)$, which represented it as the square of a function, 
say $\phi(z^N)$. One factor was for leaving an unfavorable (say 
overlapping) configuration. The second was for entering it again. At 
low temperature $\phi(z^N)$ is the ground-state wave function. The same 
arguments give for $G_\beta(z^{\prime N}, z^N)$ the approximate 
expression, 

$$G_\beta(z^{\prime N}, z^N) = K_\beta (N!)^{-1}\sum_P \phi(z^{\prime N})\,\phi(z^N)\left(\frac{m'}{2\pi\beta\hbar^2}\right)^{3N/2}\exp\left[-\frac{m'}{2\beta\hbar^2}\sum_i (z_i - Pz_i')^2\right]. \tag{9}$$


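Since $\exp(-\beta_1 H)\exp(-\beta_2 H) = \exp[-(\beta_1+\beta_2)H]$, 
we have the condition of matrix multiplication, namely, that 

$$G_{\beta_1+\beta_2}(z^{\prime N}, z^N) = \int G_{\beta_1}(z^{\prime N}, z^{\prime\prime N})\, G_{\beta_2}(z^{\prime\prime N}, z^N)\, d^N z_i''.$$

This requires that ($\beta = \beta_1 + \beta_2$) 

$$K_\beta\left(\frac{m'}{2\pi\beta\hbar^2}\right)^{3N/2}\sum_P \exp\left[-\frac{m'}{2\beta\hbar^2}\sum_i (Pz_i' - z_i)^2\right] = K_{\beta_1}K_{\beta_2}(N!)^{-1}\left(\frac{m'}{2\pi\beta_1\hbar^2}\right)^{3N/2}\left(\frac{m'}{2\pi\beta_2\hbar^2}\right)^{3N/2}$$
$$\times \sum_P\sum_{P'}\int \exp\left[-\frac{m'}{2\beta_1\hbar^2}\sum_i (Pz_i' - z_i'')^2 - \frac{m'}{2\beta_2\hbar^2}\sum_i (P'z_i'' - z_i)^2\right][\phi(z^{\prime\prime N})]^2\, d^N z_i''.$$

Now, in the exponent we can relabel $z$ (by permuting the names) as 
$P'z$, since the sum is on all $i$. That is, 
$\sum_i (Pz_i' - z_i'')^2 = \sum_i (PP'z_i' - P'z_i'')^2$. Now call 
$PP' = P''$ and the sum on all $P$ is equivalent to a sum on all $P''$. 
Finally, since $z_i''$ are variables of integration in a symmetrical 
factor {$[\phi(z^{\prime\prime N})]^2 = \rho(z^{\prime\prime N}) = \rho(P'z^{\prime\prime N})$ 
for any $P'$}, the $z^{\prime\prime N}$ integral does not depend on 
$P'$, and $\sum_{P'}$ just gives a factor $N!$. 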

Now the weight function $\rho(z^{\prime\prime N})$ prevents various 
$z_i''$ from being too close together. If, however, $\beta_1$ and 
$\beta_2$ are so large (that is, the temperatures so low) that 
$\exp(-m'd^2/2\beta_1\hbar^2)$ is fairly close to 1, the variations in 
the exponents due to this restriction are very small. [That is, in the 
integrand, $\rho(z^{\prime\prime N})$ varies rapidly while the other 
factor is smooth.] Therefore, we replace $\rho(z^{\prime\prime N})$ by 
an average value and integrate over all $z^{\prime\prime N}$. The 
effect of $\rho(z^{\prime\prime N})$ just restricts the volume 
available to the configuration variables. Let us call 

$$V_N = \int \rho(z^N)\, d^N z_i, \tag{10}$$

so that the average value of $\rho$ is $V_N/V^N$. The integral on 
$z_i''$ is now easy, and we find 

$$K_{\beta_1+\beta_2} = K_{\beta_1}K_{\beta_2}V_N/V^N.$$

[This verifies our choice of the $\beta^{-3/2}$ dependence in (6).] 
This means that $K_\beta$ must have the form 

$$K_\beta = V^N V_N^{-1} e^{-\beta E_0}, \tag{11}$$

where $E_0$ is a constant. Such a constant means, in $Q$, a constant 
energy (the energy at absolute zero). We will not try to determine this 
energy. Let us measure energies above this as a zero level. Then it can 
be ignored. Our final partition function is then (7) with 
$K_\beta = V^N/V_N$. 

For extremely large $\beta$, $Q$ should approach 1 from its original 
definition as $\sum_i e^{-\beta E_i}$ and the choice $E_0 = 0$. For 
such large $\beta$, the sum on permutations $P$ means that $Pz_i$ goes 
successively over every site, while the exponential 
$\exp[-(m'/2\beta\hbar^2)(z_i - Pz_i)^2]$ varies smoothly. It is 
approximated by writing it as 
$\exp[-(m'/2\beta\hbar^2)(z_i - z_i')^2]$ and integrating over all 
$z_i'$ but dividing by the atomic volume $V_a = V/N$ for each $z_i'$. 
However, this does not take into account the restriction that all the 
$z_i'$ are on different sites. So an additional $N!/N^N$ is needed. 
Thus the total factor is $N!/(NV_a)^N$ or $N!/V^N$. The integrals give 
$(2\pi\beta\hbar^2/m')^{1/2}$ per degree of freedom, so we see that $Q$ 
approaches 

$$Q \approx (K_\beta/V^N)\int \rho(z^N)\, d^N z_i = 1$$

as $\beta \to \infty$, as required. 

This value of $K_\beta$ was obtained by an argument involving large 
$\beta$. Let us study its behavior for small values of $\beta$. For 
small $\beta$ (high $T$) no permutation is important in (7) except the 
identity. For no atoms are closer than $b = 2.6$ Å, and 
$(m'/2\beta\hbar^2)(z_i - Pz_i)^2$ would be at least 
$b^2 m'/2\beta\hbar^2$, if $Pz_i \neq z_i$. For small $\beta$ this 
results in a large negative exponent. Thus $Q$ approaches the value 

$$Q = K_\beta (N!)^{-1}\left(\frac{m'}{2\pi\beta\hbar^2}\right)^{3N/2}\int \rho(z^N)\, d^N z_i.$$

The correct limit, according to the classical theory, should be 

$$Q = (N!)^{-1}\left(\frac{m}{2\pi\beta\hbar^2}\right)^{3N/2}\int \rho(z^N)\, d^N z_i.$$

Since $m'$ approaches $m$, $K_\beta$ must approach 1 as 
$\beta \to 0$. This means that $K_\beta$ must be a function of $\beta$ 
which varies from $V^N/V_N$ to 1 as $\beta$ varies from large to small 
values. 

An accurate quantitative analysis of this problem would require close 
attention to the variation with temperature and density of $m'$, of 
$K_\beta$ and of $b$ (or, more completely, of $\rho$). 


EXISTENCE OF THE TRANSITION 


We can use this partition function (7) and the ideas 
associated with it to understand many of the properties 
of liquid helium. The behavior of the liquid at very low 
temperature (below 0.5°K) will concern us in the 
following paper.⁶ Here we will study the behavior in the 
region of a few degrees and shall show that a transition 
should occur. 

In the qualitative study of such a transition we need not concern 
ourselves with the continuous variations in the effective constants 
$m'$, $K_\beta$, $b$. It might be well to remark, however, that at 
2.2°K the expression $mx^2/2\beta\hbar^2$ is unity for $x = 3.4$ Å. 
This is just the order of the average spacing of the atoms. Therefore, 
we are not going to be involved in very long displacements, and it may 
be that $m'$ does not differ too much from $m$. 

It is not hard to understand that (7) gives a transition. If $\rho$ 
were a constant it would be the same as the partition function for an 
ideal gas. The fact that $\rho$ is not perfectly uniform cannot change 
this much. 

To see in more detail how this transition arises, consider the factor 

$$\sum_P \exp\left[-\frac{m'}{2\beta\hbar^2}\sum_i (z_i - Pz_i)^2\right] \tag{12}$$

in the partition function (7). 

Each permutation may be divided into cycles. A cycle of length $s$ is a 
chain of permutations, such as 1 goes to 2, 2 goes to 3, 3 goes to 4, 
etc. until $s-1$ goes to $s$ and finally $s$ goes to 1. Such a cycle 
contributes to a term in (12) the factor 

$$\exp\left[-\frac{m'}{2\beta\hbar^2}\left(z_{12}^2 + z_{23}^2 + \cdots + z_{s1}^2\right)\right], \tag{13}$$

where $z_{ij} = z_i - z_j$ and $z_1$, $z_2$, etc., are the positions of 
the particular atoms in the cycle. The total contribution from a given 
permutation is the product of all these factors, one from each of its 
cycles. For a given configuration we are to sum such a product over all 
permutations, that is, over all possible ways of laying out cycles on 
the configuration. 

Consider a permutation of a certain “type,” that is, having a certain 
number of each kind of cycle. That is, $P$ has $n_1$ cycles of length 1 
(i.e., $n_1$ atoms are not permuted), $n_2$ cycles of length 2, 
$\cdots$, $n_s$ of length $s$, etc. The total number of atoms is $N$, 
so that 

$$N = \sum_s s\, n_s. \tag{14}$$

To these cycles there correspond $n_1$ atoms, $n_2$ polygons of 2 
sides, $n_3$ triangles, $\cdots$, etc. drawn on the configuration, and 
each contributes its factor (13). 

Next we may sum over all permutations of the same type. This means that 
each polygon will change its shape and location (but not its number of 
sides) as we go from one permutation to another in the sum. Eventually 
a given polygon can be considered as taking up all possible forms; that 
is, a polygon of $s$ sides will have had every possible set of $s$ atom 
sites for its vertices.¹² As a given polygon changes, of course, the 
others must change too, for no atom may be a member of more than one 
polygon in any given permutation $P$. This presents an enormously 
complicated mathematical problem. 
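The bookkeeping of cycles and permutation types can be made concrete for a small number of atoms. The sketch below is an illustration, not part of the original paper; the function names `cycles`, `cycle_type` and `count_of_type` are hypothetical, and atoms are labeled from 0. It splits each permutation into cycles, records its type $\{s: n_s\}$, and checks by brute force that the number of permutations of each type equals the standard combinatorial count $N!/\prod_s n_s!\, s^{n_s}$.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def cycles(perm):
    """Split a permutation into cycles; perm[i] is the site atom i moves to."""
    seen, out = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cyc, k = [], start
        while k not in seen:
            seen.add(k)
            cyc.append(k)
            k = perm[k]
        out.append(cyc)
    return out

def cycle_type(perm):
    """Return the type of a permutation as {cycle length s: number of cycles n_s}."""
    return dict(Counter(len(c) for c in cycles(perm)))

def count_of_type(n_by_length, N):
    """Number of permutations of N atoms with n_s cycles of each length s."""
    denom = 1
    for s, n_s in n_by_length.items():
        denom *= factorial(n_s) * s**n_s
    return factorial(N) // denom

N = 5
# Tally every permutation of N atoms by its type.
tally = Counter(
    tuple(sorted(cycle_type(p).items())) for p in permutations(range(N))
)
for type_key, observed in sorted(tally.items()):
    print(dict(type_key), observed, count_of_type(dict(type_key), N))
```

Each type satisfies the constraint (14), $\sum_s s\,n_s = N$, and the counts over all types add up to $N!$. The brute-force enumeration is only feasible for very small $N$, which is why the paper turns to an averaging assumption instead.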

We shall try to simplify it by an assumption that 
the various geometrical forms that a given polygon 
can take are roughly independent of the shapes of the 
other polygons. That is, we shall assume that the con- 
tribution of a polygon of a given size is the average for 
such a polygon over all possible forms the polygons can 
take without restriction. That is, we assume the average 
factor contributed by a given polygon does not depend on 
what type the other polygons are. This assumption is 
probably not sufficiently accurate to give an exact 
description of the order of the transition. 

We shall actually use, for the contribution of a poly- 
gon, the total effect it would have if it were alone. In 
the various integrations over many polygons, the fact 
that no atom may be used twice actually restricts the 
volume of configuration space. It is $V_N$ [Eq. (10)]. To
include this effect we will have an additional factor
$V_N/V^N$.

Making these assumptions the total contribution to 
(12) of all permutations of a given type is 


$$V_N V^{-N}\, C(n_1, n_2, \cdots)\, f_1^{n_1} f_2^{n_2} \cdots f_s^{n_s} \cdots, \tag{15}$$


where $C(n_1, n_2, \cdots) = N!/\prod_s n_s!\, s^{n_s}$ is the total number
of permutations of a given type, and $f_s$ is the contribution
of a polygon of type $s$, for a given configuration
calculated as though it were alone. We may average 
this over the possible configurations also. That is, 


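$$f_s = V \int \exp\!\left[-\frac{m'}{2\beta\hbar^2}\left(z_{12}^2 + z_{23}^2 + \cdots + z_{s1}^2\right)\right] \rho^{(s)}(z_1, z_2, \cdots z_s)\, dz_2 \cdots dz_s, \tag{16}$$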


where $\rho^{(s)}(z_1 \cdots z_s)$ is the chance of finding $s$ atoms
with their centers at $z_1, z_2, \cdots z_s$. That is,


$$\rho^{(s)}(z_1 \cdots z_s) = \int \rho(z_1, z_2, \cdots z_s;\; z_{s+1} \cdots z_N)\, dz_{s+1} \cdots dz_N,$$


where $\rho(z^N)$ is the configuration density of (7). The
factor $V$ comes in (16) from the fact that $z_1$ can be
anywhere and has been integrated out.


$^{12}$ Actually the only contributions (near 2°K) come from polygons
formed from nearly adjacent atoms. The factor (13) is very
small if any of the sides are very long. The polygons of importance
may be of any total perimeter (for large $s$), of course. It is only
their individual sides which are limited.

Next we must sum over all permutation types. That
is, over all values of $n_1, n_2, \cdots$ subject to (14). Signifying
this by $\sum'$ and substituting into (7) we find


$$Q = (K_\beta V_N/V^N)\,(m'/2\pi\beta\hbar^2)^{3N/2}\, {\sum}' \prod_s \left(f_s^{n_s}/n_s!\, s^{n_s}\right). \tag{17}$$


The factor $K_\beta V_N/V^N$ presumably varies exponentially
with $N$. We could write it as $\exp(N\alpha)$ where $\alpha$ is independent
of $N$ and varies slowly with temperature,
vanishing as T—30. It will make no essential difference 
in our study of the transition (it just adds NRT to the 
free energy A) so we will not bother to carry it along. 
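As an illustrative check that is not part of the original paper, the count of permutations of a given type used in (15), $C(n_1, n_2, \cdots) = N!/\prod_s n_s!\, s^{n_s}$, can be compared with a brute-force tally for a small number of atoms. The function names below are hypothetical, and the enumeration is only practical for very small $N$.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def cycle_type(perm):
    """Return the cycle type of a permutation as a sorted tuple of cycle lengths."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))


def count_by_formula(lengths):
    """C(n_1, n_2, ...) = N! / prod_s (n_s! * s**n_s) for the given cycle lengths."""
    n_total = sum(lengths)
    denom = 1
    for s, n_s in Counter(lengths).items():
        denom *= factorial(n_s) * s ** n_s
    return factorial(n_total) // denom


def count_by_enumeration(n_atoms):
    """Tally all n_atoms! permutations by cycle type (brute force)."""
    return Counter(cycle_type(p) for p in permutations(range(n_atoms)))


if __name__ == "__main__":
    for ctype, count in sorted(count_by_enumeration(5).items()):
        print(ctype, count, count_by_formula(ctype))
```

Each printed row lists a cycle type, the enumerated count, and the value of the formula; the two counts should agree for every type.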
The sum is very difficult because of the restriction
(14). However, we may use the usual methods of steepest
descents. We multiply $Q$ by a factor of the form
$\exp(\mu N/kT)$ ($\mu$ is the chemical potential) and sum
over $N$. If we then put $N = \sum_s s\, n_s$, we can sum on all
$n_s$ without restriction. Further, if the free energy $A$ is


$$A = -kT \ln Q$$


and the sum is written $\exp(-B/kT)$, we can determine
$A$ from


$$A = B + \mu \bar{N} \tag{18}$$


$$\bar{N} = -\partial B/\partial \mu \tag{19}$$


in the usual way. ($\bar{N}$ is the mean number of atoms.)
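Hence, putting


$$x = (m'/2\pi\beta\hbar^2)^{3/2} \exp(\mu/kT), \tag{20}$$


we can write


$$\begin{aligned}
\exp(-B/kT) &= \sum_{\text{all } n_s\text{'s}} x^{\sum_s s n_s} \prod_s \left(f_s^{n_s}/n_s!\, s^{n_s}\right) \\
&= \prod_s \sum_{n_s} \left(f_s^{n_s} x^{s n_s}/n_s!\, s^{n_s}\right) \\
&= \prod_s \exp(f_s x^s/s),
\end{aligned}$$

or

$$-B = kT \sum_s f_s x^s/s; \tag{21}$$

and (19) gives

$$\bar{N} = \sum_s f_s x^s. \tag{22}$$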


This pair of equations together with (20), (18) determines
$A$ and thereby all the thermodynamic functions.
The $x$ is determined from the second, Eq. (22), by the
condition that $\bar{N}$ equal $N$, the actual number of atoms.
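The step from the restricted sum over permutation types to the product of exponentials can be checked term by term. The sketch below is an illustration added here, not the author's calculation: it compares the coefficient of $x^N$ obtained by summing $\prod_s f_s^{n_s}/n_s!\, s^{n_s}$ over all types with $\sum_s s\, n_s = N$ against the same coefficient in the power series of $\exp(\sum_s f_s x^s/s)$. The values of $f_s$ are arbitrary test numbers and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing list of parts (cycle lengths)."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in partitions(n - part, part):
            yield [part] + rest


def coeff_from_cycle_types(f, n):
    """Sum of prod_s f_s**n_s / (n_s! * s**n_s) over all types with sum_s s*n_s = n."""
    total = Fraction(0)
    for parts in partitions(n):
        term = Fraction(1)
        for s, n_s in Counter(parts).items():
            term *= Fraction(f[s]) ** n_s / (factorial(n_s) * s ** n_s)
        total += term
    return total


def coeffs_from_exponential(f, n_max):
    """Power-series coefficients e_0..e_n_max of exp(sum_s f_s x**s / s).

    Uses the recurrence n*e_n = sum_{k=1..n} f_k * e_{n-k}, which follows from
    differentiating E(x) = exp(G(x)) with G(x) = sum_s f_s x**s / s.
    """
    e = [Fraction(1)]
    for n in range(1, n_max + 1):
        e.append(sum(Fraction(f[k]) * e[n - k] for k in range(1, n + 1)) / n)
    return e


if __name__ == "__main__":
    f = {s: Fraction(2 * s + 1, 3) for s in range(1, 9)}  # arbitrary test values
    series = coeffs_from_exponential(f, 8)
    for n in range(9):
        print(n, coeff_from_cycle_types(f, n) == series[n])
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.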

To proceed further we shall have to evaluate $f_s$. This
we do approximately, for the calculation of $f_s$ from (16)
is difficult. The distribution $\rho^{(s)}$ does not permit atoms
to be too close together. This is important for atoms
adjacent in the polygon, such as 1 and 2. On the other
hand, it is not of great geometrical importance for links
much further apart (like 1 and 5). The important polygons
correspond to random walks of $s$ steps from each
atom to a neighbor, finally returning to the origin. In
three dimensions the chance, after a few steps, of
coming back to the origin before the final step is not
large. In averaging (16) over the polygons, if we include
self-crossing polygons in the average we may not be
far off. There are not many of them so they probably
do not alter the average very much. This is a second
assumption. It is similar to the first. To be more explicit
we shall approximate $\rho^{(s)}$ in (16) by


$$\rho^{(s)}(z_1, z_2, \cdots z_s) = p(z_{12})\, p(z_{23}) \cdots p(z_{s1}), \tag{23}$$


where $p(z_{12})$ is the probability per unit atomic volume
of finding an atom located at $z_1$ if one is known to be
at $z_2$ (and $z_{12} = z_1 - z_2$). It is a function only of the
radial distance, $p(r)$, $r^2 = z_{12}\cdot z_{12}$, which approaches unity
as $r$ gets beyond a few times the atomic spacing $d$.
[$p(z_{12})$ is proportional to $\rho^{(2)}(z_1, z_2)$.]

Thus, approximately,


$$f_s = V \int \exp\!\left[-(m'/2\beta\hbar^2)\left(z_{12}^2 + z_{23}^2 + \cdots + z_{s1}^2\right)\right] p(z_{12})\, p(z_{23}) \cdots p(z_{s1})\, dz_2 \cdots dz_s. \tag{24}$$

This formula is wrong for $s=1$, for, of course, $f_1 = V$.
For $s=2$, (24) is not very good, for (24) averages with
weight $p(z_{12})^2$ while the correct weight in (16) should
be $p(z_{12})$. Short rings are not important in determining
the existence nor order of the transition, however.

The expression (24) is nearly in the form of a convolution
and can therefore easily be simplified. If the
last point of the polygon were not 1 but some other
location, say $z_0$ [i.e., replace $z_{s1}$ by $z_{s0}$ in (24)], the
expression ($\div V$) would depend on $z_1 - z_0$ or $z_{10}$. Call
it $g_s(z_{10})$. Its Fourier transform, $\int g_s(z_{10}) \exp(iK\cdot z_{10})\, dz_{10}$,
is the $s$ power of the Fourier transform,


$$\Gamma(K) = \int \exp(-m'z^2/2\beta\hbar^2)\, p(z) \exp(iK\cdot z)\, dz. \tag{25}$$


Therefore $g_s(z) = \int \exp(-iK\cdot z)\,(\Gamma(K))^s\, dK\,(2\pi)^{-3}$, and
since $f_s = V g_s(0)$, we find


$$f_s = V \int \Gamma(K)^s\, dK\,(2\pi)^{-3}. \tag{26}$$


This is not true for $s=1$. For $s=1$ the true $f_1$ is $V$,
while this gives $f_1 = V\int \Gamma(K)\, dK\,(2\pi)^{-3} = V p(0)$ from
(25). This $p(0)$ should be practically 0, but for generality
we retain it. Substituting this into (21) we
get ($f_1 = V$)

$$-B = kTV \int \sum_{s=1}^{\infty} \left(x^s \Gamma(K)^s/s\right) dK\,(2\pi)^{-3} + kTV[1 - p(0)]x,$$

or

$$-B = -kTV \int \ln(1 - x\Gamma(K))\, dK\,(2\pi)^{-3} + kTV[1 - p(0)]x; \tag{27}$$


and, similarly,

$$N/V = \int x\Gamma(K)\,(1 - x\Gamma(K))^{-1}\, dK\,(2\pi)^{-3} + [1 - p(0)]x. \tag{28}$$


To study the transition more closely, we study the
effects of the longer cycles. We need $f_s$ for very large $s$.




Since $\Gamma(K)$ in (25) falls as $K$ rises and is maximum for
$K=0$, we expand $\ln\Gamma(K)$ in powers of $K^2$ and carry
only the first two terms in the form


$$\ln \Gamma(K) \approx \ln\delta - \tfrac{1}{2} w^2 K^2. \tag{29}$$


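Here


$$\delta = \int_0^{\infty} \exp(-m'r^2/2\beta\hbar^2)\, p(r)\, 4\pi r^2\, dr, \tag{30}$$


and


$$w^2 = \frac{1}{3\delta} \int_0^{\infty} r^2 \exp(-m'r^2/2\beta\hbar^2)\, p(r)\, 4\pi r^2\, dr. \tag{31}$$

Then for large $s$, asymptotically,

$$f_s \approx V\delta^s \int \exp(-\tfrac{1}{2} s w^2 K^2)\, dK\,(2\pi)^{-3} = V\Lambda\delta^s/s^{3/2}, \tag{32}$$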


with $\Lambda = (2\pi w^2)^{-3/2}$. If we use this asymptotic form for all
$s>1$, we make errors for the first few terms. But the
transition occurs because of the character of the convergence
of the series for large $s$. Therefore the predicted
character of the transition may be found by
studying the sums (21), (22) with the asymptotic form
(32) for $f_s$. For example, the $N/V$ sum is [putting
$p(0)=0$]


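$$N/V = \Lambda \sum_{s=1}^{\infty} (\delta x)^s/s^{3/2} + (1 - \Lambda\delta)x, \tag{33}$$

and the expression for $B$ is

$$-B = kTV\left[\Lambda \sum_{s=1}^{\infty} (\delta x)^s/s^{5/2} + (1 - \Lambda\delta)x\right]. \tag{34}$$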

The same situation exists here as for the ideal gas case.
The sum in (33) cannot exceed 2.612 (for $x = 1/\delta$) and
$2.612\Lambda + (1 - \Lambda\delta)/\delta$ may be less than the actual $N/V$
desired. This will not occur for high temperature ($\delta$
small), but on lowering $T$ the difficulty suddenly sets in.
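The bound on the sum in (33) can be made concrete numerically. The following sketch is an illustration added here, with hypothetical function names and made-up parameter values ($\Lambda = 1$, $\delta = 0.5$ in arbitrary units): it evaluates $\sum_s s^{-3/2} \approx 2.612$, the largest density that (33) can supply at $x = 1/\delta$, and solves (33) for $x$ by bisection when the requested $N/V$ lies below that maximum.

```python
def zeta_three_halves(terms=10000):
    """sum_{s>=1} s**-1.5, with an Euler-Maclaurin estimate of the tail."""
    head = sum(s ** -1.5 for s in range(1, terms + 1))
    return head + 2.0 / terms ** 0.5 - 0.5 / terms ** 1.5


def g_three_halves(z, tol=1e-15, max_terms=1000000):
    """sum_{s>=1} z**s / s**1.5 for 0 <= z <= 1."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("z must lie in [0, 1]")
    if z == 1.0:
        return zeta_three_halves()
    total, power = 0.0, 1.0
    for s in range(1, max_terms + 1):
        power *= z
        term = power / s ** 1.5
        total += term
        if term < tol:
            break
    return total


def density(x, lam, delta):
    """Right-hand side of Eq. (33): Lambda*sum (delta x)^s/s^{3/2} + (1-Lambda delta)x."""
    z = min(1.0, delta * x)
    return lam * g_three_halves(z) + (1.0 - lam * delta) * x


def critical_density(lam, delta):
    """Largest N/V that Eq. (33) can reach, attained at x = 1/delta."""
    return lam * zeta_three_halves() + (1.0 - lam * delta) / delta


def solve_x(target, lam, delta, iterations=80):
    """Bisection for x in [0, 1/delta]; returns None when (33) has no solution."""
    if target > critical_density(lam, delta):
        return None
    lo, hi = 0.0, 1.0 / delta
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if density(mid, lam, delta) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    lam, delta = 1.0, 0.5  # hypothetical values, arbitrary units
    print("maximum N/V from (33):", critical_density(lam, delta))
    print("x for N/V = 1.0:", solve_x(1.0, lam, delta))
    print("x for N/V = 4.0:", solve_x(4.0, lam, delta))
```

A return value of `None` marks the regime in which (33) alone cannot accommodate the atoms, which is where the extra term of (33a) becomes necessary.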

What one must do, as is well known,$^{13}$ is to note that
the $dK$ integral in (32) is really a sum over the values
of $K$ which fit in the box of volume $V$. The lowest state
($K=0$, using running waves) is distant $dK = (2\pi)^3/V$
from the next. It suffices to sum on this one and integrate
the others. Thus we should add a factor
$1 + (2\pi)^3 V^{-1}\delta(K)$ to the integrand of (32), so that the
sum is more closely


$$f_s = V\Lambda\delta^s s^{-3/2} + \delta^s, \tag{32a}$$

and (33) becomes

$$N/V = \Lambda \sum_{s=1}^{\infty} (\delta x)^s s^{-3/2} + (1 - \Lambda\delta)x + V^{-1}(1 - \delta x)^{-1}. \tag{33a}$$

Now $\delta x$ can become very nearly 1, to order $1/V$ (e.g.,
put $\delta x = 1 - 1/gV$) and the sum in this region is $N/V
= 2.612\Lambda + (1 - \Lambda\delta)\delta^{-1} + g$ which can be satisfied for
proper choice of $g$. For higher temperatures (smaller $\delta$)
we can use the original expansion (33). This change in


$^{13}$ The validity of this procedure has sometimes been questioned.
A method of arriving at the result (32′) which avoids
use of this procedure is given in the Appendix.

behavior for the two regions of $T$ reflects in $B$ (34)
as a phase transition. This shows the existence of the
transition. As temperature falls $\delta$ in (30) rises without
limit. Since $p(r) \approx 1$ for large $r$, as $\beta$ rises $\delta$ eventually
behaves as $(2\pi\beta\hbar^2/m')^{3/2}$. Likewise, eventually $w^2$ becomes
$\beta\hbar^2/m'$, so that $\Lambda = (2\pi w^2)^{-3/2}$ approaches $\delta^{-1}$.
Therefore $2.612\Lambda + (1 - \Lambda\delta)\delta^{-1}$ tends toward zero (as
$T^{3/2}$) and must eventually fall below $N/V$. Actually by
putting in very reasonable values for the parameters,
it is easy to obtain a transition at about the right place.
If we do not use the asymptotic form (29) for $\Gamma$,
nothing is fundamentally changed. In (28) the integral
in $K$ should have its factor $1 + (2\pi)^3 V^{-1}\delta(K)$ as explained,
and the equation to determine $x$ becomes
(calling $\Gamma(0) = \delta$)


$$N/V = \int x\Gamma(K)\,[1 - x\Gamma(K)]^{-1}\, dK\,(2\pi)^{-3} + [1 - p(0)]x + V^{-1}(1 - \delta x)^{-1}. \tag{36}$$


Above the transition the last term is not required.
Below, $\delta x = 1 - 1/gV$ and $N/V = \int \delta^{-1}\Gamma(K)\,[1 - \delta^{-1}\Gamma(K)]^{-1}\, dK\,
(2\pi)^{-3} + [1 - p(0)]\delta^{-1} + g$, with analogous expressions
for $B$.


RELATION TO EXPERIMENT 


It would not be worth while to substitute numbers in 
these expressions as too many small approximations
have been made. In addition, it is difficult to estimate 
m'. The function p(r) might be calculated roughly for 
the smaller r by using the corresponding function re- 
quired for the quantum-mechanical second virial co- 
efficient. This assumed that any two colliding atoms 
are independent of the others. Alternatively, p(r) could 
be taken experimentally from x-ray or neutron scatter- 
ing data. 

On the other hand, the formulas (27), (28) have even
qualitative faults when compared to experiment. They
predict that helium, like the ideal Bose gas, would
show a third-order transition (specific heat continuous
but discontinuous slope). The experimental data$^{1}$ do
not agree (apparently the specific heat is discontinuous).
This disagreement probably stems from the
neglected geometrical correlations among the rings.$^{14}$

In order to study this in greater detail, it was thought 
that a careful study of the situation at extremely low 
temperature would be of value. The character of the 
transition must depend on an accurate description of 
the phase into which the liquid changes as it cools past 
the $\lambda$ point. This phase is represented in an extreme
form near absolute zero. In this region Eqs. (27), (28)
fail very badly. They predict a specific heat varying
as $T^{3/2}$ while experimentally it varies as $T^3$. In the following
paper we shall see that this discrepancy is a result

$^{14}$ Assumptions about the temperature variation of the parameters
$m'$, $p(r)$, $K_\beta$ cannot alter the order of the predicted
transition. The effect of modifying (7) in the manner indicated
in footnote 11 is discussed in the paper to follow. The change, if
anything, is in the wrong direction.

of the error produced by the geometrical approximations
made in passing from (7) to (27), (28). The approximations
here permit much larger fluctuations in
density than are available to the true liquid, and this
qualitatively alters the behavior of the specific heat at
low temperature.

The experimental specific heat curve shows$^{1}$ a slight
rise in the He I region as the $\lambda$ point is approached from
above. This is also a property of our expression (34).
In this region only a few chains are starting to form. 
The restriction that no atom may be in more than one 
chain is not yet of importance. Therefore, in this region 
our geometrical approximations should be valid. The 
very least we can say, then, is that the rapid rise in 
specific heat of He I with falling temperature is com- 
pletely explained. 


SUMMARY 


Starting with an exact quantum-mechanical partition
function, we have derived an approximate expression
[Eq. (7)] which should be qualitatively accurate.
It has been shown to be in agreement with experiment 
in predicting a transition which depends in an essen- 
tial manner on the statistics. 

Further mathematical approximations have not been 
accurate enough to show whether (7) will correctly 
predict the order of the transition and the temperature 
dependence of the specific heat near absolute zero. 
They do suffice at high temperatures to show the rise 
in specific heat of He I as the transition is approached. 

It is proposed that a more careful analysis of (7) 
would show more complete agreement with the experi- 
mental facts.! In the next paper® the situation near 
absolute zero is studied in detail, and it is found that 
(7) (corrected for the effect mentioned in footnote 11) 
very likely does predict the correct behavior in this 
region. 

The physical idea which plays a central role is that 
in a quantum-mechanical Bose liquid the atoms behave 
in some respects like free particles. 

The author appreciates conversations with Edward 
Kerner and with M. Kac, as a result of which he became 
interested in the problem. He also is grateful for dis- 
cussions with E. Wigner, H. Bethe, and R. F. Christy. 


APPENDIX 


We give here another derivation of the approximate 
partition function (33a), (34). It has the advantage of 
showing more clearly the origin of the transition. We 
will treat it in a very approximate manner as we have 
already given more complete formulas. (It is the first 
derivation that the author made.) 

Near the transition only permutations involving 
shifts of each atom to the position of its neighbor, at 
some mean distance $d$, are important. The exponential
factor from such a shift is $y = \exp(-m'd^2/2\beta\hbar^2)$.

Each permutation can be broken into cycles. We now
only count those cycles for which all atoms are adjacent,
forming a closed chain or ring. If a ring contains $s$
atoms, its contribution is $y^s$. If we have $n_2$ rings of 2
atoms, $n_3$ of 3 atoms, etc., the contribution is $y^{2n_2 + 3n_3 + \cdots}$.
The part of the partition function which determines
the transition is then

$$q = \sum G(n_2, n_3, \cdots)\, y^{\sum_s s n_s}, \tag{1-A}$$

where $G(n_2, n_3, \cdots)$ is the number of ways we can lay
out polygons, $n_2$ of 2 sides, $n_3$ triangles, etc., on the
configuration, the sides of the polygons consisting of
lines joining nearest neighbors (length $d$). The sum is
restricted, for the number of single atoms $n_1 = N
- \sum_{s \ge 2} s\, n_s$ must not be negative.

Here again we shall make the error of neglecting the
geometrical interference of polygons due to the restriction
that each atom be a vertex of only one polygon.
We shall include the competition among the polygons
for the total number of available atoms by saying that
there is an average probability $t$ that any site is unoccupied.
This $t$ will later be determined so that the
average number of atoms occupied is $\bar{N} = N$. (This can
be done in detail by steepest descents, but it amounts
to the same thing.) Therefore, instead of $q$ we calculate
$t^N q$ and call it $\exp(-B/kT)$.

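The polygons can now be considered as independent.
If $R_s$ is the total number of $s$-gons that can be drawn
on the configuration, each $s$-gon can be chosen in $R_s$
ways, and all $n_s$ of them in $R_s^{n_s}/n_s!$ ways. The $n_1$ single
atoms can be chosen in $N^{n_1}/n_1!$ ways. Thus


$$\begin{aligned}
\exp(-B/kT) &= \sum_{n_1, n_2, \cdots} \prod_{s=2}^{\infty} R_s^{n_s} (n_s!)^{-1} (ty)^{s n_s}\; N^{n_1} t^{n_1} (n_1!)^{-1} \\
&= \exp\!\left[Nt + \sum_{s=2}^{\infty} R_s (ty)^s\right],
\end{aligned}$$

or


$$-B = kT\left[Nt + \sum_{s=2}^{\infty} R_s (ty)^s\right]. \tag{2-A}$$


The average number of atoms $\bar{N}$ used in sites is $t(dq/
dt)q^{-1}$, so we have


$$\bar{N} = Nt + \sum_{s=2}^{\infty} s R_s (ty)^s. \tag{3-A}$$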
o=2 

Now we calculate $R_s$. We call $h_s$ the total number of
ways that we can, starting at an atom and making successive
steps to adjacent atoms, return after exactly $s$
steps. This forms a closed $s$-gon. We may start at any
of the $N$ atoms. The total $N h_s$ measures the total number
of $s$-gons, but counts each $s$ times (for you could start
at any of the $s$ atoms as the “first”). Hence we have
$R_s = N h_s/s$, and


$$-B = kTN\left[t + \sum_{s=2}^{\infty} h_s (ty)^s/s\right], \tag{4-A}$$


where $t$ is a parameter determined from $\bar{N} = N$, in
(3-A), or


$$1 = t + \sum_{s=2}^{\infty} h_s (ty)^s. \tag{5-A}$$

We shall bother to determine the form of $h_s$ for large
$s$ only. In the random walk, each step is length $d$, and
may be made to any of the adjacent atoms, which we
shall say are $l$ in number, on the average. Each step
can be made in $l$ ways, the entire $s$ steps in $l^s$ ways.
There are $l^s$ walks in total but only a certain number
return to the origin. As is well known, the probability
of being at a given point, per unit volume, radius $r$ from
the starting point is


$$(2\pi d^2 s/3)^{-3/2} \exp(-3r^2/2sd^2) = \Lambda s^{-3/2} \exp(-r^2/2sw^2),$$


putting $w^2 = d^2/3$ and $\Lambda = (2\pi w^2)^{-3/2}$. The chance we are
back at the original atom (that is, within a space of
one atomic volume $V_A = V/N$ near the origin) is
$V_A \Lambda s^{-3/2}$, so that


$$h_s = V_A \Lambda s^{-3/2}\, l^s.$$


The reason for the dependence on the $s^{-3/2}$ is easy to see.
After $s$ steps we have wandered out to a mean radius
of order $s^{1/2}$, or over a volume $s^{3/2}$. Hence, the chance that
in this volume we are back at the original atom varies
as $s^{-3/2}$. This is correct except for enormous chains
$s \gtrsim N^{2/3}$, which are long enough to wander all over the
liquid. Then the available volume no longer increases.
The chance that one is back at the original atom instead
of one of the other roughly equally likely atoms is $1/N$.
For such large $s$, then, $h_s = l^s/N$. The formula,


$$h_s = (V_A \Lambda s^{-3/2} + 1/N)\, l^s, \tag{6-A}$$


takes care of both cases, because for small $s$ the first
term dominates, and for large $s$ the second takes over,
as it should.$^{15}$ Substitution into (3-A) gives


$$1 = t + V_A \Lambda \sum_{s=2}^{\infty} (lyt)^s s^{-3/2} + N^{-1}(1 - lyt)^{-1}. \tag{7-A}$$


This may be compared to (33a). To do so, note that
our interpretation of $3w^2$ as the mean square length of
a step agrees with (31). Further $ly$ is the number of
atoms available per step, $l$, multiplied by the weight
$y = \exp(-m'd^2/2\beta\hbar^2)$. In the more general case it becomes
$\delta/V_A$ as an inspection of (30), the expression for
$\delta$, shows. Finally, the parameter $t$ can as well be called
$V_A x$, and Eq. (7-A) is seen to be identical to (33a)
(times $V_A$). Likewise, substitution of (6-A) into (4-A)
gives the corresponding equation (34) for $B$.
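The $s^{-3/2}$ law for returning walks can be checked against an exact count. The sketch below is an illustration under a simplifying assumption that is not in the original: the atoms are placed on a simple cubic lattice with $l = 6$ neighbors, step length $d = 1$, and atomic volume $V_A = 1$. On that lattice only even $s$ can return, so the Gaussian estimate $V_A \Lambda s^{-3/2} = (3/2\pi s)^{3/2}$ is doubled before comparison. The function name is hypothetical.

```python
from fractions import Fraction
from math import factorial, pi


def closed_walk_counts(s_max):
    """Exact number h_s of s-step walks on the simple cubic lattice that end at the start.

    A 3D walk interleaves three 1D walks, so exponential generating functions
    multiply. The 1D closed-walk EGF has coefficient 1/(k!)^2 at x^(2k).
    """
    one_d = [Fraction(0)] * (s_max + 1)
    for k in range(s_max // 2 + 1):
        one_d[2 * k] = Fraction(1, factorial(k) ** 2)

    def multiply(a, b):
        out = [Fraction(0)] * (s_max + 1)
        for i, a_i in enumerate(a):
            if a_i == 0:
                continue
            for j, b_j in enumerate(b):
                if i + j > s_max:
                    break
                out[i + j] += a_i * b_j
        return out

    three_d = multiply(multiply(one_d, one_d), one_d)
    return [int(c * factorial(s)) for s, c in enumerate(three_d)]


if __name__ == "__main__":
    counts = closed_walk_counts(40)
    for s in (2, 4, 6, 10, 20, 40):
        exact = counts[s] / 6 ** s                # fraction of the l^s walks that return
        gaussian = 2 * (3 / (2 * pi * s)) ** 1.5  # doubled for lattice parity
        print(s, counts[s], exact / gaussian)
```

The printed ratio of the exact return fraction to the Gaussian estimate should approach 1 as $s$ grows, which is the sense in which the appendix uses the formula for large $s$ only.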

This derivation throws some light on the mathe- 


$^{15}$ According to (6-A) the chance to return is higher than for an
infinite medium. It might be objected that there should be fewer
paths available when there are a finite number of atoms in an
enclosed space. What we have done corresponds to working with
a periodic boundary condition, and the excess arises from the
chance to return to one of the images of the origin, instead of
the origin itself. With the more physical boundary assumption,
that paths cannot cross the liquid surface, the total number of
paths for high $s$ is not $l^s$, but is reduced. It becomes eventually
proportional to ¢° $l^s$, where ¢ is a very small number of order
N-1. It makes no essential difference in the result. In the
momentum representation of the text it corresponds to taking the
lowest state to have a wavelength controlled by the size of the
box. I am indebted to Herman Kahn for pointing out this possible
objection to (6-A).




matical “cause” of the transition. As $T$ falls $y$ rises,
until the enormous number of possible orientations of a
very long ring more than compensates for the small
contribution of each ($y^l \ll 1$, for $l$ large).

There is no doubt of the geometrical fact of large
numbers of orientations for long rings, even if these
rings may never use the same atom twice (i.e., cannot
cross themselves).$^{16}$ Therefore there can be no doubt


$^{16}$ For example, the number of ways in which a single polygon
which does not cross itself can still be oriented in an infinite
medium = constant $\times\, l^s s^{-3/2}$, but the value of $l$ is reduced.

that (1-A), and its more complete expression (17), will
show a transition from this cause. But the order of the
transition need not be the same as that of the approximate
evaluations we have made. They neglect the geometrical
correlations. For example, if a large chain of
$K$ atoms is already formed, are the remaining $N-K$
atoms more (or perhaps less) likely to be contiguous
and therefore more easily able to make other chains,
than if these $N-K$ atoms were chosen at random from
among the $N$? Our assumption in deriving (5-A) was
that it was equally likely either way.

The properties of liquid helium at very low temperatures (below 0.5°K) are discussed from the atomic
point of view. It is argued that the lowest states are compressional waves (phonons). Long-range motions
which leave density unaltered (stirrings) are impossible for Bose statistics since they simply permute the
atoms. Motions on an atomic scale are possible, but require a minimum energy of excitation. Therefore
at low temperature the specific heat varies as $T^3$ and the flow resistance of the fluid is small. The arguments
are entirely qualitative; no calculation of the energy of excitation nor of the low-temperature viscosity
is given. In an appendix an expression, previously given, for the partition function is modified to include
the effects of phonons.


INTRODUCTION 


Tisza$^{1}$ has suggested the very fruitful concept
that He II might be thought of as a mixture of
two fluids, “superfluid” and “normal.” At zero temperature
the helium is pure superfluid. With rising temperature
some sort of “excited molecules” form. These
constitute the “normal fluid” which behaves very much
like a gas. The proportion of normal fluid increases at
first slowly, and then rapidly, with temperature until
at the transition temperature of 2.19°K ($\lambda$ point) the
liquid, now He I, contains no more superfluid.

Landau$^{2}$ has made even more detailed suggestions.
He suggests that there are two kinds of “excited
molecules,” phonons or quanta of longitudinal compressional
waves (sound) and “rotons.” The latter are
not well understood. It is suggested that they have a
minimum energy $\Delta$ needed to excite them. For this
reason below 0.5°K there are practically only phonons.
The rotons can become excited when more energy is
available; i.e., at higher temperature. This idea is in
agreement with the fact that below 0.5°K the specific
heat varies as $T^3$ in just the manner (and with the
correct coefficient) to be expected if only longitudinal
sound waves could be excited.


$^{1}$ L. Tisza, Phys. Rev. 72, 838 (1947). An excellent summary of
the theories of helium II is to be found in R. B. Dingle, Supplement
to Phil. Mag. 1, 112 (1952).

$^{2}$ L. Landau, J. Phys. U.S.S.R. 5, 71 (1941).


Tisza’s view is frankly phenomenological. No serious 
attempt is made to justify the description from first 
principles. Landau has made such an attempt by 
studying the quantum mechanics of a continuous liquid 
medium. The role of the statistics is not clear in his 
arguments, however. Furthermore, the magnitudes of 
energy and inertia that the “rotons” appear to have 
correspond to a few atoms. A complete understanding 
of the “roton” state can therefore only be achieved by 
way of an atomic viewpoint. 

A more complete study of liquid helium from first 
principles might attempt to answer at least three 
important questions: 

(a) Why does the liquid make a transition between 
two forms, He I and He II? 

(b) Why are there no states of very low energy, 
other than phonons, which can be excited in helium II 
(i.e., below 0.5°K) ? 

(c) What is the nature of the excitations which 
constitute the “normal fluid component” at higher 
temperatures, say from 1 to 2.2°K? 

The first question was answered in a preceding
paper.$^{3}$ We showed that London’s suggestion, that it is
the analog of the transition in an ideal Bose gas, is
correct.

In this note we hope to make a qualitative argument 
from first principles to answer the second question. 


$^{3}$ R. P. Feynman, Phys. Rev. 91, 1291 (1953), hereafter called I.

(We have not yet found the answer to the third.*) It is
to be understood, therefore, that we are aiming here
to explain the properties of the liquid only at extremely
low temperatures.

We take, as a model, helium atoms obeying the
Schrödinger equation and the symmetrical Einstein-Bose
statistics with forces between pairs similar to that
worked out by Slater and Kirkwood, an attraction at
large distances and a strong repulsion at small.$^{5}$ (It is
sufficient for qualitative purposes, if one wishes, to
imagine impenetrable spheres of radius about 2.7 Å
packed into a space so the mean spacing (cube root of
atomic volume) is about 3.6 Å.)

At absolute zero the system is in the ground state.
Why this is not a solid has been explained by London.$^{6}$
The large zero-point motions of the atoms are capable 
of “melting” any ordered crystalline arrangement that 
may be temporarily set up. We begin by describing 
the wave function for this ground state. 


DESCRIPTION OF THE GROUND-STATE WAVE 
FUNCTION 


Wave functions can be described qualitatively in 
words by giving the amplitude for every configuration 
of the atoms. The ground state has a positive amplitude 
for any configuration since the lowest state has no 
nodes. The amplitude is negligible if any two atoms 
are so close together that they overlap—that is, that a 
large negative potential has set in. Thus for any atom 
surrounded by neighbors, considered for a moment 
fixed, the amplitude falls to zero when the atom moves 
over to touch any of the neighbors and is probably 
bound in such a way that it is maximum when the 
atom is near the center of its ‘‘cage.” (This curvature 
makes a strong kinetic energy tending to blow apart 
the cage, an energy effect canceled by the long-range 
attraction of the atoms.) Compressing the atom into a 
smaller cage requires more kinetic energy, and expan- 
sion works against the attractive potential so there is 
some mean density of equilibrium. Fluctuations away 
from that mean density are of amplitude distributed 
in a Gaussian manner. For wavelengths exceeding the 
atomic spacing they are analogous to zero-point fluctu- 
ations of the vacuum electromagnetic field (but are 
wholly longitudinal, scalar waves, of course). 

There are a large number of configurations with 
densities near the most likely density, in all of which 
the atoms tend to keep separate from one another. 
They differ from one another mainly in the location of 
the atoms. The various configurations differ only in 
that one may be “stirred” into another with a little 
reshuffling or stirring of the atoms. We may take it 


* Note added in proof: This problem has now been solved. Its
solution will appear in a forthcoming publication.

$^{4}$ J. C. Slater and J. G. Kirkwood, Phys. Rev. 37, 682 (1931).

$^{5}$ For a detailed account of the properties of helium see W. H.
Keesom, Helium (Elsevier Publishing Company, Inc., Amsterdam,
1942).

$^{6}$ F. London, Nature 141, 643 (1938).

that all configurations which have nearly the same
type of density fluctuations and which can be essentially
just stirred from one to another have the same amplitude
in the ground state.


THE CHARACTER OF LOW-ENERGY STATES 


We must next determine the character of the low- 
energy states near the ground state. We aim to show 
that the only states which differ from the ground state 
by an infinitesimal energy are the phonons.

First we can take it that the lowest states are those
involving large numbers of atoms or large distances.
Consider, for example, a tiny region of the liquid, say
a cube 3 or 4 atomic spacings on a side. If the atoms
in this region are confined in this region (so we have a
submicroscopic sample of liquid He) the excitations
above the ground state will all involve one node somewhere
among the configurations and hence a wavelength
of order of $a$. This must mean an excess kinetic
energy of the order $\hbar^2/a^2 m$, or at least of order $\hbar^2/N a^2 m$,
where $N$ is the number of atoms and $m$ is the mass of
each, leaving an appreciable gap from the ground state.


This argument fails if in the ground state there are two (or
more) regions of configuration space in both of which the amplitude
is large and which are completely separated by a region of
very small amplitude. (Analogous to a particle in a potential
with two wells separated by a barrier.) If the nodal surface is
passed through the region of small amplitude (the barrier) very
little change in energy results. But we have seen that the states
of large amplitude are just all those in which the atoms are
reasonably well separated. We can assume that we can get
from one to any other without crossing any high potential barrier.
We suppose all possible rearrangements may be achieved without
the atoms coming too close together at any time. That this is
reasonable can be seen by comparing the size of the atoms to
their spacing. For example, if at some point they are locally
roughly on a cubic close-packed lattice, the nearest neighbors are
4.0 Å apart (corresponding to the observed atomic volume$^{5}$
45 Å$^3$). The diameter of the atoms is 2.7 Å, the radius at which
Slater and Kirkwood’s potential passes from minus to plus. The
cube edge of the lattice is 5.6 Å, so a face-centered atom could
even pass between those at the corners of the cube! Clearly, if
they are allowed to vary their mutual distances a little, all kinds
of rearrangements can be made. It is likely that the condition
that there be no effective barrier between configurations is
equivalent to the condition that the He II is liquid in the lowest
state. We assume it valid for He II.
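The lattice figures quoted in this paragraph can be reproduced from the atomic volume alone. The short sketch below is an added illustration with hypothetical function names: it assumes a face-centered cubic cell (four atoms per cube), takes the atomic volume of 45 Å$^3$ and the atomic diameter of 2.7 Å from the text, and reads “pass between those at the corners” as passing through the midpoint of a cube edge, which is an interpretation rather than something the text states.

```python
import math


def fcc_geometry(atomic_volume):
    """Cube edge and nearest-neighbor distance for an fcc lattice (4 atoms per cube)."""
    cube_edge = (4.0 * atomic_volume) ** (1.0 / 3.0)
    nearest_neighbor = cube_edge / math.sqrt(2.0)
    return cube_edge, nearest_neighbor


def edge_midpoint_clearance(cube_edge, atom_diameter):
    """Gap left when an atom sits midway between two corner atoms on a cube edge.

    The centers are then cube_edge / 2 apart; a positive value means the hard
    spheres of the given diameter do not overlap.
    """
    return cube_edge / 2.0 - atom_diameter


if __name__ == "__main__":
    edge, nn = fcc_geometry(45.0)                # atomic volume in cubic angstroms
    print("cube edge (A):", round(edge, 2))
    print("nearest neighbors (A):", round(nn, 2))
    print("clearance (A):", round(edge_midpoint_clearance(edge, 2.7), 2))
```

The clearance comes out small but positive, consistent with the remark that modest variations in the mutual distances open up all kinds of rearrangements.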


If we are to find extremely low-energy states we must
therefore look to excitations involving large groups of
atoms or long wavelengths. One possibility is in the
compression waves. Suppose the atoms are compressed
to a small excess density over a large volume and a
rarefaction left adjacently. The only way this fluctuation
could even out is for a considerable number of
atoms to move, each a little bit. This involves, effectively,
the motion of a large mass and can have low
kinetic energy. There is, therefore, little doubt that
such compressional waves represent a true mode of
excitation in the helium, and a mode of very low energy.
If the speed of sound is called $c$ and the wave number
of the waves $K$, the frequency $\omega = cK$ and quantum



energy $\hbar c K$ can be as small as desired. Thus there is
no reason why we should not expect the specific heat,
varying as $T^3$, from these modes.$^{7}$
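The $T^3$ law follows from counting modes with energy $\hbar c K$: the thermal energy per unit volume is proportional to $\int K^2\, dK\; \hbar c K/(e^{\hbar c K/kT} - 1)$, which scales as $T^4$. The sketch below is an added illustration, not the paper's calculation. It works in units where $\hbar c = k = 1$, drops constant prefactors, imposes a hypothetical cutoff `k_max` on the wave number, and checks that doubling $T$ multiplies the specific heat by 8 when $T$ is far below the cutoff.

```python
import math


def phonon_energy(temperature, k_max=1.0, n=20000):
    """Simpson's rule for int_0^k_max K^3 / (exp(K/T) - 1) dK, in units hbar*c = k_B = 1.

    Constant prefactors (mode density, volume) are omitted. n must be even.
    """
    h = k_max / n
    total = 0.0
    for i in range(1, n + 1):          # the integrand vanishes at K = 0
        k = i * h
        ratio = k / temperature
        if ratio > 700.0:              # exp would overflow; the term is negligible
            continue
        coef = 1 if i == n else (4 if i % 2 else 2)
        total += coef * k ** 3 / math.expm1(ratio)
    return total * h / 3


def specific_heat(temperature, rel_step=1e-3):
    """Central-difference estimate of dU/dT."""
    dt = rel_step * temperature
    return (phonon_energy(temperature + dt) - phonon_energy(temperature - dt)) / (2 * dt)


if __name__ == "__main__":
    t = 0.01  # far below the cutoff k_max = 1
    print("U / T^4:", phonon_energy(t) / t ** 4, "vs pi^4/15 =", math.pi ** 4 / 15)
    print("C(2T) / C(T):", specific_heat(2 * t) / specific_heat(t))
```

Near the cutoff the scaling would fail, which is the numerical counterpart of restricting the argument to very low temperatures.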

We have assumed that the way to release a density 
fluctuation in a given time which requires the least 
kinetic energy is to move many atoms a short distance. 
If only a fraction $f$ of the atoms move, the velocity
required is $1/f$ times higher. This means a higher kinetic
energy if the kinetic energy varies as the velocity
squared.

There are cases, however, in which the kinetic energy
does not vary in this way. In a degenerate Fermi gas a
single atom excited by a small excess momentum $p$
above the Fermi surface of momentum $p_0$ has an energy
$((p_0 + p)^2 - p_0^2)/2m$ or $(p_0/m)p$. This linear dependence
of energy on momentum means that the group velocity
of such waves is energy independent. It is $p_0/m$. The
speed of sound calculated from the compressibility and
density of the ideal gas is $3^{-1/2} p_0/m$. The single atoms
will run ahead of the sound. Fluctuations are reduced
by a process more like diffusion than sound. The specific
heat near $T=0$ varies as $T$ instead of $T^3$ because the
density of states for exciting single atoms exceeds that
for sound. As we shall see, for Bose helium there are
no states, except phonons, whose energy approaches
zero as their momentum approaches zero. The sound
has no competitor capable of discharging pressure.$^{8}$
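The two velocities quoted for the degenerate Fermi gas can be obtained in a few lines. The worked steps below are added for illustration; they assume the standard zero-temperature ideal Fermi gas relations $P = \tfrac{2}{5} n E_F$ and $E_F = p_0^2/2m \propto n^{2/3}$, where $n$ (number density) and $E_F$ (Fermi energy) are symbols not used in the original text.

```latex
\begin{align*}
  \frac{(p_0 + p)^2 - p_0^2}{2m}
    &= \frac{p_0}{m}\,p + \frac{p^2}{2m}
     \approx \frac{p_0}{m}\,p \qquad (p \ll p_0), \\[4pt]
  v_{\text{group}}
    &= \frac{d}{dp}\!\left(\frac{p_0}{m}\,p\right) = \frac{p_0}{m}, \\[4pt]
  c^2
    &= \frac{1}{m}\frac{dP}{dn}
     = \frac{1}{m}\cdot\frac{5}{3}\cdot\frac{P}{n}
     = \frac{1}{m}\cdot\frac{5}{3}\cdot\frac{2}{5}\,\frac{p_0^2}{2m}
     = \frac{p_0^2}{3m^2},
  \qquad c = 3^{-1/2}\,\frac{p_0}{m}.
\end{align*}
```

Since $p_0/m$ exceeds $3^{-1/2} p_0/m$, the single-atom excitations outrun the sound wave, as the text states.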


LOW-ENERGY STATES DISREGARDING STATISTICS 


The Bose statistics play an essential part in the 
discussion of other possible states of very low-excitation 
energy. To make this role clear by contrast, we shall 
first analyze the situation, disregarding the statistics. 
More precisely we consider in this section an imaginary 
quantum liquid made of atoms which are, in principle, 
distinguishable (“Boltzmann” statistics). The ground
state for Boltzmann statistics is the same as for Bose
statistics, since the lowest state is, in either case,
symmetrical. 

We must try to find modes of excitation involving 
long wavelengths which do not involve changes in 
mean density. The density fluctuation modes have 
already been considered. To simplify the argument 
consider first the following crude model. We consider a 
set of cells, each of which contains one atom. Each 
atom is free to wander in its cell and may occupy 
therefore some ground-state wave function, say constant 
amplitude in the cell. We are to consider states which 
can be made solely by rearranging the atoms among the 
cells. No two atoms may go into the same cell, for that 
corresponds to a density fluctuation, and we do not 
wish to consider those. 

⁷ The partition function discussed previously (reference 3) is
extended to include a description of the phonons in the Appendix
to this paper.

⁸ For an ideal Bose gas, as $T\to0$, the sound velocity approaches
zero, so that the expected $T^3$ specific heat does not appear. It is
replaced by $T^{3/2}$. Density fluctuations are much more restricted
in liquid helium than they are in the ideal gas. This is the origin
of many differences in behavior for the two cases.
In the ground state, the amplitude is the same for
any rearrangement of the atoms among the cells. The 
lowest states can now be analyzed as follows. We 
neglect the statistics, that is, we assume Boltzmann 
statistics to apply. We can describe a wave function 
which corresponds to a very low excitation as follows: 
Pick out a certain atom, A, say. Put it in a given cell. 
Then all rearrangements of the other atoms among the 
other cells can be taken to have the same amplitude. 
If atom A is in a different cell, the amplitude may be 
different, but again independent of the arrangement of 
the other atoms. We thus can specify this wave function 
by giving just the amplitude for various positions of 
atom A. The function is independent of the position
of the others. We may take this as $\exp(iK\cdot R_A)$, where
$R_A$ is the position of the center of the cell in which
atom A is. We may, with enough accuracy, let $R_A$ be
just the position of atom A. The K can be a long wave
fitting into the volume V in which the helium is con-
tained. The energy of this state is $\hbar^2K^2/2m'$ where $m'$
is the effective mass needed to move atom A. This
energy can be very small, for K can be small.


This effective mass is not far from the mass of one helium atom. 
It is discussed in a previous paper.³ We summarize the argument
here. To push a single atom along, we need not go over any
potential barriers. The other atoms may move out of the way.
No matter where atom A is located the other atoms can arrange
themselves into a state of minimum energy and this minimum 
energy is independent of the location of atom A. However, as 
A moves, the others must readjust themselves into the state of 
minimum energy for the new position of A. That is, in addition 
to the kinetic energy of A there is a kinetic energy of the other 
atoms which must move away to make room for A. 


Thus there would be low-lying states, of energy
$\hbar^2K^2/2m'$. The number of such states would be very
large, for the wave function could depend in similar
ways on the coordinates of other atoms also. [For
example, we could choose two atoms A, B and have
the wave function vary with their location as
$\exp(iK_1\cdot R_A)\exp(iK_2\cdot R_B)$ and be otherwise independ-
ent of the distribution of the others in the cells, etc.]
The large density of low-lying states would result in a 
large specific heat near absolute zero. 


LOW STATES WITH SYMMETRICAL STATISTICS 


However, if the atoms obey Bose statistics none of 
these states can exist. For in our model, whether atom 
A is at one location or another is merely an interchange 
of which atom is which. This cannot change the wave 
function. In fact, for the model of one atom in a cell 
no excited state at all can exist for Bose particles 
without excitation of the atom within the cell. 

For the real liquid a similar situation holds—aside 
from the phonons, there can be no low-lying state. The 
wave function must have the property that any change, 
that just means an interchange among the atoms, must 
not alter the wave function. The excited state must 
be orthogonal to the ground state, of course. Starting 
at any configuration and supposing the amplitude is 
the same as for the ground state, we must find a new
configuration that represents a kind of stirring of the 
old configuration (to omit phonon states) such that the 
amplitude is now reversed in sign. It is clear that every 
configuration is close to the original one, albeit with 
some atoms interchanged. So it is hard to find a con- 
figuration to give the minus amplitude which is suffici- 
ently far (in configuration space) from the original 
positive amplitude configuration to have a slow rate of 
decay and thus a low energy. 


There are some possibilities of this kind which might at first 
sight seem allowable. Consider among the atoms a set of adjacent 
ones forming a large ring. Suppose l atoms are in the ring and
let us imagine the wave function is such that, if all move together
half an atomic spacing a, the phase changes by $\pi$, so when they
turn about another a/2 the phase changes by $2\pi$, as required by
the Bose statistics. (For a shift of a just changes each atom for
the one behind.) The mass moving is $m'l$, and the momentum is
$\hbar/a$ (for the wavelength is a) so that the energy is $\tfrac12(m'l)^{-1}(\hbar/a)^2$.
This may be made low by choosing l very large. (If it were not
for the Bose statistics we could have taken the wavelength to
be $la$ and the energy would be even lower, varying as $1/l^3$.)

The argument for calculating this energy is incorrect, however. 
By assuming that all the atoms must move together, degrees of 
freedom (in which parts of the ring turn by themselves) have been 
restricted. This tacitly adds a considerable energy by the uncer- 
tainty principle. 

To understand better the failure of the argument, consider by 
the same reasoning the case that one had two equal rings parallel 
to each other. We may argue as before that now the entire mass 
$2m'l$ moves together with momentum $\hbar/a$ and thus expect an
energy $\tfrac12(2m'l)^{-1}(\hbar/a)^2$ for the energy of the lowest state. But
this is certainly wrong, for if we consider the rings as independent,
the lowest state but one is surely that one for which one of the
rings is excited and the other not (momentum 0), an energy
$\tfrac12(m'l)^{-1}(\hbar/a)^2$, larger than our previous lowest estimate! The
error for the first figure of $\tfrac12(2m'l)^{-1}(\hbar/a)^2$ consisted in this. In
describing the wave function, the possibility that the two rings 
could turn independently was omitted. To force the rings to 
move together would be to force the difference between their 
displacements to be fixed at zero. Thus the momentum conjugate 
to this difference coordinate would be very high, and the energy 
associated with this coordinate very large. Thus the system does 
not have just the energy estimated but, in fact, a very much 
higher one. We have not completely specified the wave function. 
We have not said what the amplitude is to be for configurations 
in which only one of the rings moves. 

In an analogous way, the estimate of energy for the single ring,
$\tfrac12(m'l)^{-1}(\hbar/a)^2$, is incorrect. We have specified the amplitude for
the case that all of the atoms are simultaneously in the mid
positions. What is to be the amplitude if only a few move to
mid positions? (This may be accomplished without doing violence
to the potentials by using the atoms adjacent to the ring and by
turning on smaller rings of 4 or 5 atoms.) If the motion (all to ½
position) can be made up of smaller parts moving in concert, and
if these smaller parts could also have moved independently, the
energy cannot be lower than that corresponding to just one of
the independent parts being excited.
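
The bookkeeping in the ring argument can be made concrete with a few lines of arithmetic. This is a minimal sketch, not part of the original argument: energies are expressed in units of $\hbar^2/(m'a^2)$, all parts are taken to be rings of equal size, and the function names are hypothetical.

```python
def collective_estimate(atoms_per_ring, n_rings=1):
    """Naive estimate (1/2)(n_rings*m'*l)^-1 (hbar/a)^2, in units of hbar^2/(m' a^2).

    It assumes every atom in every ring is forced to move together.
    """
    return 0.5 / (n_rings * atoms_per_ring)

def one_part_excited(part_sizes):
    """Energy of exciting just one independent part (the cheapest one).

    The text argues that a motion built from independently movable parts
    cannot have an energy below this value.
    """
    return min(collective_estimate(size) for size in part_sizes)

if __name__ == "__main__":
    l = 1000
    # Two parallel rings: the "locked together" figure falls below the
    # energy of exciting one ring alone, which signals the error.
    print(collective_estimate(l, n_rings=2), one_part_excited([l, l]))
    # One large ring decomposed into small rings of 4 atoms each.
    print(collective_estimate(l), one_part_excited([4] * (l // 4)))
```

For a ring of 1000 atoms the naive figure is 1/2000 in these units, while a decomposition into rings of four atoms gives 1/8, which is why the large ring offers no low-lying state.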


Since we may well imagine that any motion of the 
atoms could be made up of combinations of motions of 
small groups (say 3 or 4 revolving about each other) 
the lowest energy is the excitation of one of these small 
groups. These we can identify with Tisza’s excited 
molecules or Landau’s rotons (but see next paragraph). 
Any such small ring of r atoms must, of course, have 
its first state of excitation of angular momentum $r\hbar$ (or
$p=\hbar/a$) because of the statistics. Landau’s argument²
that such angular momentum must have an excitation
energy was made in a way that does not involve the
statistics of the atoms. It is possibly equivalent to our
argument that large rings need not be considered if
their motion is analyzable in smaller parts. The central
importance of the statistics would not seem to be here 
but rather in the previous argument which shows that 
no states corresponding to the slow linear motion of a 
single atom are permitted. 

It is not obvious whether the lowest excited state, 
excluding the phonons, is actually a small ring of atoms 
turning. The arguments do not exclude the possibility 
that these are all higher than another type of mode; 
namely, the rapid motion of a single atom. In our cell
model a state which depends on atom A as $\exp(iK\cdot R_A)$,
with $Ka=2\pi$ is, of course, possible. Another possibility
is the analog of the excitation of a single atom in a
cell. (This may be the same as the single atom motion.)
All of these states differ from the ground state by a
finite energy. But which is lowest is hard to determine.

Any such excitation can, of course, move through the 
fluid. (In fact, the lowest state is that in which it has 
equal amplitude of being anywhere in the fluid.) That 
is, the wave function could vary as exp(iK-R), where R 
is the location of the center of excitation. Then the 
energy of these excitations might have the form,
suggested by Landau,² $\Delta+K^2/2\mu$, where $\Delta$ is the energy
needed to excite the ring or other excitation, and $\mu$ is
a sort of effective mass.

Our primary purpose was to show that no states 
close to the ground state exist, exclusive of the phonons. 
We are not yet able to calculate the energy, nor to give 
a clear picture of the other modes of excitation. 


DESCRIPTION OF SOME PROPERTIES OF THE LIQUID 


In concluding that only phonons exist at low temper- 
ature, we concur with the opinion of the phenomeno- 
logical theories. Therefore, a description of how some 
of the properties of helium arise, according to this 
model, will repeat much that has already been pointed 
out by others.¹,² We limit ourselves, therefore, to a
very brief summary from a kinetic theory point of view.

First, consider the motion of an object, such as a
small sphere, through the liquid. If the object is sta-
tionary at a fixed position R, the liquid may get into a
certain state which, omitting phonons for a moment,
is like the ground state (except that now part of the
space occupied by the object is not available to the
helium). Let the wave function of the helium be $\psi_R$.
It is a function of all the helium atoms and depends,
say parametrically, on R. If R is changed, $\psi$ is also,
but the energy of the fluid is not changed. For re-
arranging the fluid to a new shape at the same density
does not alter the energy. Now if we alter the R from
$R_1$ to $R_2$ nearby, we will only need to add a little
kinetic energy to push the helium atoms out of the
way. The overlap of $\psi_{R_1}$ and $\psi_{R_2}$ will be nearly perfect
(except near the surface of the object which is a
different volume of space in the two cases, $R=R_1$ and
$R=R_2$). The object can move therefore with an energy
equal to just the kinetic energy of itself and the liquid 
which flows around it. The fluid will move so that the 
curl of the velocity is zero, because circulation corre- 
sponds to permutation of atoms, and the Bose statistics 
will not permit such motions, as we have seen.⁹

What will be the losses of energy suffered by such an 
object? If it loses energy it can do so only by exciting 
the helium. First, can it excite the molecular excita- 
tions? These take a certain energy A to excite, but a 
massive object even moving very slowly may have 
sufficient energy. On the other hand, for such an object 
to change energy by A its momentum must change by 
an enormous amount. To create an excitation of very 
high momentum may take much more energy than A. 
As Landau has shown,² if the energy to excite a roton
of momentum p is taken to be of the form $\Delta+p^2/2\mu$,
the laws of conservation of energy and momentum
show that slowly moving objects [velocity less than
$(2\Delta/\mu)^{1/2}$] can produce no excitations. Likewise, objects
moving at velocities below that of sound cannot lose 
energy by creating phonons. Therefore, at absolute 
zero and for not too high velocity, a moving object 
will suffer no viscous drag. 
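
The velocity threshold can be sketched numerically. The reasoning assumed here is the usual kinematic one: a heavy object moving at velocity v that gives up momentum p loses energy of at most vp, so an excitation of energy E(p) can be created only if v is at least E(p)/p. The code below is an illustration in arbitrary units, with hypothetical function names, and it minimizes E(p)/p for the roton form $\Delta+p^2/2\mu$.

```python
import math

def roton_energy(p, delta, mu):
    """Excitation energy of the assumed form delta + p^2 / (2 mu)."""
    return delta + p**2 / (2.0 * mu)

def critical_velocity_numeric(delta, mu, n_grid=200001):
    """Minimum of E(p)/p over a momentum grid spanning 0 < p <= 4*sqrt(2*mu*delta)."""
    p_max = 4.0 * math.sqrt(2.0 * mu * delta)
    best = float("inf")
    for i in range(1, n_grid):
        p = p_max * i / (n_grid - 1)
        best = min(best, roton_energy(p, delta, mu) / p)
    return best

def critical_velocity_closed_form(delta, mu):
    """Analytic minimum of E(p)/p, reached at p = sqrt(2*mu*delta)."""
    return math.sqrt(2.0 * delta / mu)

if __name__ == "__main__":
    print(critical_velocity_numeric(1.0, 2.0), critical_velocity_closed_form(1.0, 2.0))
```

For phonons, E(p)/p is simply c, which gives the corresponding statement that objects slower than sound cannot create phonons.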

At low temperatures, there are, however, some 
phonons already existing in the liquid. They can scatter 
off of the object (changing their energy by the Doppler 
effect) and in this way the object can lose a little energy. 
A phonon of energy $\hbar\omega$ carries momentum $\hbar\omega/c$ and
behaves very much like a particle of mass $\hbar\omega/c^2$ moving
at velocity c. The phonons act in most respects like a 
gas of such particles, and the resistance suffered by our 
object is just like the viscosity that would be suffered 
by an object moving through such a gas. 

The actual calculation of this energy loss means a 
calculation of the viscous drag of such a gas. This 
requires a knowledge of the mean free path for collision 
among the phonons. The phonons scatter from one 
another because the medium is not linear. The speed of 
sound depends a little on the density. Therefore speak- 
ing classically, if a wave is present, another wave 
impinging finds the index of refraction varying sinusoidally
and is thereby scattered. The same thing
happens in the quantum mechanical system.

⁹ For a liquid contained in a simply connected region, the
circulation vanishes everywhere if it vanishes locally. But in a
region of connectivity like the inside of a torus, although the
curl is everywhere zero, the circulation around the ring may not
be zero. Such a circulation cannot be compounded of smaller
independent units. Therefore, it should be possible to demonstrate
circulatory motions in such a vessel which will maintain them-
selves for a long time. Circulation may be created by rotating the
vessel containing He I and cooling to the temperature desired
below the $\lambda$ point. Stopping the rotating vessel should leave the
interior liquid with a nearly permanent angular momentum,
which could be demonstrated, for example, by its gyroscopic
effects. The liquid must be completely confined with no free
surface because the exchange of atoms between the rotating
liquid and the stationary gas above it might cause a rapid damping
of the angular momentum.

The mean free path should rise rapidly as the temper- 
ature falls, because the density of phonons decreases 
(the scattering cross section also decreases). Interesting 
phenomena should result when this free path becomes 
comparable to the dimensions of the apparatus. 

If the viscosity is measured by connecting two 
vessels with a capillary, a different situation arises. The 
phonons cannot readily work their way through the 
long capillary, but the bulk liquid can move through it. 
There is work done on the phonons as the piston moves 
down in one of the vessels, again by the Doppler effect 
on the phonons bouncing off of the moving piston. 
But this work just goes into increasing the phonon 
energy—that is, the liquid in one vessel is heated, in 
the other cooled. If isothermal conditions are main- 
tained it goes as heat to the walls of the first container 
and from the walls of the second. No net work is done 
in this case and the viscosity appears to vanish. There 
is essentially no energy loss because there is no real 
viscous flow of our phonon “gas” through the capillary. 
The reason that, experimentally, resistance appears® if 
the flow velocity exceeds a certain critical velocity is 
not clear. Perhaps in passing sharp protuberances in the 
capillary wall the velocity locally exceeds that needed 
to create excitations or new phonons. It cannot very 
well be a kind of turbulence because presumably the 
velocity field should be always free of circulation. 

In two volumes of liquid helium connected by a 
capillary, the hotter one will exert the higher pressure 
(fountain effect). The larger number and higher average 
momentum of phonons in the hotter region results, 
from wall bombardment, in a higher pressure there. 
The pressure can only be released slowly by phonons 
passing through the long capillary. This would be the 
mechanism of heat conductivity through capillaries. 
The rate of such conduction would depend on the 
relative size of the capillaries and the phonon mean 
free path. 

If temperature varies from one point to another in 
the bulk liquid, then the phonon density varies. What
happens depends on the mean free path. If it is long 
compared to the distances over which the variations 
occur, the variations are almost immediately evened 
out by the diffusion of phonons rushing from one place 
to another (at the speed c). If the mean free path is 
shorter than the distances involved in the variation, no 
single phonon can go directly from a high- to low-density
region. Instead, a cooperative movement sets in. If we
consider the analogy to a gas of phonon “particles,” a
pressure variation is released by body motion, that is,
by sound waves. The speed of this sound is $3^{-1/2}$ times
the individual particle velocities. In our case, this
“second sound” representing waves of phonon density
(i.e., temperature) should travel at a velocity $3^{-1/2}c$. At
low temperatures it will be experimentally hard to keep 
the mean free path very small compared to the wave-
length, so that second sound would show appreciable
damping and dispersion. At extremely low temperatures
the free path may be larger than the apparatus. Then,
if a pulse of heat at one point creates extra phonons,
these will rush away at speed c so that temperature rise
will begin at a distant point delayed only by the time
required for first sound to traverse the apparatus.¹⁰


EFFECTS OF He³ ATOMS AT LOW CONCENTRATIONS


If a foreign atom, say an atom of He³, is in the liquid,
our arguments indicate that it will move about essenti-
ally as a free particle, albeit with an effective mass $m''$
larger than its true mass. (This $m''$ should be about one
atomic mass unit less than the effective mass $m'$ of a
He⁴ atom.) Such He³ atoms put into He⁴ at low concen-
tration should behave as a perfect gas. Consider, as an
example, a concentration of 0.1 percent. The mean
spacing of the atoms is so large that the gas is not
degenerate, except at a few hundredths of a degree.
(The statistics can only be of importance if the atoms
can permute. This occurs only if $\exp(-m''D^2kT/2\hbar^2)$
is not too small. Here, D is the mean spacing of the
atoms, which is 36 Å in our example.) The specific heat
contributed by the He³ is then $(3/2)k$ per atom. This can
exceed the specific heat of the phonons (below 0.4° in
our example). A temperature pulse would then go
mainly into increasing the energy of the He³ gas. The
speed of sound in this gas is of the order of the He³ atom
velocity, and therefore the observed second sound
velocity should vary as $(kT/m'')^{1/2}$ in this region. A
more detailed analysis of the intermediate region in
which both He³ and phonons contribute requires a
study of the collision cross sections for phonon-phonon,
phonon-He³, and He³-He³ collisions. Higher concentra-
tions of He³ require a study of the degenerate Fermi
gas. The entire analysis of this paper fails to apply to
pure He³, because the ground state from which we
begin is different.
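
The numbers quoted for the 0.1 percent example can be reproduced with a short calculation. This is a sketch under stated assumptions: the host spacing of 3.6 Å is the atomic spacing quoted in the Appendix, the degeneracy temperature is taken as the T at which the exponent $m''D^2kT/2\hbar^2$ equals one, and the value of 3 atomic mass units used for $m''$ is a hypothetical lower bound (the bare He³ mass), since the text gives no number for the effective mass.

```python
HBAR = 1.054571817e-34    # J s
K_B = 1.380649e-23        # J/K
AMU = 1.66053906660e-27   # kg
ANGSTROM = 1e-10          # m

def mean_spacing_angstrom(concentration, host_spacing_angstrom=3.6):
    """Mean spacing D of dilute atoms: the host spacing scaled by concentration^(-1/3)."""
    return host_spacing_angstrom * concentration ** (-1.0 / 3.0)

def degeneracy_temperature(spacing_angstrom, effective_mass_amu):
    """Temperature at which m'' D^2 k T / (2 hbar^2) = 1, in kelvin."""
    d = spacing_angstrom * ANGSTROM
    m = effective_mass_amu * AMU
    return 2.0 * HBAR**2 / (m * d**2 * K_B)

if __name__ == "__main__":
    d = mean_spacing_angstrom(1e-3)
    # effective_mass_amu=3.0 is an assumed illustrative value, not from the text
    print(d, degeneracy_temperature(d, effective_mass_amu=3.0))
```

With these inputs the spacing comes out at 36 Å and the temperature at a few hundredths of a degree, in line with the statements above; a larger effective mass lowers the temperature further.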

An atom of He³ should show an appreciably higher
free energy when dissolved in He⁴ than if that atom is
replaced by He⁴. This is because, as discussed in a
previous paper,³ for pure He⁴ the partition function is
the sum on all trajectories which start at some con-
figuration of the atoms $z_i$ and return to any permu-
tation of the original configuration $Pz_i$. With a He³
atom at $z_1$, say (and no others nearby), the final
configuration is limited to only those permutations for
which this atom returns to $z_1$. We can estimate the
effect as follows: Neglect the mass difference of He³
and He⁴. Consider the nondiagonal matrix element
$(z'|e^{-\beta H}|z)$ in which the final state differs from the
initial state only in that the atom $z_1$ is moved to another
site $z'_1$. All the other atoms may go to some permutation
of the original positions. From what we have said in I,
this should depend upon $z_1$ and $z'_1$ approximately


¹⁰ I am indebted to F. G. Brickwedde for calling my attention
to this phenomenon.




through a factor ($\beta=1/kT$),

$$(m'/2\pi\beta\hbar^2)^{3/2}\exp[-m'(z_1-z'_1)^2/2\beta\hbar^2], \qquad (1)$$

since the atom acts essentially as a free particle of mass
$m'$. To get the partition function if the atom 1 is He⁴,
we must sum this over all possible sites $z'_1$. At low
temperatures, where the diffusion distance $(2\beta\hbar^2/m')^{1/2}$
exceeds the atomic spacing $d=(V_A)^{1/3}$, this is approxi-
mately the integral of (1) over all $z'_1$ divided by $V_A$,
the atomic volume. This gives $V_A^{-1}$ for He⁴. For He³,
$z'_1$ must coincide with $z_1$ so (1) gives $(m'/2\pi\beta\hbar^2)^{3/2}$.
The ratio of the partition function for He⁴ to that in
which a He⁴ atom is replaced by He³ is therefore
$(m'kT/2\pi\hbar^2)^{-3/2}V_A^{-1}$ (at low concentration). The extra
free energy per atom of He³ is therefore $(3/2)kT\ln(2\pi\hbar^2/m''kTV_A^{2/3})$
at low temperatures. (Since He³ is lighter
than He⁴, $m'$ is replaced by $m''$.)¹¹


DISCUSSION 


A number of problems are suggested by this work. 

First, a detailed quantitative analysis of all of the 
properties of liquid He⁴ below 0.5°K should be under-
taken with the confidence that the problem is relatively
simple. Only the phonons should be involved. Their 
wavelengths are long compared to atomic dimensions, 
and we have to do essentially with a continuous 
medium. The statistical mechanical aspects are con- 
sidered in the appendix. The mean free path for phonon 
collisions could be computed if the nonlinearity of the 
medium is included. Work in this direction has been 
done by Landau and Khalatnikov.¹² An extension could
be made to include the effects of small concentrations
of He³.

A more difficult class of problem, and one which we 
have left completely untouched, is the answer to 
question (c) of the introduction. Namely, what is the 
detailed nature of the excitations involved at the higher 
temperatures of 1 to 2.2°K? Most of the experimental 
work has been done in this region. The atomic view-
point cannot claim a real understanding of the situation
in liquid He II until this problem is solved. (See note
added in proof.)*

There is a third group of problems which has not
been touched upon. They involve the question, 


(d) What is the mechanism of the Rollin film?® 


This seems to be a problem of the very low-temperature
behavior and should properly have been discussed in
this paper. A suggestion of Bijl, de Boer, and Michels¹³
involves the idea that the energy of a layer of the


¹¹ To the effect considered in the text, there must be added a
large free-energy difference at absolute zero, which arises from the
difference in zero-point energy occasioned by the difference in
atomic mass. In first approximation, the wave function is unaltered
but the kinetic energy, $-(\hbar^2/2m)\nabla^2$, is higher, for m is 3 instead
of 4. This difference is, therefore, close to ⅓ of the mean kinetic
energy, per He⁴ atom, in the ground state.

¹² L. D. Landau and I. M. Khalatnikov, J. Exptl. Theoret.
Physik (U.S.S.R.) 19, 637, 709 (1949).

¹³ Bijl, de Boer, and Michels, Physica 8, 655 (1941).
liquid depends strongly on the thickness of the layer,
decreasing for thicker layers, even up to 100 atoms
thick. If this view is correct, we should look for the
answer by studying the energy of the ground state, to
see if it is dependent on the shape of the container.
It is possible that such a dependence exists, even for 
thick layers, because of the very long permutation 
rings involved at low temperatures in the condensed 
phase. These rings are long enough to wander over the 
entire volume of the liquid, so that the energy may be 
sensitive to the shape. We have not yet been able to 
verify quantitatively the correctness of this idea. 

Finally, the problem of critical flow velocities, and 
the resistance to high-speed motions, remains unsolved. 

The analysis of pure liquid He³ requires a new start
because our physical arguments so far have depended 
so strongly on the Bose statistics. 

The author has profited from conversations with 
E. Wigner, H. A. Bethe, and R. F. Christy. 


APPENDIX 


In a previous paper,³ I, an approximate partition
function was proposed for liquid helium. Without 
modification it will not describe the phonon states 
correctly. The necessary modifications are discussed 
here. 

In I, it was noted that the partition function of 
helium is the integral over all configurations $z_i$ of the


quantity 
using the notation of that paper [I, Eq. (5)]. The
integral $\int_{trP}$ is taken over all trajectories $x_i(u)$ of the
atoms which start from the positions $x_i(0)=z_i$ and end
up at some permutation $x_i(\beta)=Pz_i$ of $z_i$. The sum is
taken on all permutations P. It was pointed out that
if a configuration $z_i$ contained atoms nearly overlapping,
or in some other unfavorable arrangement, the im-
portant trajectories $x_i(u)$ would almost immediately
move to release the energy of the unfavorable arrange-
ment (for example, overlapping atoms would spring
apart). The time for this was generally much less than
$\beta$. Thus the various configurations could be given a
weight $\rho(z_1, z_2\cdots z_N)=\rho(z^N)$. The slower motions of
atomic diffusion contributed an additional exponential
factor, so an approximate expression [I, Eq. (7)],
was proposed. The factor $\rho$ was to be (for low tempera-
tures, large $\beta$) nearly independent of $\beta$, as it represents
the effect of rapid local motions. It is $\phi(z^N)^2$, where $\phi$
is the ground-state wave function.

On the other hand, it was remarked (footnote 11 of I) 
that general variations in density over large distances 
would not be so rapidly released. We now consider in 
detail the effect of these compressional waves. 

To a sound wave of wave number K, and frequency
$\omega=cK$, where c is the speed of sound, correspond
phonons of energy $\hbar\omega$. A density fluctuation of this
wavelength $\lambda=2\pi K^{-1}$ would take a time $\tau$ of order
$1/\hbar\omega$ to decay (calling u “time” as in I). This time
will exceed $\beta$, for wavelengths $\lambda>2\pi\hbar c/kT=2\pi\cdot17$ Å/T°K.
Since this distance exceeds the atomic spacing
3.6 Å and the diffusion distance $(2\beta\hbar^2/m)^{1/2}=4.8$ Å/(T°K)$^{1/2}$,
there is clear separation of these waves from local
atomic motions. Therefore, an expression like (3) is
correct locally, for a density fluctuation over a small
region is rapidly released so that its effect can be
contained in $\rho$ (by having $\rho$ smaller for such fluctu-
ations). But a fluctuation over long distances will not
even out in a time $\beta$ and is not correctly described in (3).
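
The separation of length scales claimed above is easy to check with modern values of the constants. The following is a rough sketch rather than a reproduction of the original arithmetic: the sound speed of 238 m/s is an assumed round value for liquid helium at low temperature (the text does not state the value it used), the bare He⁴ mass is used for m, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34    # J s
K_B = 1.380649e-23        # J/K
AMU = 1.66053906660e-27   # kg
ANGSTROM = 1e-10          # m

def phonon_cutoff_wavelength(temperature_k, sound_speed=238.0):
    """2*pi*hbar*c/(k*T) in angstroms; sound_speed (m/s) is an assumed input."""
    return 2.0 * math.pi * HBAR * sound_speed / (K_B * temperature_k) / ANGSTROM

def diffusion_distance(temperature_k, mass_amu=4.0026):
    """(2*beta*hbar^2/m)^(1/2) in angstroms, using the bare He-4 mass."""
    return math.sqrt(2.0 * HBAR**2 / (mass_amu * AMU * K_B * temperature_k)) / ANGSTROM

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, phonon_cutoff_wavelength(t), diffusion_distance(t))
```

At 1°K this gives a wavelength a little over 100 Å and a diffusion distance a little under 5 Å, close to the figures quoted and well above the 3.6 Å atomic spacing in the first case.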

Choose a length $1/K_0$ exceeding the atomic spacing,
but below the wavelength of sound excited at the given
temperature ($\hbar cK_0\beta\gg1$). For distances inside $1/K_0$ no
new considerations are necessary, and (3) is locally
correct. For long distances it must be altered. Let
n(R) be the average number density at R in the
configuration $z_i$, the average being taken over a region
of volume $1/K_0^3$. We shall determine how the proba-
bility of this configuration depends on n(R) [the result
is (9) below].

We may describe the motions $x_i(u)$ as local atomic
movements (within distances $K_0^{-1}$) and general drift
motions of the center of gravity of the atoms in a
volume $K_0^{-3}$. It is convenient to describe the initial
density distribution by imagining that it arose from an
initially uniform distribution of density by a dis-
placement. If the atoms originally at R were displaced
by $D_0(R)$, the density is $n(R)=n_0(1+\nabla\cdot D_0)$, where
$n_0=V_A^{-1}$ is the density averaged over the entire fluid.
As the trajectories in (2) move, there is a general drift
which we will describe by giving the displacement
D(R, u) as a function of u. What is the energy associ-
ated with this drift? First, to the kinetic energy of the
atoms due to local motion there is an extra contribution
from the general drift $(m/2\hbar^2V_A)(\partial D/\partial u)^2$ per unit
volume. Further, suppose a region temporarily has an
extra high local density. This will limit the path-space
volume available (and also change the average mutual
potential energies). Therefore there will be an extra
factor accumulated in each little interval of time. This
we can write as a factor $e^{-E\,dV\,du}$ for the volume dV in
time du. This E (the energy resulting from the com-
pression) depends just on the density, hence on $\nabla\cdot D$.
We expand it in powers of $\nabla\cdot D$. The constant term
may be omitted by changing the zero from which we
measure energy. The linear term gives nothing, for its
integral over all the liquid vanishes, $\int\nabla\cdot D\,dV=0$, from
the conservation of mass. The quadratic term is
written conveniently as $(1/2)mc^2V_A^{-1}(\nabla\cdot D)^2$, for c
defined this way becomes the speed of sound. Higher-
order terms produce phonon-phonon scattering, but we
neglect them here. Thus, in addition to the features
which go to make up (3) locally, there is a factor
controlling the large scale motions,


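$$\int\exp\Big\{-\int_0^\beta\!\!\int\Big[\frac{m}{2\hbar^2V_A}\Big(\frac{\partial D}{\partial u}\Big)^2+\frac{mc^2}{2V_A}(\nabla\cdot D)^2\Big]dV\,du\Big\}\,\mathcal{D}D(R, u), \qquad (4)$$

where the D are to be summed over all displacement
fields such that¹⁴ $D(R, 0)=D_0=D(R, \beta)$, with
$\nabla\cdot D_0=V_An(R)-1$.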
This is best analyzed in momentum space. We put 


$$D(R, u)=\int A(K, u)\exp(iK\cdot R)\,d^3K\,(2\pi)^{-3}.$$

The A field can be separated into transverse components
$A_1$ and $A_2$ and a longitudinal component $B=A\cdot K/K$.
Thus

$$\nabla\cdot D=\int K\,B(K, u)\,e^{iK\cdot R}\,d^3K\,(2\pi)^{-3}, \qquad (5)$$
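and (4) becomes

$$\int\exp\Big\{-\frac{m}{2V_A\hbar^2}\int_0^\beta\!\!\int\Big[\Big(\frac{\partial B}{\partial u}\Big)^2+\Big(\frac{\partial A_1}{\partial u}\Big)^2+\Big(\frac{\partial A_2}{\partial u}\Big)^2+\hbar^2c^2K^2B^2\Big]\frac{d^3K}{(2\pi)^3}\,du\Big\}\,\mathcal{D}A_1\,\mathcal{D}A_2\,\mathcal{D}B. \qquad (6)$$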


The transverse displacements act as free particle
motions, for there is no restoring force. They contribute
a factor $(m/2\pi\beta\hbar^2)^{1/2}$ per mode. This is already contained
in (3) (but with $m'$ for m, which makes little difference
for the small fraction $V_AK_0^3$ of modes involved). We
need not count them again in (6).

For each longitudinal mode, we must integrate an 
expression of the form 


$$\exp\Big\{-\frac{m\delta}{2V_A\hbar^2}\int_0^\beta\Big[\Big(\frac{dB}{du}\Big)^2+\omega^2\hbar^2B^2\Big]du\Big\}\,\mathcal{D}B(u),$$

where $\delta$ is the K-space volume per mode for all B(u)
which begin and end at $B_0$. This may be easily done by
a method explained in another connection by the


¹⁴ In principle the final displacements could differ from the
initial by an atomic spacing d (because atoms may be permuted).
But the actual displacements permitted by (4) are very much
smaller than d, so that the only important term from (4) is that
for which the configuration is restored atom by atom.
author.¹⁵ There results ($\omega=cK$),

$$\Big(\frac{m\delta\omega}{2\pi V_A\hbar\sinh\omega\beta\hbar}\Big)^{1/2}\exp\Big[-\frac{m\delta\omega}{V_A\hbar}B_0^2\tanh\Big(\frac{\omega\beta\hbar}{2}\Big)\Big]. \qquad (7)$$


The complete partition function is the integral (2) 
over all initial configurations. We therefore integrate 
(7) over Bo to get the contribution from this mode, 
The integral yields 


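$$[2\sinh(\omega\beta\hbar)\tanh(\tfrac12\omega\beta\hbar)]^{-1/2}=[2\sinh(\tfrac12\omega\beta\hbar)]^{-1}=\exp(-\tfrac12\omega\beta\hbar)[1-\exp(-\omega\beta\hbar)]^{-1},$$

the usual partition function from such a mode. All the
modes together contribute

$$\exp\Big\{-\int\ln[2\sinh(\tfrac12\omega\beta\hbar)]\,d^3K\,(2\pi)^{-3}V\Big\}, \qquad (8)$$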


the integral extending over all modes of wave number 
less than Ko. This factor in the partition function gives 
the usual Debye specific heat, varying as T° as long as 
our temperatures are, as we have assumed, small enough 
that KochB>>1. 
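
The step from (7) to the single-mode partition function can be checked numerically. The sketch below is only an illustration under stated assumptions: it sets the combination mωδ/V_aħ to 1 (an arbitrary choice of units, since it drops out of the integral), writes x for ωβħ, and integrates (7) over B_0 with a simple midpoint rule. The function names are hypothetical.

```python
import math

def diagonal_density(b0, x, a=1.0):
    """Eq. (7) with a = m*omega*delta/(V_a*hbar) (set to 1 here, an assumption)
    and x = omega*beta*hbar."""
    prefactor = math.sqrt(a / (2.0 * math.pi * math.sinh(x)))
    return prefactor * math.exp(-a * b0 * b0 * math.tanh(0.5 * x))

def integrate_over_b0(x, half_width=40.0, steps=20000):
    """Midpoint-rule integral of (7) over the initial coordinate B_0."""
    h = 2.0 * half_width / steps
    total = 0.0
    for n in range(steps):
        b0 = -half_width + (n + 0.5) * h
        total += diagonal_density(b0, x)
    return total * h

def oscillator_partition(x):
    """exp(-x/2) / (1 - exp(-x)), the usual partition function of one mode."""
    return math.exp(-0.5 * x) / (1.0 - math.exp(-x))

if __name__ == "__main__":
    for x in (0.1, 1.0, 5.0):
        print(x, integrate_over_b0(x), oscillator_partition(x))
```

The two columns should agree to the accuracy of the quadrature for any positive x, which is the content of the identity sinh(x)·tanh(x/2) = 2 sinh²(x/2).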

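Multiplication of the factors (7) for all modes tells
us that density fluctuations have a probability
proportional to

$$ \exp\left[ -\frac{mc}{V_a\hbar}\int K\,B_0(\mathbf{K})^{2}\tanh\left(\frac{\hbar cK\beta}{2}\right) d^{3}K \right], \tag{9} $$

where KB_0 is the Fourier transform of the density
fluctuation [from (5)],

$$ K\,B_0(\mathbf{K}) = V_a\int \left(n(\mathbf{R}) - n_0\right) e^{-i\mathbf{K}\cdot\mathbf{R}}\,d^{3}R. \tag{10} $$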
Since, for large ωβħ, tanh(½ωβħ) is nearly 1, the density
fluctuations of short wavelength are independent of
the temperature. For these the factor (9) could just as
well be combined with the temperature independent
factor ρ to make a new effective ρ. This shows that
the results do not depend on the exact choice of K_0.
For long waves tanh(½ωβħ) falls below 1 and wider
fluctuations are permitted than would be expected from
(3). A more accurate representation of (2) is then (3)
with ρ containing general variations in density, these
variations being weighed by multiplying by the factor
(9), normalized. A factor (m'/2xBh®)"" should be 
replaced by (8). 

The purpose of this appendix is just to note that (3) 
does not automatically contain the phonon effects, but 
must be modified to include them. The study of such 
questions as the nature of higher-energy excitations 
and the character of the transition can presumably be 
made using (3) without modification. It is convenient 
that the sound wavelengths are so long that a nearly 
complete separation can be made of the local behavior 
and the behavior of the overlying compressional waves. 

¹⁵ R. P. Feynman, Phys. Rev. 84, 108 (1951), Appendix C.
The explicit answer is given in R. P. Feynman, Revs. Modern


Phys, 20, 367 (1948) on page 386, by substituting y=90, ¢;=40 
= Bo, h= —iV l/s, w=twh, T=8. 
It is argued that the wave function representing an excitation in liquid helium should be nearly of the form
Σ_i f(r_i)φ, where φ is the ground-state wave function, f(r) is some function of position, and the sum is taken
over each atom i. In the variational principle this trial function minimizes the energy if f(r) = exp(ik·r),
the energy value being E(k) = ħ²k²/2mS(k), where S(k) is the structure factor of the liquid for neutron
scattering. For small k, E rises linearly (phonons). For larger k, S(k) has a maximum which makes a ring
in the diffraction pattern and a minimum in the E(k) vs k curve. Near the minimum, E(k) behaves as
Δ + ħ²(k − k_0)²/2μ, which form Landau found agrees with the data on specific heat. The theoretical value
of Δ is twice too high, however, indicating need of a better trial function.

Excitations near the minimum are shown to behave in all essential ways like the rotons postulated by
Landau. The thermodynamic and hydrodynamic equations of the two-fluid model are discussed from this
view. The view is not adequate to deal with the details of the λ transition and with problems of critical
flow velocity.

In a dilute solution of He³ atoms in He⁴, the He³ should move essentially as free particles but of higher
effective mass. This mass is calculated, in an appendix, to be about six atomic mass units.


In a previous paper,¹ II, a physical argument was
given to interpret the fact that the excitations which
constitute the normal fluid in the two-fluid theory of
liquid helium were of two kinds. Those of lowest energy
are longitudinal phonons. The main result of that paper
was to give the physical reason for the fact that there
can be no other excitations of low energy. It was shown
that any others must have at least a minimum energy
Δ. No quantitative argument was given to obtain this
Δ nor to get an idea of the type of motion that such an
excitation represents. In this paper we expect to
determine Δ and the character of the excitations.

The physical arguments of II are carried a step 
further here to show that the wave function must be 
of a certain form. The form contains a function whose 
exact character is difficult to establish by intuitive 
arguments. However, the function can be determined, 
instead, from the variational principle as that function 
which minimizes the energy integral. 


THE WAVE FUNCTION FOR EXCITED STATES 


In II the exact character of the lowest excitation was
not determined, but various possibilities were suggested.
One is the rotation of a small ring of atoms. A second
is the excitation of an atom in the local cage formed
around it by its neighbors. Still a third is analogous to
the motion of a single atom, with wave number k about
2π/a, where a is the atomic spacing, the other atoms
moving about to get out of the way in front and to close
in behind. It is not clear that they are really distinct
possibilities, for they might be merely different ways of
describing roughly the same thing.

Fig. 1. Typical configuration of the atoms. If an excitation
represents rotation of a ring of atoms such as the six in heavy
outline the wave function must be plus if they are in the α
positions and minus if they are moved to the intermediate β
positions.

¹ R. P. Feynman, Phys. Rev. 91, 1291, 1301 (1953), hereafter
called I, II, respectively.

We shall now try to find the form of the wave func- 
tion which we would expect under the assumption that 
one or another of these possibilities is correct. It will 
turn out that all of the alternatives suggest the same 
wave function, at least to within a function f(r), of 
position r, which is determined only vaguely. 

First, suppose that the excitation is the rotation of a 
small ring of atoms. The number of atoms in the ring 
is determined, according to II, by the condition that it 
is the smallest ring that can be considered to be able 
to turn easily as an independent unit in view of the 
interatomic forces. For illustrative purposes we suppose 
this means that there are six atoms in the ring. 

We can describe the wave function for this excitation
by giving the amplitude associated with every configuration
of the atoms. Suppose Fig. 1 represents a typical
configuration, the six atoms of the ring in question (say
ring A) being indicated by heavy outline. We discuss
how the amplitude changes as we rotate this ring,
leaving the other atoms out of account for a moment.
Suppose the wave function is positive, say +1, if the
atoms are in the position shown by the full circles in
Fig. 1, which we arbitrarily call the α position. Suppose
all the six atoms move around together, and let the
ring turn about 60°. The atoms then appear again in an α
position, although which is which has been changed,
so the wave function, by the Bose statistics, is still +1.
On the other hand, for a 30° rotation, if the atoms are
located as indicated in the figure by dotted circles
(β position), the wave function will change to −1 for
the first excited state. We need only discuss the real
part of the wave function; the imaginary part, if any,
can be dealt with in a similar way. (Actually since we
deal with an eigenstate of the energy, the real part of
the wave function is an eigenfunction also.) For orientations
intermediate between α, β the function is correspondingly
intermediate between +1 and −1, but to
simplify the remarks we describe it for just the
configurations α, β. The wave function for excitation of this
ring we call ψ_A. It is +1 if the A ring is at α, and −1 if
at β, and does not depend on how other rings of atoms
are oriented. We can describe this wave function as
follows. Consider a function of position r in space,
f_A(r), which is +1/6 if r is at one of the six positions of
the centers of the atoms for the α position of ring A, is
−1/6 if it is at a β position, and is zero if r is at any
other place in the liquid far from the A ring. Then
consider the quantity Σ_i f_A(r_i) where the sum is taken
over all the atoms, i, in the liquid. For a configuration
of the liquid for which there are atoms at the six α
positions the quantity is +1, while if six atoms are at
β positions, it is −1. This suggests that we can write
ψ_A = Σ_i f_A(r_i).

Actually this is incomplete because it does not
correctly describe what happens if atoms in other parts
of the liquid move. If ring A is in the α position, we
wish the complete wave function to be +1 as far as this
is concerned, but to drop to zero if two atoms overlap
in other parts of the liquid, etc., just as for the ground
state. That is, we expect (disregarding normalization)

$$ \psi_A = \sum_i f_A(\mathbf{r}_i)\,\phi, \tag{1} $$

where φ is the ground-state wave function, a function
of all the coordinates. This takes care of another matter
also. What happens if some atoms are on α and some
on β? This should be of very small amplitude because
we do not wish the atoms to overlap on account of the
repulsions. This is not correctly described by Σ_i f_A(r_i),
but the φ factor does guarantee such a behavior. It is
small for such overlaps. Of course, if the ring contained
many atoms it could readjust just a little and the φ
would not prevent, for example, all those near one side
of the ring being α, and those on the opposite side of the
ring being β. We are not guaranteed that (1) will
describe well the amplitude for such a configuration. In
fact, it wouldn't be expected that a function of just one
variable could describe the motion of several atoms.
However, by the arguments of II the ring is supposed
to be small, in fact, so small that one part of the ring
cannot move independently of the rest. The ring is so
small that if one atom is at α, there cannot be a large
amplitude for finding atoms at β because of the
interatomic repulsions. This is represented in (1) by the
factor φ which falls if two atoms approach (see II for a
full description of the properties of φ).

Not knowing the exact size and shape of the ring we
cannot say what the exact function f_A(r) should be.
But at least we conclude in this case the excited-state
wave function is of the form

$$ \psi = \sum_i f(\mathbf{r}_i)\,\phi, \tag{2} $$

where f(r) is some function of position.

We might try to improve (1) by noting that, of
course, the energy should be essentially the same if the
excited ring were somewhere else in the liquid, say at B.
The function

$$ \psi_B = \sum_i f_B(\mathbf{r}_i)\,\phi \tag{3} $$

would describe this if f_B(r) is +1/6 for r at some one
of the six α positions of some other ring B, and −1/6
for intermediate β positions, and zero elsewhere. Or we
could locate the ring at still another position, etc.
Any one atom might be thought of as belonging to more
than one ring. This produces a kind of interaction
between adjacent rings. Because of this interaction, a
better wave function than (1) might be some linear
combination of these possibilities, say c_Aψ_A + c_Bψ_B + ···.
But we can still conclude that the form of the wave
function is given by (2), but now with the function
f(r) = c_A f_A(r) + c_B f_B(r) + ···, for any linear
combination of functions of the form (2) is still of this form.

If the lowest excited state which we seek were something
like the excitation of a single atom in a cage
formed from its neighbors we would guess the wave
function to be of the form (2) also, because there
would be a nodal plane across the cage, and we would
take f(r) to be positive if r is in the cage on one side of
the plane, and negative if on the other, and to fall off
to zero if r goes outside the cage. We do not care which
atom is in the cage so the sum on i is taken over all
atoms. Those which are outside the cage contribute
nothing to the sum, because f(r) is zero there. Further,
there is no appreciable amplitude for there being more
than one atom in the cage, because of the action of the
factor φ which is very small if the atoms penetrate each
other's mutual potential. The φ also takes care of the
fact that the atoms in remote parts of the liquid behave
independently of what the excited atom is doing, and
act just as in the ground state. Further, linear combinations,
representing the alternatives that the excited cage
may be located at different places in the liquid, are
still of the form (2).

The third possibility was only crudely described in II.
It was noted that if the atoms were considered as
roughly confined to cells, then a wave function
representing the motion of an atom A could be exp(ik·r_A),
where r_A is the position of A, and it is assumed that as
A moves about, the other atoms move around to make
way for it so that the density is maintained roughly
uniform. This would correspond in the liquid to a wave
function

$$ \exp(i\mathbf{k}\cdot\mathbf{r}_A)\,\phi, \tag{4} $$

where φ is the ground-state wave function of all the
atoms including A. The factor φ does the equivalent of
keeping the atoms in cells so that the density is nearly
uniform no matter where r_A is. For small k this is a
possibility only if atom A is different from the others
and does not obey the Bose statistics. If the symmetry
is taken into account then we must replace this by
the symmetrical sum

$$ \sum_i \exp(i\mathbf{k}\cdot\mathbf{r}_i)\,\phi. \tag{5} $$

If φ had no large scale density fluctuations this would
be no wave function at all, because there would be
just as many atoms in the region where exp(ik·r) is
positive as where it is negative and the sum cancels out.²
This is in concert with the idea that the wave function
cannot depend on where atom A is on a large scale. For
if A moves a long distance and the others readjust to
keep the density uniform on a large scale (the scale
1/k for small k), the result is just equivalent to the
interchange of atoms and the wave function cannot
change as a consequence of the Bose symmetry. On the
other hand, if while the atom moves from one position
to that of its neighbor the wave function changes sign
and returns, then (5) may be allowed. That is, something
like (5) with k of order 2π/a may be a possibility.
This again is of the form (2), but with f(r) = exp(ik·r).
The argument just given for this alternative is admittedly
not as complete as for the others, mainly
because the original idea of what the state is was
based on such a crude model of atoms in cells. Insofar
as the idea can be carried over to the case of the true
liquid perhaps we can say the form (5), or (2), will
represent it.

Since all the examples have led to the same form, we
might expect that a more general argument could be
made for the validity of (2). This is, in fact, possible
starting from the general argument given in II to show
why the excited states, other than phonons, can be
expected to have an excitation. It was pointed out
there that the excited-state function ψ must be orthogonal
to the ground state. For some configuration, say
α, of the atoms it acquires its maximum positive value.
Then it will be negative for some other, say β, which
represents some stirring from the α configuration without
change of large scale density (to avoid phonon
states). But stirring reproduces a configuration nearly
like α although with some atoms interchanged. Thus it
is hard to get the configuration β to be very far (in
configuration space) from α to keep the gradient of ψ small
in going from α to β.

Fig. 2. In general the lowest excitation energy results if the
configuration of atoms (solid circles, α) for which the wave
function is most positive is as far as possible from that
(dotted circles, β) for which it is most negative. All the β
positions must be as far as possible from α positions, therefore.


² We shall see later that (5), for small k, is actually a
satisfactory wave function because φ does have the long wave density
variations of the zero point motion of the sound field. We are
trying to get excited states orthogonal to phonon states, and (5)
for small k is not orthogonal. It is, in fact, just the wave function
for such a phonon state. This is discussed later.

The lowest state would have the β configuration as
far as possible from α. This means that in β as many
atoms as possible are moved from sites (call them α
positions) occupied by atoms in α. Hence β must be a
configuration in which the atoms occupy sites (β
positions) which are placed as well as possible between
the α positions. (See Fig. 2.) In all these configurations,
of course, the gross density must be kept uniform and
the atoms should be kept from overlapping, to avoid
high potential energy terms. If all atoms are on α
positions ψ is maximum positive, and if all on β,
maximum negative. The transition is made as smoothly as
possible, and the kinetic energy thereby kept down, if
for other configurations the amplitude is taken to be
just the number of atoms on α positions minus the
number on β positions. The number is just Σ_i f(r_i)
where f(r) is a function which is +1 if r is at an α
position, and −1 if at a β position (and varies smoothly
in between these limits as r moves about). It is of course
a modulation to be taken on φ, because we wish to give
small amplitude to configurations in which atoms overlap,
etc., just as in the ground state. We are led, therefore,
to (2). We can add the information that f(r) must
vary rapidly from plus to minus in distances of half an
atomic spacing. That is, we expect that f(r) will consist
predominantly of Fourier components of wave number
k of absolute magnitude k = 2π/a.

In the above argument it is not self-evident that
in going from the configuration of all atoms at α positions
to that of all at β, the amplitude must be just
linear in the number on α, N_α, minus the number on
β, N_β. Perhaps some other smooth function of this
number, like sin[π(N_α − N_β)/2N], might be better.
However, for the majority of possible configurations
N_α and N_β are nearly equal; in fact, for almost all,
(N_α − N_β)/N is of order ±N^{−1/2}. For such a small range
of the variable, the function, whatever it is, ought to
behave nearly linearly. If the wave function (2) is
wrong for a very few special configurations it will not
be important as we shall determine the energies by the
variational method, and the special configurations will
contribute only a small amount to the integrals because
of their small share of the volume in configuration space.
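
The counting behind the N^{−1/2} estimate can be illustrated with a crude model that is not in the text: assume each of the N atoms sits independently on an α or a β site with equal probability. Under that assumption the root-mean-square value of (N_α − N_β)/N follows from the binomial distribution, and the sine function mentioned above is nearly linear over that range. The function names below are hypothetical.

```python
import math

def rms_imbalance_fraction(n_atoms):
    """Exact rms of (N_alpha - N_beta)/N when each atom independently
    picks an alpha or beta site with probability 1/2 (a toy assumption)."""
    weighted = sum(
        math.comb(n_atoms, n_alpha) * (2 * n_alpha - n_atoms) ** 2
        for n_alpha in range(n_atoms + 1)
    )
    mean_square = weighted / 2 ** n_atoms  # exact integers, divided once
    return math.sqrt(mean_square) / n_atoms

def sine_nonlinearity(x):
    """Relative departure of sin(pi*x/2) from its linear approximation pi*x/2."""
    linear = 0.5 * math.pi * x
    return abs(math.sin(linear) - linear) / linear

if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, rms_imbalance_fraction(n), n ** -0.5)
    print(sine_nonlinearity(1.0e-3))
```

In this toy model the rms imbalance equals N^{−1/2} exactly, and for a macroscopic N the departure from linearity is far below anything that would affect the variational integrals.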


THE EXCITATION ENERGY 


We have concluded that a function of the form (2)
should be a good approximation to the wave function
of the excited state.²ᵃ The function f(r) is known only
imperfectly, however. We shall determine this function
f(r) by using the variational principle. The Hamiltonian
of the system is

$$ H = -\frac{\hbar^{2}}{2m}\sum_i \nabla_i^{2} + V - E_0, \tag{6} $$

where V is the potential energy of the system, and we
measure energies above the ground-state energy E_0,
so E_0 is subtracted in (6). Therefore the ground-state
wave function satisfies

$$ H\phi = 0. \tag{7} $$

If we write

$$ \psi = F\phi, \tag{8} $$

where F is a function of all the coordinates, then we
can verify, using (7), that

$$ H\psi = H(F\phi) = -\frac{\hbar^{2}}{2m}\sum_i\left(\phi\,\nabla_i^{2}F + 2\,\nabla_i\phi\cdot\nabla_iF\right) = \phi^{-1}\left(-\frac{\hbar^{2}}{2m}\right)\sum_i \nabla_i\cdot\left(\rho_N\nabla_iF\right), \tag{9} $$

where ρ_N(r^N) = φ² is the density function for the ground
state, that is, the probability of finding the configuration
r^N (we use r^N to denote the set of coordinates r_i
of all the atoms, and ∫···d^N r to represent the integral
over all of them).

²ᵃ Wave functions of this form have been proposed before, for
example by A. Bijl, Physica 7, 869 (1940). However, an argument
establishing their validity for large k has been lacking, and it
has not been clear that functions of other forms might not give
much lower states.
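
The second equality in (9) rests on the identity φ∇²F + 2∇φ·∇F = φ⁻¹∇·(φ²∇F). A minimal one-dimensional numerical check is sketched below. The Gaussian φ, the trial modulation F(x) = sin(Kx), and the value of K are illustrative assumptions chosen only so that the derivatives are known in closed form; they are not taken from the paper.

```python
import math

K = 1.3  # arbitrary wave number for the illustrative F(x) = sin(K x)

def phi(x):
    """Illustrative real 'ground state' (an assumption): a Gaussian."""
    return math.exp(-0.5 * x * x)

def dphi(x):
    return -x * phi(x)

def dF(x):
    return K * math.cos(K * x)

def d2F(x):
    return -K * K * math.sin(K * x)

def middle_form(x):
    """phi F'' + 2 phi' F', the bracket in the middle member of (9)."""
    return phi(x) * d2F(x) + 2.0 * dphi(x) * dF(x)

def divergence_form(x, h=1e-5):
    """phi^{-1} d/dx (phi^2 F'), outer derivative by central difference."""
    def flux(y):
        return phi(y) ** 2 * dF(y)
    return (flux(x + h) - flux(x - h)) / (2.0 * h) / phi(x)

def max_mismatch(points=(-1.5, -0.4, 0.0, 0.7, 1.9)):
    return max(abs(middle_form(x) - divergence_form(x)) for x in points)

if __name__ == "__main__":
    print(max_mismatch())
```

The mismatch is limited only by the finite-difference step, as expected for an exact identity. It is this divergence form that lets the energy integral (10) be written in terms of |∇F|² weighted by ρ_N.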

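The energy values come from minimizing the integral
(note φ is real)

$$ \mathcal{E} = \int \psi^{*}H\psi\,d^{N}r = \frac{\hbar^{2}}{2m}\sum_i\int (\nabla_iF^{*})\cdot(\nabla_iF)\,\rho_N\,d^{N}r, \tag{10} $$

subject to the condition that the normalization integral,

$$ \mathcal{I} = \int \psi^{*}\psi\,d^{N}r = \int F^{*}F\rho_N\,d^{N}r, \tag{11} $$

is fixed. The energy is then E = ℰ/ℐ.

In these expressions we must substitute

$$ F = \sum_i f(\mathbf{r}_i). \tag{12} $$

Consider the normalization integral first. It is

$$ \mathcal{I} = \sum_i\sum_j\int f^{*}(\mathbf{r}_i)\,f(\mathbf{r}_j)\,\rho_N\,d^{N}r. $$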

For a fixed i and j we can integrate first over all of the
other atomic coordinates. This integral on ρ_N gives the
probability for finding the ith atom at r_i and the jth
at r_j; therefore

$$ \mathcal{I} = \int f^{*}(\mathbf{r}_1)\,f(\mathbf{r}_2)\,\rho_2(\mathbf{r}_1,\mathbf{r}_2)\,d^{3}r_1\,d^{3}r_2, \tag{13} $$

where ρ_2 is the probability of finding an atom at r_1 per
cm³, and at r_2 per cm³. These density functions can
be defined in general by

$$ \rho_k(\mathbf{r}_1',\mathbf{r}_2',\cdots,\mathbf{r}_k') = \sum_i\sum_j\cdots\sum_n\int \delta(\mathbf{r}_i-\mathbf{r}_1')\,\delta(\mathbf{r}_j-\mathbf{r}_2')\cdots\delta(\mathbf{r}_n-\mathbf{r}_k')\,\rho_N(\mathbf{r}^{N})\,d^{N}r. \tag{14} $$

For example, ρ_1(r_1′) is simply the chance of finding an
atom at r_1′, for the liquid in the ground state. This is
independent of r_1′ and is the number density ρ_0 in the
ground state. In the same way ρ_2(r_1, r_2) can be written
as ρ_0 p(r_1 − r_2) where p is the probability of finding an
atom at r_2 per unit volume if one is known to be at r_1.
Except near the liquid surface it is a function of only
the distance from r_1 to r_2, so (13) is

$$ \mathcal{I} = \rho_0\int f^{*}(\mathbf{r}_1)\,f(\mathbf{r}_2)\,p(\mathbf{r}_1-\mathbf{r}_2)\,d^{3}r_1\,d^{3}r_2. \tag{15} $$


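The energy integral (10), with the substitution (12),
becomes

$$ \mathcal{E} = \frac{\hbar^{2}}{2m}\sum_i\int \nabla_if^{*}(\mathbf{r}_i)\cdot\nabla_if(\mathbf{r}_i)\,\rho_N\,d^{N}r. $$

The integral of ρ_N over all atomic coordinates except r_i
gives a result involving only ρ_1(r_i) = ρ_0. Therefore we
have simply

$$ \mathcal{E} = \rho_0\,\frac{\hbar^{2}}{2m}\int \nabla f^{*}(\mathbf{r})\cdot\nabla f(\mathbf{r})\,d^{3}r. \tag{16} $$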

The best choice of f is that which minimizes the ratio
of (16) to (15). The variation with respect to f* gives
the equation

$$ E\int p(\mathbf{r}_1-\mathbf{r}_2)\,f(\mathbf{r}_2)\,d^{3}r_2 = -\frac{\hbar^{2}}{2m}\nabla^{2}f(\mathbf{r}_1), $$

where the energy E is ℰ/ℐ. This has the solution

$$ f(\mathbf{r}) = \exp(i\mathbf{k}\cdot\mathbf{r}), \tag{17} $$

with the energy value

$$ E(k) = \frac{\hbar^{2}k^{2}}{2mS(k)}, \tag{18} $$

where S(k) is the Fourier transform of the correlation
function,

$$ S(\mathbf{k}) = \int p(\mathbf{r})\,\exp(i\mathbf{k}\cdot\mathbf{r})\,d^{3}r. \tag{19} $$

It is a function only of k, the magnitude of k.

It is readily verified that the solution is orthogonal
to the ground state if we exclude k = 0. In fact, the
solutions for different values of k are orthogonal to each
other. This is because they all belong to different
eigenvalues, ħk, of the total momentum operator

$$ \mathbf{P} = \frac{\hbar}{i}\sum_i\nabla_i, $$

as is directly verified from (2) with (17), taking Pφ = 0
since the ground state has zero total momentum.³ Since
this operator commutes with the Hamiltonian, we have
in (18) an upper limit to the energy for each value of k.
(In fact, we could have obtained (17) from (2) by this
argument.) Since we expect that (2) is a good wave
function for functions f which vary from plus to minus
in a distance of order a/2, we expect that (18) is not
only an upper limit, but also a good estimate of the
energy in a range of k's in the neighborhood of k = 2π/a.
In fact, our arguments suggest that E(k) should have a
minimum as a function of k in that region. These
expectations are verified in the next section.

³ The argument is not rigorous because the momentum of the
entire liquid can be changed without appreciable energy change
by moving the center of mass. This multiplies the wave function
by a factor like exp(−ik·N⁻¹Σ_i r_i). This function is so different
from (2), however, that the orthogonality is probably not
destroyed.


DISCUSSION OF THE ENERGY SPECTRUM 


To find the consequences of (18) we shall have to
discuss the behavior of S(k) defined in (19).

The function p(r_1 − r_2) gives the probability per unit
volume that a particle is at r_2 if one is known to be at r_1.
If r_2 is close to r_1 it is zero, for the atoms cannot overlap.
On the other hand, if r_2 coincides with r_1 there is an
atom there, so p contains a delta function δ(r_1 − r_2).
For large r_12 it approaches ρ_0. Since the structure of the
liquid ought to be more or less like that in a classical
fluid, as r increases from zero, p(r) probably rises to a
maximum at the nearest neighbor spacing, falls, then
rises again to a lower and wider maximum for next
nearest, and with rapidly decreasing smaller oscillations
approaches unity.⁴ The integral ∫(p(r) − 1)d³r vanishes
since the integral of ρ_2(r_1, r_2) with respect to r_2 is exactly
ρ_1 times the number of atoms N = ρ_0V.

The Fourier transform function S(k) is just the liquid
structure factor which determines the scattering of
neutrons (or x-rays, after multiplication by the atomic
structure factor) by the liquid at absolute zero. It is
therefore a quantity which can be directly determined
experimentally. For large k it approaches 1 because of
the delta function in p(r). It has a delta function at
k = 0, but this value of k is not of interest to us in (18),
because the wave function ψ must be orthogonal to the
ground state. The behavior at small k depends on the
variations of p(r) over long distances, that is, on long
wavelength density fluctuations. These are the zero
point fluctuations of the sound field in the ground state,
since for wavelengths longer than the atomic spacing
the approximation of a continuous sound field is good.
This may be analyzed as follows. The operator
representing the density at a point r is

$$ \rho(\mathbf{r}) = \sum_i\delta(\mathbf{r}-\mathbf{r}_i). \tag{20} $$

Its Fourier transform is

$$ q_{\mathbf{k}} = \int\rho(\mathbf{r})\,\exp(i\mathbf{k}\cdot\mathbf{r})\,d^{3}r = \sum_i\exp(i\mathbf{k}\cdot\mathbf{r}_i). \tag{21} $$

Evidently, the S(k) is the expected value of |q_k|² in
the ground state. For long wave sound q_k is just the
coordinate of the normal mode, so its mean square can
be easily determined, for example, by noting that the
mean potential energy is half of the ground-state energy
½ħω. In this way one finds S(k) = ħk/2mc for small k,
where c is the velocity of sound.

⁴ J. Reekie and T. S. Hutchison, Phys. Rev. 92, 827 (1953), have
determined p(r) by x-ray scattering.

The behavior of S(k) for intermediate k is familiar
to us from the x-ray studies of classical liquids. The
density distribution in the ground state is roughly
similar to such a liquid. There is some local structure
produced by the tendency of the atoms to stay apart.
This quasi-crystalline local order makes a maximum
in the S(k) curve for k near 2π/a. There may be smaller
subsidiary maxima for near multiples of this k. For
helium, because of the large zero point motion, these
maxima may be broader and less marked than in other
liquids. The main maximum is responsible for the main
ring in the x-ray diffraction pattern. It is shown clearly
in the preliminary neutron diffraction data reported
by Henshaw and Hurst.⁵

To summarize: with rising k, S(k) starts linearly as
ħk/2mc, rises then to a maximum near k = 2π/a, and
falls again to approach, with possible minor oscillations,
the limit unity. Consequently the quantity
E(k) = ħ²k²/2mS(k) should start linearly as ħkc, but should then
show a dip with a minimum at k = k_0 say, near 2π/a,
finally rising, eventually as ħ²k²/2m. These relations
are shown in Fig. 3.
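
The qualitative shape described above can be reproduced with a toy structure factor. The sketch below is purely illustrative: the model S(k) (a saturating linear rise plus a Gaussian peak), the peak position, height and width, and the sound velocity of 238 m/s are all assumptions chosen for the example and are not the measured data of Fig. 3. Energies are expressed in kelvin and wave numbers in Å⁻¹.

```python
import math

HBAR = 1.054571817e-34       # J s
K_B = 1.380649e-23           # J/K
M_HE4 = 4.002602 * 1.66053907e-27  # kg
C_SOUND = 238.0              # m/s, assumed sound velocity

A = HBAR ** 2 / (2.0 * M_HE4 * K_B) * 1.0e20  # hbar^2/2m in K * Angstrom^2
HBAR_C = HBAR * C_SOUND / K_B * 1.0e10        # hbar*c in K * Angstrom

def toy_structure_factor(k, k_peak=2.0, height=0.85, width=0.3):
    """Toy S(k): slope hbar/2mc at small k, a peak near k_peak, limit 1.
    All shape parameters are illustrative assumptions."""
    slope = A / HBAR_C  # equals hbar/(2 m c) in Angstrom
    smooth = 1.0 - math.exp(-slope * k)
    peak = height * math.exp(-((k - k_peak) ** 2) / (2.0 * width ** 2))
    return smooth + peak

def excitation_energy(k):
    """E(k) = hbar^2 k^2 / (2 m S(k)), Eq. (18), in kelvin."""
    return A * k * k / toy_structure_factor(k)

def roton_minimum(k_lo=1.5, k_hi=2.5, steps=1000):
    """Grid search for the minimum of E(k) between k_lo and k_hi."""
    grid = [k_lo + (k_hi - k_lo) * n / steps for n in range(steps + 1)]
    k0 = min(grid, key=excitation_energy)
    return k0, excitation_energy(k0)

if __name__ == "__main__":
    k0, delta = roton_minimum()
    print("k0 (1/Angstrom):", k0, " Delta (K):", delta)
    print("initial slope (K Angstrom):", excitation_energy(1e-3) / 1e-3, HBAR_C)
```

With these assumed parameters the curve starts with slope ħc, passes through a maximum, and dips to a minimum a little below the peak position of S(k) before rising as ħ²k²/2m. The numerical value of the minimum depends entirely on the assumed S(k) and should not be read as a prediction.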

We have argued that (2) should be a good approximation
to the wave function for functions that contain
wave numbers in the vicinity of 2π/a. Therefore we can
expect the energy values (18) to be good in the
neighborhood of this wave number. It is gratifying to see
that there is a minimum in this region. The minimum
value we shall call Δ. Ordinarily the variational method
only permits one to interpret the minimum value of E
as one varies a parameter such as k. On the other hand, in
our case each value of k has significance since these values
correspond to different eigenvalues of the momentum
operator, as has been remarked. Therefore we can
believe the behavior of the curve through a range of
k near k_0, where it behaves parabolically, so we can
write E(k) in Landau's form Δ + ħ²(k − k_0)²/2μ where
μ is a constant determining the curvature.

It is at first disconcerting that values of the energy
lower than Δ can be obtained by going to very small
values of k. But the energy here varies as ħkc, just that
expected for phonon excitation. In fact, a moment's
reflection shows that, for small k, the wave function (2)
is just that which represents phonon excitation.
Excitation of a given phonon means that the harmonic
oscillator representing the corresponding normal mode is in
the first excited state. The wave function is therefore
q_kφ, if q_k is the normal coordinate of the mode excited.
This coordinate is the Fourier transform of the density,
so (21) shows that (2) with (17) represents a phonon
for small k. Since the wave function is correct the
energy must be exact, and is therefore ħkc.

⁵ D. G. Henshaw and D. G. Hurst, Phys. Rev. 91, 1222 (1953).

Although we have made an argument only to show
that (2) should be valid for high k, we see now that it
is also valid for small k, that is for f(r) which vary
slowly. Since the energy curve is valid for the smaller k
and for a range about 2π/a, we can accept it as reasonable
for all k from zero up to and slightly beyond the
minimum.

On the other hand, for still larger k, another state of
lower energy exists with the same total momentum.
It is the state of double excitation, one of k_1, the other
of k_2, such that k_1 + k_2 = k and still E(k_1) + E(k_2) < E(k).
This becomes possible for k so high that the slope dE/dk
of the energy curve exceeds ħc, the initial slope. The
curve for very large k, therefore, does not have the same
validity as that for lower k, but we need not enter into
this matter, because at temperatures of a few degrees
such high-energy states would not be appreciably
excited. Such questions may be of importance in
discussing nonequilibrium phenomena. One process by
which the number of excitations can change is for an
excitation to pick up enough momentum that it can
divide spontaneously into two.

It is easy to misinterpret the meaning of the wave function

$$\psi = \sum_i \exp(i\mathbf{k}\cdot\mathbf{r}_i)\,\phi, \tag{22}$$

so a few remarks might be appropriate here. It looks at first, on inspection of the first factor, that this represents the excitation of a single particle. This is correct at very high $k$ ($ka \gg 2\pi$) and it is also correct for the ideal gas case for which the atoms do not interact ($\phi$ is constant then). But our arguments for intermediate $k$ show that this is not the case. Because of the correlations in position implied by the factor $\phi$, the motion of one atom implies the motion of others. Thus the factor in front of $\phi$ selects from that function certain correlated motions, in spite of the fact that each term in the factor depends on just one variable.

We can get a better idea of how this works by taking the extreme case of very low $k$. Here (22) represents a sound wave but at first sight there is no sign of the density variations that such a wave usually brings to mind. Let us take the real part and consider

$$\sum_i \cos(\mathbf{k}\cdot\mathbf{r}_i)\,\phi \tag{23}$$

for small $k$. Now, for most configurations allowed by $\phi$, the atoms are fairly uniformly distributed, so that there are just as many in the regions where the cosine is positive as where it is negative. Therefore the sum over all the atoms of $\cos(\mathbf{k}\cdot\mathbf{r}_i)$ is zero. The wave function is zero for nearly all configurations. It is only for the rare configurations in which the number in positive regions exceeds that in regions where the cosine is negative that the wave function does not vanish. In this way (23) selects configurations for which the mean density varies as $\cos(\mathbf{k}\cdot\mathbf{r})$. Since such density fluctuations are, according to the behavior of $\phi$, most likely produced by small cooperative motions of large numbers of atoms, the state described is very far from the one-particle state it would be if the cosine factor appeared alone, not multiplied by $\phi$.

FIG. 3. The upper curve gives the liquid structure factor determined from neutron diffraction (reference 5) and extrapolated to zero $k$. The lower curve gives the energy spectrum of excitations as a function of wave number (momentum$\cdot\hbar^{-1}$) which results from the formula $E = \hbar^2k^2/2mS(k)$ derived in the text. The initial linear portion represents excitation of phonons while excitations near the minimum of the curve, where it behaves as $\Delta + \hbar^2(k-k_0)^2/2\mu$, correspond to Landau's rotons. However, data on the specific heat indicate that the theoretical curve should lie lower, closer to the dashed curve.
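The selection argument for the cosine factor can be illustrated numerically. The sketch below is only a toy: it assumes one dimension and independent random atom positions (an ideal-gas stand-in, not configurations weighted by the true ground-state function), and the helper names are hypothetical. It compares the sum of cosines per atom for a uniform configuration with that for a configuration whose density varies as $1 + A\cos(kx)$.

```python
import math
import random


def cosine_sum(positions, k):
    """Return sum_i cos(k * x_i) for atoms at one-dimensional positions x_i."""
    return sum(math.cos(k * x) for x in positions)


def uniform_configuration(n, length, rng):
    """n atoms placed independently and uniformly on [0, length)."""
    return [rng.uniform(0.0, length) for _ in range(n)]


def modulated_configuration(n, length, k, amplitude, rng):
    """n atoms with density proportional to 1 + amplitude*cos(k x),
    drawn by rejection sampling (assumes 0 <= amplitude < 1)."""
    atoms = []
    while len(atoms) < n:
        x = rng.uniform(0.0, length)
        if rng.uniform(0.0, 1.0 + amplitude) < 1.0 + amplitude * math.cos(k * x):
            atoms.append(x)
    return atoms


rng = random.Random(0)
n_atoms, box = 20000, 1.0
k = 2.0 * math.pi / box            # one full wavelength in the box
amplitude = 0.3

flat_per_atom = cosine_sum(uniform_configuration(n_atoms, box, rng), k) / n_atoms
wavy_per_atom = cosine_sum(
    modulated_configuration(n_atoms, box, k, amplitude, rng), k) / n_atoms

print(flat_per_atom, wavy_per_atom)
```

For the uniform configuration the sum per atom is small (of order $1/\sqrt{n}$ in this toy), while for the modulated one it is close to $A/2$, which is the sense in which the factor picks out density waves.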

In the region of the energy minimum at $k_0$ the wave function represents a situation intermediate between the cooperative motion of phonons and the excitation of a single particle. Several atoms move together because of the correlations implied by $\phi$. It is hard to make a clear picture out of this vague idea. There is nothing to indicate that the state carries an intrinsic angular momentum. One must be careful because the state is degenerate, as all directions of $\mathbf{k}$ with the same magnitude $k_0$ give the same energy $\Delta$. Perhaps, if more complicated wave functions were tried, some special linear combination representing a kind of microscopic vortex ring or one with intrinsic angular momentum has in fact a lower energy. States of low $k$ will be called phonons, and states of momentum near $k_0$ will be called rotons in this paper, in accordance with the terminology of Landau,⁶ although we do not necessarily mean to imply that rotons carry intrinsic angular momentum or represent vortex motion.


MULTIPLE EXCITATION 


We have obtained the energy spectrum $E(k)$ of what we may call single excitations. They have the form of plane waves through the liquid. By taking linear combinations we can make wave packets that are more or less confined to a local region. Unless the region is very small there would be only a negligible energy addition required to do this. The remainder of the liquid is quiet as in the ground state. It is conceivable that another packet could be located somewhere far away and the energy would be close to $E(k_1)+E(k_2)$ if $\mathbf{k}_1$ and $\mathbf{k}_2$ are the momenta of the majority of the waves in each packet. Thus we should expect states with several excitations, the energy being the sum of the energies of each packet separately. This neglects a kind of interaction energy between them. It will be valid if the density of excitations is very small, but one cannot expect to apply it to situations in which the number of excitations is any appreciable fraction of the number of atoms in the liquid.

⁶ L. Landau, J. Phys. U.S.S.R. 5, 71 (1941); 8, 1 (1941). See also R. B. Dingle, Supplement to Phil. Mag. 1, 112 (1952).

Mathematically, if the function

$$F = \sum_i \exp(i\mathbf{k}_1\cdot\mathbf{r}_i)$$

gives $E(k_1)$ for the energy, we might expect the wave function for two excitations to correspond to $\psi = F\phi$, with

$$F = \Big[\sum_i \exp(i\mathbf{k}_1\cdot\mathbf{r}_i)\Big]\Big[\sum_j \exp(i\mathbf{k}_2\cdot\mathbf{r}_j)\Big]. \tag{24}$$

It is readily verified, by substitution into the variation integral, that the energy is $E(k_1)+E(k_2)$ within correction terms of order $1/V$, where $V$ is the volume of the entire fluid. This is just what one would expect if the excitations behaved like interacting particles, for the relative probability of their being within their range of interaction varies inversely as the volume. The expression (24) is unaltered on reversing the order of the factors, so the state in which the first excitation has momentum $\mathbf{k}_1$ and the second has $\mathbf{k}_2$ is the same as that in which the momenta are reversed. Thus the excitations obey Bose statistics. The expression (24) is not orthogonal to (17) with $\mathbf{k}=\mathbf{k}_1+\mathbf{k}_2$, so there undoubtedly are matrix elements between states of different numbers of excitations, and collisions must be possible which change this number. In summary, the excitations behave much like interacting Bose particles which may be created and destroyed, and whose energy as a function of momentum is given by $E(k) = \hbar^2k^2/2mS(k)$.
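As a quick check on the spectrum formula $E(k) = \hbar^2k^2/2mS(k)$, the sketch below evaluates it for two limiting structure factors. Both are assumptions made for illustration: the linear small-$k$ form $S(k) = \hbar k/2mc$, which turns the formula into the phonon law $E = \hbar ck$, and $S = 1$, which gives back the free-particle energy. The function names are hypothetical; the sound speed of 240 meters/sec is the value quoted in the text.

```python
import math

HBAR = 1.054571817e-34   # J s
M_HE4 = 6.6464731e-27    # kg, mass of one helium-4 atom


def excitation_energy(k, structure_factor):
    """E(k) = hbar^2 k^2 / (2 m S(k)); k in 1/m, result in joules."""
    return HBAR**2 * k**2 / (2.0 * M_HE4 * structure_factor(k))


def phonon_structure_factor(sound_speed):
    """Assumed small-k form S(k) = hbar k / (2 m c)."""
    return lambda k: HBAR * k / (2.0 * M_HE4 * sound_speed)


c = 240.0        # m/s, speed of sound quoted in the text
k = 0.2e10       # 0.2 reciprocal angstroms, expressed in 1/m

e_phonon = excitation_energy(k, phonon_structure_factor(c))
e_free = excitation_energy(k, lambda _k: 1.0)

print(e_phonon, HBAR * c * k)
print(e_free, HBAR**2 * k**2 / (2.0 * M_HE4))
```

Any tabulated $S(k)$ could be passed in place of the two model functions; the formula itself is all that is taken from the text.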


THERMODYNAMIC PROPERTIES OF HELIUM II 


From this we may determine the thermodynamic behavior of liquid helium at low temperature. At sufficiently low temperatures the number of excitations will be small, so the interactions between them can be neglected. The approximation of independence leads in the usual way to the formula for the Gibbs free energy (taking the ground-state energy as zero),

$$F = kTV\int \ln\{1-\exp(-\beta E[k])\}\,d^3k\,(2\pi)^{-3}, \tag{25}$$

with $\beta = 1/kT$. The number of excitations of momentum $\mathbf{k}$ is

$$n_k = [\exp\beta E(k) - 1]^{-1}. \tag{26}$$

We need not enter into further details as this has been thoroughly analyzed by Landau,⁶ who first proposed the form of energy spectrum we have deduced here. At low temperatures only the lowest energy excitations can become excited. That is, only the phonons are excited and the specific heat varies as $T^3$. At higher temperatures some of the states near the minimum of the curve, at $k_0$, become excited. The specific heat then rises rapidly, controlled predominantly by the $\exp(-\beta\Delta)$ factor governing the number of rotons excited. For temperatures of a few degrees few rotons are excited and only the phonon part, and the part of the curve near the minimum, are important. Landau⁷ has shown that one obtains good agreement with the specific heat (and with the measured values of the velocity of second sound) if one chooses the parameters $\Delta = 9.6$°K, $k_0 = 1.95$ Å⁻¹, and $\mu = 0.77m$. This means the energy curve near the minimum behaves as $2mE/\hbar^2 = 1.6 + 1.3(k-1.95)^2$, with $k$ in reciprocal angstroms (one Å⁻² corresponds to a temperature of 6°K). In the phonon region the curve is

$$2mE/\hbar^2 = 2.6k,$$

in the same units, if the speed of sound is 240 meters/sec. Henshaw and Hurst⁵ have published some preliminary data on the neutron scattering by liquid helium at 4.2°K. From it $S(k)$ may be directly determined (see Fig. 3). The curve for $E(k)$ calculated in this way behaves as

$$2mE/\hbar^2 = 3.0 + 1.0(k-2.0)^2$$

near the minimum (and is consistent with $2.6k$ for small $k$). This corresponds to a value of $\Delta$ of 18° which is impossibly large. Such a discrepancy may be due to the inaccuracy of the trial function (2), the true energy being lower than that calculated with this trial function. Such a large discrepancy in energy is discouraging, because the physical arguments did seem to indicate that (2) should be a reasonably good first approximation.
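The unit conversion behind the roton numbers can be checked with a few lines. This is a sketch under stated assumptions: the physical constants are standard values supplied here rather than taken from the paper, the effective mass is read as 0.77 in units of the helium atomic mass, and the function and variable names are hypothetical.

```python
HBAR = 1.054571817e-34   # J s
M_HE4 = 6.6464731e-27    # kg, mass of one helium-4 atom
K_B = 1.380649e-23       # J/K
ANGSTROM = 1.0e-10       # m

# Temperature equivalent of hbar^2 k^2 / 2m for k = 1 reciprocal angstrom.
KELVIN_PER_INV_A2 = HBAR**2 / (2.0 * M_HE4 * K_B) / ANGSTROM**2


def reduced_roton_curve(delta_kelvin, mu_over_m):
    """Coefficients (a, b) of 2mE/hbar^2 = a + b (k - k0)^2, k in 1/angstrom,
    for a roton minimum E = Delta + hbar^2 (k - k0)^2 / (2 mu)."""
    return delta_kelvin / KELVIN_PER_INV_A2, 1.0 / mu_over_m


# Parameters fitted to the specific heat, as quoted in the text.
a_fit, b_fit = reduced_roton_curve(9.6, 0.77)

# Gap implied by the constant term 3.0 of the curve computed from S(k).
delta_from_structure_factor = 3.0 * KELVIN_PER_INV_A2

print(KELVIN_PER_INV_A2)             # about 6 K per reciprocal angstrom squared
print(a_fit, b_fit)                  # close to the 1.6 and 1.3 of the text
print(delta_from_structure_factor)   # close to the 18 degrees of the text
```

The constant term 3.0 against 1.6 is the factor of about two between the calculated and fitted values of the gap discussed above.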

The expression (25) should not hold at high temperature because it neglects the interactions among the large number of excitations which (26) demands at such a temperature. Without an estimate of these interactions it is hard to judge the region in which deviations are to be expected. We shall make a very rough preliminary argument here.

To the approximation that the energy in a mode is proportional to an integer $n$, this mode behaves like a harmonic oscillator. The coordinate of this oscillator $q_k$ has a mean square value $2n+1$ times its value in the ground state. For what size $q_k$ is the harmonic oscillator approximation poor? If we knew this we could put a limit on the ranges of $q_k$ and hence of $n_k$, for which (25) might be expected to be valid. In our case the various $q_k$ from (21) are not independent, because they can all be defined in terms of the same $3N$ variables $\mathbf{r}_i$. Thus, for example, if $q_k$ is known for $3N$ values of $\mathbf{k}$, it is known in principle for all others. It is very difficult to see what this interdependence means to (25).

⁷ L. Landau, J. Phys. U.S.S.R. 11, 91 (1947); Phys. Rev. 75, 884 (1949).

But one can notice that one restriction is 


f lox |20°k(2n) =, (27) 


where the integral is taken over all $\mathbf{k}$. Hence we may guess that we should restrict (25) by the condition that the excess $|q_k|^2$ over that for the ground state, summed on all states, cannot exceed V. This excess for a given mode is $2n_k$ times the ground-state mean value of $|q_k|^2$. This latter is $S(k) = \hbar^2k^2/2mE(k)$, so we obtain the restriction


Wnt f (B/E(H) ma R2ny9=1. (28) 


The thermodynamics which would result (by adding a chemical potential $\mu$ to $E(k)$ in (25), (26) to allow for (28)) will show a second-order transition.⁸ But without a deeper analysis we are in no position to take Eq. (28) literally. We can only use it as a rough criterion for validity of (25). If the integral on the left is much less than 1, then (25) should hold. At 2°K the integral amounts to roughly 0.2, so perhaps we are entitled to trust (25) even to within a few tenths of a degree of the transition temperature.


MOTION OF THE FLUID AS A WHOLE 


The existence of such excitations moving as nearly free particles in a background fluid is the central concept of the two-fluid model of Tisza⁹ and Landau. The consequences of these ideas for excitations with a spectrum such as (18) have been carefully analyzed in a general manner by Landau and Dingle.⁶ There is nothing to add that is new in this direction. However, we shall review briefly how the equations of this model arise, emphasizing the behavior of the wave function.

Beside the states which represent local internal excitation of part of the fluid, there are, of course, states in which the entire body of fluid moves. In general, in these cases the boundaries of the fluid move also. For example, at absolute zero, the entire fluid may move as a body with velocity $\mathbf{v}$. This center-of-gravity motion is described by

$$\psi = \exp\Big(im\mathbf{v}\cdot\sum_i \mathbf{r}_i\Big)\phi,$$

if we assume $\phi$ corresponds to the ground state at rest in the laboratory system.

Suppose we wish to represent a situation in which the velocity $\mathbf{v}(\mathbf{r})$ varies from point to point, but only very gradually on an atomic scale. We might try something like this. The atoms in a region about some point $P$ have their center of gravity moving at velocity $\mathbf{v}_P$ corresponding to this point. They must contribute a phase $m\mathbf{v}_P\cdot\sum_i\mathbf{r}_i$, where the sum is taken only over those near $P$. Corresponding contributions would come from sums near other points so the total factor ought to be $\exp[im\sum_i\mathbf{v}(\mathbf{r}_i)\cdot\mathbf{r}_i]$. The wave function is, therefore, of the form

$$\psi = \exp\Big[i\sum_i s(\mathbf{r}_i)\Big]\phi, \tag{29}$$

where $s(\mathbf{r})$ is some function of position. We have suggested that it is $m\mathbf{v}(\mathbf{r})\cdot\mathbf{r}$. However, as is usual when one has waves whose wavelength varies from point to point, the wave number is not the phase divided by $r$, but more accurately it is the gradient of the phase. Therefore (29) represents the fluid in motion, the velocity at any point being given by

$$\mathbf{v}(\mathbf{r}) = m^{-1}\nabla s. \tag{30}$$

As a consequence of (30), $\nabla\times\mathbf{v}=0$. Velocity fields for which this is not true cannot be represented in such a simple manner, and represent, as we have seen in II, states involving large numbers of excitations. The problem they present is being studied. For regions which are not simply connected, such as a torus, $s$ need not be single-valued. For example, in the torus we could take $s=\theta$, the cylindrical angle. This would represent a permanent circulation¹⁰ even though $\nabla\times\mathbf{v}=0$ locally.

⁸ This conclusion is modified if the interaction of the rotons and the hydrodynamic modes is taken into account.

⁹ L. Tisza, Phys. Rev. 72, 838 (1947).

Substitution of (29) into the variational principle to obtain a steady-state solution leads [see (10), (11) with $F=\exp(i\sum_i s(\mathbf{r}_i))$] to the energy expression

$$\frac{\rho_0}{2m}\int \nabla s\cdot\nabla s\,d^3r, \tag{31}$$

which is the kinetic energy $\rho_0mv^2/2$ per unit volume. It is minimum for variations in $s$ if $\nabla\cdot(\nabla s)=0$, that is, $\nabla\cdot\mathbf{v}=0$. The flow must be incompressible. We have not allowed, in (29), for variations in density. For a singly connected region this has but one solution $\mathbf{v}=0$, unless the boundaries move. In a multiply connected region, like a ring, circulation of angular momentum in multiples of $N\hbar$ is possible. There are so few of these special states that the statistical mechanics is not affected. The variables, $\mathbf{v}(\mathbf{r})$, representing such motions can be specified as external known variables like pressure and volume.
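The statement about circulation in a ring can be made explicit with a short worked step. It is a sketch, not part of the original argument: it assumes that the phase is an integer multiple of the cylindrical angle, $s = n\theta$, so that the wave function returns to itself after one circuit, and it restores the factor $\hbar$ that the phase convention of (29) leaves implicit.

```latex
\oint \mathbf{v}\cdot d\mathbf{l}
  = \frac{\hbar}{m}\oint \nabla s\cdot d\mathbf{l}
  = \frac{\hbar}{m}\,\bigl[s\bigr]_{\text{one circuit}}
  = \frac{2\pi n\hbar}{m},
  \qquad n = 0, \pm 1, \pm 2, \ldots
```

With $s = n\theta$ each atom at radius $r$ moves at speed $n\hbar/mr$ and so carries angular momentum $n\hbar$ about the axis; summed over $N$ atoms this gives $nN\hbar$, the multiples quoted above.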


EXCITATIONS IN A MOVING FLUID 


Next we study the motion and energy of an excitation in a moving fluid. The wave function is

$$\psi = \sum_i f(\mathbf{r}_i)\exp\Big[i\sum_j s(\mathbf{r}_j)\Big]\phi.$$

¹⁰ It was suggested in II, reference 9, that to observe this experimentally one might have to avoid letting the liquid have a free surface. But R. Peierls has pointed out (private discussion) that although the atoms evaporating from the moving liquid to the gas carry angular momentum out, only those of the gas which are moving along with the liquid can condense, bringing back angular momentum, so in equilibrium there would be no damping of the motion from this effect. There are other effects, however, such as those which cause resistance to capillary flow at high velocities, which might be expected to damp the motion.

When this is put into the variational principle, one finds directly

$$\mathcal{E}/\mathcal{I} = E(k) + m\int \mathbf{j}(\mathbf{r})\cdot\mathbf{v}_s(\mathbf{r})\,d^3r + \tfrac{1}{2}m\int \rho(\mathbf{r})\,\mathbf{v}_s(\mathbf{r})\cdot\mathbf{v}_s(\mathbf{r})\,d^3r, \tag{32}$$

where $E(k)$ is given in (18), $\mathbf{v}_s(\mathbf{r}) = m^{-1}\nabla s$, and we have defined

$$\rho(\mathbf{a}) = \int \rho_3(\mathbf{a},\mathbf{r}_1,\mathbf{r}_2)\,f^*(\mathbf{r}_1)f(\mathbf{r}_2)\,d^3r_1\,d^3r_2/\mathcal{I}, \tag{33}$$

$$\mathbf{j}(\mathbf{a}) = \hbar\int \rho_2(\mathbf{a},\mathbf{r})\,[f^*(\mathbf{r})\nabla f(\mathbf{a}) - f(\mathbf{r})\nabla f^*(\mathbf{a})]\,d^3r/2mi\mathcal{I}, \tag{34}$$

with

$$\mathcal{I} = \int \rho_2(\mathbf{r}_1,\mathbf{r}_2)\,f^*(\mathbf{r}_1)f(\mathbf{r}_2)\,d^3r_1\,d^3r_2. \tag{35}$$

These $\rho$, $\mathbf{j}$ are the expected values of density and current density that belong to the state representing a single excitation.

We shall analyze this for the case of a single excitation of nearly definite momentum in the form of a large wave packet, large compared with the central wavelength. That is, we take $f(\mathbf{r})=\exp(i\mathbf{k}\cdot\mathbf{r})g(\mathbf{r})$ where $g(\mathbf{r})$ is a smooth amplitude function, such as a Gaussian, with width very large compared to $1/k$, but small compared to the size of the vessel. Such a packet will drift and spread slowly in a way completely determined by $E(k)$ and the principle of superposition. We wish to determine the additional effects of the possibility of general liquid flow.

The current density associated with this excitation is found from (34). We make the approximation that $\nabla f=i\mathbf{k}f$ since $g$ varies so slowly, obtaining

$$\mathbf{j}(\mathbf{a}) = \frac{\hbar\mathbf{k}}{2m}\int \rho_2(\mathbf{a},\mathbf{r})\{g^*(\mathbf{r})g(\mathbf{a})\exp[-i\mathbf{k}\cdot(\mathbf{r}-\mathbf{a})] + g(\mathbf{r})g^*(\mathbf{a})\exp[+i\mathbf{k}\cdot(\mathbf{r}-\mathbf{a})]\}\,d^3r/\mathcal{I}.$$

Now, because of the variation of the exponentials, contributions to this integral come only from $\mathbf{r}$ within a limited distance from $\mathbf{a}$. Within such a distance $g(\mathbf{r})$ is nearly the same as $g(\mathbf{a})$, so all the $g$ factors can be evaluated at $\mathbf{a}$ and taken outside the integral. The integral on $\mathbf{r}$ is then easy by (19) and one finds

$$\mathbf{j}(\mathbf{a}) = \hbar\mathbf{k}\rho_0S(k)\,|g(\mathbf{a})|^2/m\mathcal{I}.$$

The normalization integral may be done in a similar manner. It is

$$\mathcal{I} = \int\!\!\int g^*(\mathbf{r}_1)g(\mathbf{r}_2)\,\rho_2(\mathbf{r}_1,\mathbf{r}_2)\exp[-i\mathbf{k}\cdot(\mathbf{r}_1-\mathbf{r}_2)]\,d^3r_1\,d^3r_2.$$

Since $\mathbf{r}_2$ must be near $\mathbf{r}_1$ for a large contribution, we may replace $g(\mathbf{r}_2)$ by $g(\mathbf{r}_1)$, integrate $\mathbf{r}_2$ directly and obtain

$$\mathcal{I} = \rho_0S(k)\int |g(\mathbf{r})|^2\,d^3r.$$

If we assume $g(\mathbf{r})$ is normalized, $\int|g(\mathbf{r})|^2d^3r=1$, so that $|g(\mathbf{a})|^2=d(\mathbf{a})$ is the density in the packet at $\mathbf{a}$, or roughly the probability of finding the excitation at $\mathbf{a}$, the current is

$$\mathbf{j}(\mathbf{a}) = (\hbar\mathbf{k}/m)\,d(\mathbf{a}), \tag{36}$$

that is, a total current $\hbar\mathbf{k}/m$ distributed at density $d(\mathbf{a})$.
The particle density at $\mathbf{a}$ is

$$\rho(\mathbf{a}) = \int \rho_3(\mathbf{a},\mathbf{r}_1,\mathbf{r}_2)\,g^*(\mathbf{r}_1)g(\mathbf{r}_2)\exp[-i\mathbf{k}\cdot(\mathbf{r}_1-\mathbf{r}_2)]\,d^3r_1\,d^3r_2/\mathcal{I}. \tag{37}$$

The points $\mathbf{r}_1$ and $\mathbf{r}_2$ must be close. If point $\mathbf{a}$ is not close to these points we can use the asymptotic form,

$$\rho_3(\mathbf{a},\mathbf{r}_1,\mathbf{r}_2) = \rho_0\,\rho_2(\mathbf{r}_1,\mathbf{r}_2), \tag{38}$$

to show directly that $\rho(\mathbf{a})$ is the density $\rho_0$ of the fluid, far from the packet. It is nearly so, even in the region of the packet, for since its dimensions are large, $\mathbf{a}$ is nearly always far from $\mathbf{r}_1$, $\mathbf{r}_2$, and further, the integral over all $\mathbf{a}$ of $\rho_3(\mathbf{a},\mathbf{r}_1,\mathbf{r}_2) - \rho_0\rho_2(\mathbf{r}_1,\mathbf{r}_2)$ is exactly zero.

It is true that the distance of influence in $\rho$ may not be very small, because of the correlations in the sound field. That is, the excitation produces a small strain in the fluid which makes a field of stress in the vicinity. Such fields provide a mechanism of interaction between excitations (as well as a correction to the energy of one). In a more detailed analysis such effects should be taken into account. Here we proceed to a first approximation and neglect them. To the approximation of neglecting compressibility, then, we find $\rho(\mathbf{a})=\rho_0$; the presence of an excitation does not change the fluid density.

Thus we picture an excitation in the form of a drifting wave packet as carrying a total current $\hbar\mathbf{k}/m$, and drifting (if $\mathbf{v}_s=0$) at the group velocity $\mathbf{v}_g=\partial E/\partial\mathbf{k}$, but as not appreciably altering the density.

This clearly violates the conservation of matter. For a moment we overlook this difficulty. It is discussed in the section following the next.

If this packet is in a general velocity field $\mathbf{v}_s(\mathbf{r})$ we may determine its energy from (32). In the integral $\int\mathbf{j}(\mathbf{r})\cdot\mathbf{v}_s(\mathbf{r})\,d^3r$ we shall assume that $\mathbf{v}_s$ does not vary appreciably over a region as small as the packet, and may be taken outside the integral sign. The integral of $\mathbf{j}$ is then $\hbar\mathbf{k}/m=\mathbf{p}/m$, giving the following results:

The energy of an excitation in a moving fluid is

$$E = E(p) + \mathbf{p}\cdot\mathbf{v}_s, \tag{39}$$

where $\mathbf{v}_s$ is the velocity of the fluid where the excitation (considered as a packet) is located. The total momentum associated with the packet is $\mathbf{p}$, and it contributes a current $\mathbf{p}/m$ to the total in the fluid. The energy contributed by the moving fluid has density $\rho_0mv_s^2/2$, and it contributes to the current density $\rho_0\mathbf{v}_s$.

The group velocity of the excitation is $\partial E/\partial\mathbf{p}$ so that $\mathbf{v}_g=\mathbf{v}_{g0}+\mathbf{v}_s$ where $\mathbf{v}_{g0}$ is the group velocity in liquid at rest, $\partial E(p)/\partial\mathbf{p}$. Thus the excitation just drifts along with the background fluid motion, of velocity $\mathbf{v}_s$. Equation (39) can be obtained much more simply by a Galilean transformation of coordinates. Indeed, it is in this way that it was obtained by Landau and Dingle.⁶
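The drift of an excitation with the background can be checked numerically from (39). The sketch below is one-dimensional and uses hypothetical reduced units in which the roton curve near its minimum is $1.6 + 1.3(p - 1.95)^2$ (the fitted form quoted earlier, with momentum measured as a wave number); the function names are illustrative only. The group velocity is taken as a central finite difference of the energy with respect to $p$.

```python
def roton_energy(p, delta=1.6, p0=1.95, curvature=1.3):
    """Energy near the roton minimum in reduced units (one dimension)."""
    return delta + curvature * (p - p0) ** 2


def energy_in_moving_fluid(p, v_s):
    """Eq. (39) in one dimension: E = E(p) + p * v_s."""
    return roton_energy(p) + p * v_s


def group_velocity(energy, p, step=1.0e-5):
    """Central-difference estimate of dE/dp at momentum p."""
    return (energy(p + step) - energy(p - step)) / (2.0 * step)


v_s = 0.3      # background fluid velocity, reduced units (assumed value)
p = 2.1        # a momentum slightly above the roton minimum

v_rest = group_velocity(roton_energy, p)
v_moving = group_velocity(lambda q: energy_in_moving_fluid(q, v_s), p)
v_at_minimum = group_velocity(lambda q: energy_in_moving_fluid(q, v_s), 1.95)

print(v_rest, v_moving, v_at_minimum)
```

The difference `v_moving - v_rest` equals the background velocity, and a roton exactly at the minimum, which has no group velocity in liquid at rest, simply moves with the fluid.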


RELATION TO THE TWO-FLUID MODEL 


The two paragraphs at the end of the previous section 
contain the main relations by which the hydrodynamics 
and thermodynamics of the two-fluid model is derived. 
We review here a few of the steps, very briefly, in order 
to make clear the relation of the excitations to what 
is called the normal fluid. For further details see 
reference 6. 

Thermodynamic equilibrium results, for a system with Bose statistics, if the excitations are distributed so that the number of those with energy $E$ is

$$n = [\exp(\beta E) - 1]^{-1}.$$

This may be obtained, for example, by maximizing the entropy, keeping the total energy constant. Another distribution which is also in equilibrium can be got by maximizing the entropy, keeping both total energy and total momentum constant. It is

$$n = \{\exp[\beta(E - \mathbf{p}\cdot\mathbf{u})] - 1\}^{-1}, \tag{40}$$

where $\mathbf{u}$ is a constant.

In our liquid, the density of excitations per unit volume at a point $\mathbf{r}$ in this case would be, substituting (39) into (40),

$$n = \{\exp[\beta(E(p) + \mathbf{p}\cdot\mathbf{v}_s(\mathbf{r}) - \mathbf{p}\cdot\mathbf{u})] - 1\}^{-1}. \tag{41}$$

In order to interpret $\mathbf{u}$, we study the total current density. Since each excitation contributes a current $\mathbf{p}/m$ the total current density contributed by the excitations is

$$m^{-1}\int \mathbf{p}\,\{\exp[\beta(E(p) + \mathbf{p}\cdot(\mathbf{v}_s - \mathbf{u}))] - 1\}^{-1}\,d^3p\,(2\pi\hbar)^{-3}.$$

At this stage we shall only consider the case of low macroscopic velocities. Expanding to the first order in $\mathbf{v}_s-\mathbf{u}$ this may be written in the usual way as

$$\rho_n(\mathbf{u} - \mathbf{v}_s),$$

where $\rho_n$ is defined as

$$\rho_n = \frac{\beta}{3m}\int p^2\exp[\beta E(p)]\,(\exp[\beta E(p)] - 1)^{-2}\,d^3p\,(2\pi\hbar)^{-3}. \tag{42}$$

To this we must add the current of the background $\rho_0\mathbf{v}_s$, so that the total macroscopic current density can be written

$$\mathbf{j} = \rho_n\mathbf{u} + \rho_s\mathbf{v}_s, \tag{43}$$

if we put $\rho_s=\rho_0-\rho_n$.

In view of these separations we can say, artificially, that the liquid behaves as though there were two parts, superfluid at density $\rho_s$ moving at velocity $\mathbf{v}_s$, and normal at density $\rho_n$ and velocity $\mathbf{u}$ (which we write hereafter as $\mathbf{v}_n$). The current is the sum of these two partial currents. In a similar manner the change of the internal energy, at constant entropy, produced by the velocities can be shown to second order to be the sum of the kinetic energies $\tfrac{1}{2}\rho_sv_s^2+\tfrac{1}{2}\rho_nv_n^2$.
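For phonons the normal-fluid density integral can be done in closed form, which gives a convenient check. The sketch below rests on two assumptions that should be read as such: that (42) is the first-order expansion of the Bose distribution, with integrand $p^2 e^{\beta E}/(e^{\beta E}-1)^2$, and that $E = cp$ throughout. With $x = \beta cp$ the radial integral reduces to $\int_0^\infty x^4e^x(e^x-1)^{-2}dx = 4\pi^4/15$, and the mass density $m\rho_n$ becomes $2\pi^2(kT)^4/45\hbar^3c^5$. The function names are hypothetical and the integration is a plain Simpson rule.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J/K


def phonon_integrand(x):
    """x^4 e^x / (e^x - 1)^2, written with e^{-x} to avoid overflow."""
    if x == 0.0:
        return 0.0
    decay = math.exp(-x)
    return x ** 4 * decay / (1.0 - decay) ** 2


def simpson(func, lower, upper, intervals):
    """Composite Simpson rule; `intervals` must be even."""
    width = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * func(lower + i * width)
    return total * width / 3.0


def phonon_normal_mass_density(temperature, sound_speed, radial_integral):
    """m * rho_n for E = c p, from the assumed form of (42):
    (beta/3) * 4 pi / (2 pi hbar)^3 * (kT/c)^5 * radial_integral."""
    kt = K_B * temperature
    prefactor = 4.0 * math.pi / (3.0 * kt * (2.0 * math.pi * HBAR) ** 3)
    return prefactor * (kt / sound_speed) ** 5 * radial_integral


numeric_integral = simpson(phonon_integrand, 0.0, 60.0, 6000)
exact_integral = 4.0 * math.pi ** 4 / 15.0

temperature, sound_speed = 0.5, 240.0   # kelvin, m/s (illustrative values)
numeric_density = phonon_normal_mass_density(temperature, sound_speed, numeric_integral)
closed_form = (2.0 * math.pi ** 2 * (K_B * temperature) ** 4
               / (45.0 * HBAR ** 3 * sound_speed ** 5))

print(numeric_integral, exact_integral)
print(numeric_density, closed_form)
```

The same quadrature could be reused with the roton part of the spectrum, where no such closed form is available.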

In a vessel with fixed walls in thermal equilibrium, if the liquid background is flowing, its velocity $\mathbf{v}_s$ must have no component normal to the wall. Further, the total current normal to the wall must vanish, so that the normal component of $\mathbf{v}_n$ must also vanish at the walls. But in equilibrium $\mathbf{v}_n$ is constant everywhere and must therefore vanish everywhere. We say the normal fluid is stationary in equilibrium with fixed walls, even though the superfluid moves with the velocity $\mathbf{v}_s$. Incidentally, the superfluid velocity is irrotational, $\nabla\times\mathbf{v}_s=0$.

If the walls move together at constant velocity, then equilibrium results if $\mathbf{v}_n$ is this velocity; the normal fluid moves at the same velocity as the walls.

We extend Eq. (40) to situations slightly out of thermal equilibrium by assuming $\mathbf{v}_n$ is not constant but varies from place to place. The failure of equilibrium will bring in various irreversible processes associated with the normal fluid, such as viscosity. If we leave these out of account, the remainder of the hydrodynamic equations which result can be derived in exactly the manner already given by Dingle.⁶ We need enter no further in this direction, as nothing new is gained. The resultant hydrodynamical equations can most easily be interpreted from the model of helium as consisting of two interpenetrating fluids.

Nevertheless, it is difficult to understand these partial fluids from a detailed kinematic point of view. Kinematically we have a general, or background fluid in which excitations move. The velocity of the superfluid, $\mathbf{v}_s$, is the general velocity of this background, but the density of superfluid is not $\rho_0$. The velocity of the normal fluid, $\mathbf{v}_n$, appears as a parameter in the distribution function. It can be shown to be the average group velocity of the excitations. But the difficulties arise if one tries to interpret the formula (42) for $\rho_n$ from a direct kinematical point of view. It is not the average value of any quantity that can reasonably be ascribed to an individual excitation. It appears to have meaning only for the entire group of excitations in, or near, thermal equilibrium.

The division into a normal fluid and superfluid, although yielding a simple model for understanding the final equations, appears artificial from a microscopic point of view. This opinion is shared by Landau and by Dingle.⁶

FIG. 4. At intermediate temperatures the main excitations are rotons which carry an intrinsic momentum, indicated by the arrows. If they drift relative to the background fluid, they tend to polarize upstream. This is illustrated for various values of $\mathbf{v}_s$, the background fluid velocity, and $\mathbf{v}_n$, the absolute drift velocity of the rotons. The total current is $\rho_0\mathbf{v}_s$ plus the polarization current of the rotons. Although mathematically correct, the separation of this current into the two parts, $\rho_s\mathbf{v}_s$ and $\rho_n\mathbf{v}_n$, characteristic of the two-fluid model seems somewhat artificial from the microscopic viewpoint.

It is interesting to look at what is happening on a microscopic scale for various conditions of the velocities $\mathbf{v}_s$, $\mathbf{v}_n$. Consider a temperature not too low so that the predominant excitations are rotons. If the fluid is at rest a roton created with exactly the minimum energy $\Delta$ has no group velocity, but it has a momentum, of magnitude $p_0$, pointing in some direction. We will call it the direction of polarization and represent our roton by an arrow in this direction in Fig. 4. Not all rotons have exactly this energy $\Delta$, but may differ by order $kT$ from it, and have therefore nonzero group velocity. (Incidentally, the group velocity is parallel to or opposite to the polarization.) The rotons therefore may move about in a random manner like the molecules of a gas. Like gas molecules they can also have an average drift velocity relative to the fluid. Now we will assume [as required by (39), (41)] that if the rotons are drifting relative to the fluid in a certain direction, they tend (as a result of collisions among themselves) in equilibrium to polarize themselves in the direction in which they drift (that is, opposite to the velocity of the liquid moving past them). The general drift velocity of the rotons in space we call the normal fluid velocity, $\mathbf{v}_n$. The motion of the fluid as a whole we call the superfluid velocity, $\mathbf{v}_s$. Let us consider some examples.

First, with the fluid at zero velocity ($v_s=0$), and no drift of the rotons ($v_n=0$), they remain unpolarized (Fig. 4(a)). If they are drifting to the right ($v_n>0$) they will tend to polarize in this direction, lining up to oppose the liquid passing them (Fig. 4(b)). Now if the liquid is in motion ($v_s=v$), and all the rotons drift with the liquid in the same direction ($v_n=v$), there is no relative motion and no tendency to polarize (Fig. 4(c)). This situation is not in equilibrium with stationary walls. Collisions of the rotons with the walls will stop their drifting motion and they will remain at rest relative to the walls. However, they will become polarized opposite to the fluid passing them (Fig. 4(d)).

This interpretation of the velocities of the partial fluids of the two-fluid model is fairly simple and direct. It is otherwise with the current. The natural way to discuss the current (or momentum) from the microscopic view is to split it into two parts. First the current produced by the flow of the moving fluid, $\rho_0\mathbf{v}_s$, and second the current produced by the polarization. It is then easy to see what the current is in each case. In the first two cases, 4a, 4b, we have no general fluid motion so that the current is all due to polarization, zero in case a, to the right in case b. If the fluid moves, but the rotons remain unpolarized, as in 4c, the current is $\rho_0v$, purely due to liquid motion. When the rotons polarize, as in 4d, to oppose this background motion the total current is reduced. But this natural separation is not the same as that utilized in the two-fluid view. It is difficult to identify, for example, what is called the current of the normal fluid. It is not current carried by the rotons as they drift from one place to another (which they do with the velocity $\mathbf{v}_n$) because the roton as such carries no mass but is only a disturbance in the liquid. Certainly the value of $\rho_n$ would be hard to obtain this way because the rotons contribute to the current mainly by their polarization, and not by their drift motion. (Of course in a case such as 4b the polarization is in the direction of the drift so we could say the drift acts as if it carries current, because it induces polarization. $\rho_n$ is then the ratio of the polarization current to the drift velocity which produces it.)

On the other hand, it is evident that the entropy flow is produced entirely by the drift motion (and not the polarization) of the rotons. Hence it is easy to see why all the entropy flows with the velocity $\mathbf{v}_n$.


THE CONSERVATION OF CURRENT 


In the last section we considered a packet of solutions (2) and found that we could picture an excitation in the form of a packet carrying a total current $\hbar\mathbf{k}/m$, and drifting at a group velocity $\partial E/\partial\mathbf{p}$, but as not appreciably altering the density. But such a picture is inconsistent with the conservation of matter. To take an extreme example, for a roton of the minimum energy $\Delta$ the group velocity $\partial E/\partial k$ is zero, but the current $\hbar k_0/m$ is large. If such a current is distributed over a finite region in such a way that the direction is everywhere the same, we evidently cannot conserve material.

On the other hand, it is well known that one can demonstrate the conservation of matter,

$$\partial\rho(\mathbf{a})/\partial t = -\nabla\cdot\mathbf{j}(\mathbf{a}), \tag{44}$$

from the Schrödinger equation. The reason that our wave function does not satisfy (44) is that it is not an exact solution of the wave equation. This shows an inaccuracy in our approximate wave function (17).

One way that suggests itself to resolve it, in the case of rotons with $k=k_0$, is to propose a superposition of two waves with opposite momenta $\mathbf{k}$ and $-\mathbf{k}$, like $g(\mathbf{r})\cos(\mathbf{k}\cdot\mathbf{r})$. In this case the current density is zero, and everything is all right. Furthermore, if the same small momentum $\mathbf{l}$ is given to each, so the momenta become $\mathbf{k}+\mathbf{l}$ and $-\mathbf{k}+\mathbf{l}$ with $k=k_0$, the drift velocity $\partial E/\partial\mathbf{k}$ is the same for each partial wave, so that the packet stays together.

On the other hand, with stronger collisions with walls and phonons perhaps the two momentum components would become separated. Further, $g(\mathbf{r})\sin(\mathbf{k}\cdot\mathbf{r})$ is just as good a solution, and it must have almost exactly the same energy even if interactions are taken into account, because the exact position of the nodes in a large packet cannot be important. Therefore, a linear combination must again be a possibility and we are led back to the exponential, and to the difficulty of current conservation. This lack of conservation is a symptom that all is not too well with our wave function. It is true that in the cosine case the symptom is hidden, but the conclusion should stand that the wave function could be improved.

The problem can be resolved by considering more complicated functions representing interaction of the excitation with the flow of fluid in its surroundings. One way the current could be conserved would be to have a general return flow of fluid in the region outside the packet. We therefore try the solution

$$\psi = \sum_i g(\mathbf{r}_i)\exp(i\mathbf{k}\cdot\mathbf{r}_i)\exp\Big[i\sum_j s(\mathbf{r}_j)\Big]\phi, \tag{45}$$

with the hope of finding an $s$ which produces a velocity distribution $\mathbf{v}_s=m^{-1}\nabla s$ which shows such a reverse flow. Let us first consider such a packet in otherwise stationary liquid. Then as a boundary condition $s$ should go to zero as we go far from the packet. Substitution into the variation integral gives (32). For the current and density we use our approximations, that $\mathbf{j}(\mathbf{a})$ is given by (36), and $\rho(\mathbf{a})=\rho_0$. There results

$$\mathcal{E}/\mathcal{I} = E + m\int \mathbf{j}(\mathbf{r})\cdot\mathbf{v}_s\,d^3r + \tfrac{1}{2}\rho_0m\int \mathbf{v}_s\cdot\mathbf{v}_s\,d^3r, \tag{46}$$

where we have put, for the packet energy, $E(k)$, which is nearly correct. Variation of $s$ to find a minimum gives the equation

$$\nabla\cdot(\mathbf{j} + \rho_0\mathbf{v}_s) = 0. \tag{47}$$

This equation determines $s$ if we impose the boundary conditions $s\to0$ far from the packet. Call this solution $s_0$ and the velocity distribution $\mathbf{v}_0=\nabla s_0/m$. It is like the field produced by the charge density $\nabla\cdot\mathbf{j}$, that is, at large distances the field of a dipole. It represents the back flow expected. Furthermore the total current operator has for our function the value

$$\mathbf{J} = \mathbf{j} + \rho_0\mathbf{v}_0, \tag{48}$$

so that (47) says that now the total current is conserved. There is a small shift in energy. Substitution of (47) into (46) gives the extra energy (reduction)

$$-\tfrac{1}{2}\rho_0m\int \mathbf{v}_0(\mathbf{r})\cdot\mathbf{v}_0(\mathbf{r})\,d^3r.$$

If the order of the dimensions of the packet is $L$, the current $\hbar k/m$ is distributed over a volume $L^3$, so the velocities are of order $\hbar k/mL^3$ and the kinetic energy $(\rho_0\hbar^2k^2/mL^6)L^3$ varies as $1/L^3$. But to confine the packet to such a dimension wave numbers of order $1/L$ in $g(\mathbf{r})$ must be used, so we find from (18) (for the case $k=k_0$) excess energies of order $1/L^2$ needed to confine the packet. Thus, for large packets, spreading the packet over even larger dimensions will decrease the energy, in spite of the energy of the currents we have just calculated. For extremely small packets our analysis does not hold because of the approximations made.
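The comparison of the two energies can be written out in one line. This is a sketch of the scaling only: numerical factors are dropped, the confinement energy is taken from the roton form $\hbar^2(k-k_0)^2/2\mu$ with $|k-k_0|\sim 1/L$, and the labels $E_{\text{flow}}$ and $E_{\text{conf}}$ are introduced here for convenience.

```latex
|E_{\text{flow}}| \sim \frac{\rho_0\hbar^2k^2}{mL^6}\,L^3 \propto \frac{1}{L^3},
\qquad
E_{\text{conf}} \sim \frac{\hbar^2}{2\mu L^2} \propto \frac{1}{L^2},
\qquad
\frac{|E_{\text{flow}}|}{E_{\text{conf}}} \propto \frac{1}{L} \to 0
\quad (L \to \infty).
```

For large packets the confinement cost therefore dominates the back-flow energy, which is why spreading the packet still lowers the total.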

Here we have just gone far enough to save the theorem of conservation of current. We have only dealt with the background current in a semiclassical way. More complex states consisting of superpositions of expressions like (45) should be considered if a correct calculation of the quantum-mechanical “self-energy” of a roton due to coupling with the general velocity field is to be carried out. Since the “self-energy” is negative, the corrected value of $\Delta$ will be nearer the experimental result. This problem is being studied.

A more correct picture of a packet excitation, then, 
is that of a kind of region of polarization (that is, $\mathbf{j}$) 
which induces a distribution of velocity field around it, 
$\rho_0\nabla s_0$. The field is analogous to that produced by electrical 
polarization, the electric field $\mathbf{E}$ corresponding to 
the velocity field $\nabla s/m$, and the electric displacement 
vector $\mathbf{D}$ being analogous to the total current density, 
since its divergence vanishes. We can use this analogy 
to determine the behavior of the system directly, as it 
is easily verified that all the equations correspond. 
There is, however, one important difference. The signs 
of interaction are reversed. Thus (39) shows that rotons 
tend to line up opposed to the external field v, while 
electric dipoles line up with the field. The analogy must 
therefore be completed with the remark that the 
rotons are dipoles of a gravitational type; that is, like 
poles attract, unlike repel. 

The energy of a dipole in an external field is still the 
moment times the external field, even though the dipole 
itself creates some field of its own. None of the conclusions 
of the previous section are changed, therefore. 


ROTON INTERACTION VIA THE VELOCITY FIELD 


The fact that one roton creates a velocity field in 
which another may interact produces a kind of inter- 
action between rotons. This is possibly one of the major 
sources of interaction, especially for not too high roton 
density. It is interesting to try to see what effect it 
has. Suppose we consider rotons as small packets all 
separated from one another and acting as dipoles of 
strength p/m. Let us suppose there is an average 
polarization $\mathbf{P}$ per unit volume, and an average background 
velocity $\mathbf{v}_s$. The mean current density is then 


$$\mathbf{J}=\rho_0\mathbf{v}_s+\mathbf{P}. \tag{49}$$


The actual field at any point is not $\mathbf{v}_s$ because of the 
local variations produced by the individual dipoles. 
Call $\mathbf{w}$ the velocity that an average dipole feels both 
from the average effect $\mathbf{v}_s$ and from its neighbors. The 
latter contribution is proportional to the polarization. 
In fact, as Lorentz showed for dipoles in random positions, 
it is $\tfrac{1}{3}\mathbf{P}/\rho_0$, hence 


$$\mathbf{w}=\mathbf{v}_s+\alpha\mathbf{P}/\rho_0, \tag{50}$$


where¹¹ $\alpha=\tfrac{1}{3}$. The case $\alpha=0$ is the case previously studied 
which neglects direct effects between the dipoles. 

The energy of a roton in this field is $E(p)+\mathbf{p}\cdot\mathbf{w}$. The 
statistical mechanics will then be governed by the 
function, 


$$f=kT\int\ln\{1-\exp[-\beta(E(p)+\mathbf{p}\cdot\mathbf{w}-\mathbf{p}\cdot\mathbf{u})]\}\,d^3p/(2\pi)^3. \tag{51}$$


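Here $\mathbf{u}$ is zero for equilibrium with fixed walls, and is 
the normal fluid velocity, $\mathbf{u}=\mathbf{v}_n$. The average polarization 
then is given by 


$$\mathbf{P}=m^{-1}\,\partial f/\partial\mathbf{w}. \tag{52}$$

The internal energy of the system is 

$$U=\tfrac{1}{2}m\rho_0v_s^2+m\mathbf{P}\cdot\mathbf{w}-(\alpha m/2\rho_0)P^2+\langle E(p)\rangle, \tag{53}$$


where the average value of $E(p)$ is 


$$\langle E(p)\rangle=\int E(p)\{\exp[\beta(E(p)+\mathbf{p}\cdot(\mathbf{w}-\mathbf{u}))]-1\}^{-1}\,d^3p/(2\pi)^3=f+TS-(\mathbf{w}-\mathbf{u})\cdot m\mathbf{P}.$$


The entropy is $S=-\partial f/\partial T$. 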

11 Onsager has shown that if one deals with permanent dipoles, 
mutually impenetrable and roughly spherical, this value of $\alpha$ is in 
error. If his analysis applies to our case, the value $\alpha=\rho_0(\rho_0+2\rho_s)^{-1}$ 
results [L. Onsager, J. Am. Chem. Soc. 58, 1486 (1936)]. 

Expanding up to second order in the velocities one 
finds that the current can still be written as $\rho_s\mathbf{v}_s+\rho_n\mathbf{u}$, 
and the excess internal energy (at constant entropy and 
total current) as $\tfrac{1}{2}\rho_sv_s^2+\tfrac{1}{2}\rho_nu^2$, provided that one writes 
$\rho_s=\rho_0-\rho_n$ and 

$$\rho_n=\rho_n^0(1+\alpha\rho_n^0/\rho_0)^{-1}, \tag{54}$$

where $\rho_n^0$ is the old expression (42) valid for the case 
$\alpha=0$. Therefore the expression for the velocity of 
second sound is unaltered when expressed in terms of $\rho_n$, etc. Only 
the theoretical formula for $\rho_n$ is slightly modified. However, 
the modification is appreciable only when $\rho_n^0/\rho_0$ 
is not small, that is, near the transition. At the transition 
where $c_2$ goes to zero, $\rho_n$ must equal $\rho_0$ so that $\rho_n^0$ 
given in (42) must equal $\rho_0/(1-\alpha)$, or $1.5\rho_0$. Actually $\rho_n^0$ 
varies very rapidly in this region so this makes no 
appreciable change in evaluating $\Delta$ and $p_0$ from the 
data. Furthermore in this region there may be other 
interactions which should alter our statistical mechanical 
analysis anyway. (Actually we cannot even 
be sure that rotons act as small individual dipoles until 
we have improved the wave function to include the 
interaction with the velocity field, as a quantum field.) 
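
The local-field correction can be checked numerically with a minimal sketch. It assumes that (54) reads $\rho_n=\rho_n^0(1+\alpha\rho_n^0/\rho_0)^{-1}$, which is the reading consistent with the statement that $\rho_n^0$ must reach $\rho_0/(1-\alpha)$ at the transition; densities are measured in units of $\rho_0$ and the function names are hypothetical.

```python
# Sketch under the stated reading of (54); densities in units of rho0.

def rho_n(rho_n0, alpha, rho0=1.0):
    """Normal density with local-field parameter alpha, given the alpha=0 value rho_n0."""
    return rho_n0 / (1.0 + alpha * rho_n0 / rho0)


def rho_n0_at_transition(alpha, rho0=1.0):
    """Value of rho_n0 at which rho_n reaches rho0 (requires alpha < 1)."""
    return rho0 / (1.0 - alpha)


if __name__ == "__main__":
    lorentz_alpha = 1.0 / 3.0
    print(rho_n0_at_transition(lorentz_alpha))          # about 1.5
    for x in (0.05, 0.3, 1.0, 1.5):
        print(x, rho_n(x, lorentz_alpha))
```

For small $\rho_n^0/\rho_0$ the corrected and uncorrected values nearly coincide, which is why the modification matters only near the transition.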

With interacting dipoles we would expect the ana- 
logue of a transition corresponding to the Curie point 
for electric dipoles. The analog of the condition for the 
Curie point comes out to be exactly the criterion that 
the expression (54) for $\rho_n$ (with $\rho_n^0$ substituted from 
(42)) becomes equal to $\rho_0$. There are a few surprises 
here, though. Firstly, ordinarily the Curie transition 
occurs as we lower the temperature, but here it appears 
on raising the temperature. That is because the Curie 
point depends markedly on the density. Dipoles 
polarize if they are cold and dense. In our case at low 
temperatures they are cold, but not dense enough. 
As the temperature rises, the density does also, very 
rapidly, until a point is reached where spontaneous 
polarization appears even though the temperature has 
been raised. Another surprise is the fact that there is 
a transition even if the local field effect is neglected 
(a=0). This difference is a result of the change of sign 
of the forces. Our dipoles polarize most easily if arranged 
in a flat region, while for electrical dipoles a needle-like 
region is preferred. With all the dipoles polarized 
parallel in the sheet the outside field is zero if the 
internal field opposed the polarization. But this opposition 
of polarization and field is just the stable condition 
for the rotons. In fact, the mutual local field $\alpha$ 
tends to depolarize them and raises the transition 
temperature. (For Onsager’s value of $\alpha$, (54) shows no 
transition for any temperature.) 

One might be tempted to speculate that we could 
carry the statistical mechanical analysis right up to the 
transition point and beyond, by simply assuming that 
the only interaction of importance among rotons is the 
coupling with the general velocity field. One would just 
hope that other interactions or limitations to the 
number of degrees of freedom are not as important as 
one would otherwise guess. Aside from the amusing 
twist that helium I would then be the polarized, 
organized state, serious difficulties arise. One can 
analyze these things from the statistical formulas, if 
the velocities are not considered small and are not 
expanded. One can, without loss of generality, take 
states of total current zero, and for simplicity take a=0. 
What happens is this. For any temperature below the 
transition there are two equilibrium states possible, one 
unpolarized with $v_s=0$, and the other polarized at 
finite $v_s$. Since the latter has higher free energy it is an 
unstable equilibrium, but the $v_s=0$ state is stable (actually 
only metastable¹²). As we approach the transition point 
the polarization of the unstable state approaches zero. 
Above the transition point (more correctly, the point 
when $\rho_n=\rho_0$) only the $v_s=0$ state is in equilibrium and 
that is unstable. The instability arises this way. There 
is a high density of rotons. If a little polarization 
develops, their energy is reduced. This increases the 
number of rotons in equilibrium at a fixed temperature 
as well as the polarization, so that if there is no limit 
to the number of rotons there is no stable state. 

On the other hand, if a limitation of roton number 
such as (28) is imposed, stable polarized states exist at 
the higher temperature. But the transition to that state 
occurs as a first-order transition, and there is another 
transition at still higher temperatures when the polar- 
ization disappears again. 

It is therefore evident that we do not correctly 
describe the region very close to the transition by the 
usual energy expression (53) (with or without a=0). 
The interaction between rotons is playing a more 
complicated role than (53) can describe. 

In a previous paper an expression was given for the 
partition function which was presumably reasonably 
satisfactory right across the transition. However, the 
analysis was too difficult to carry out. Now that a more 
detailed picture of the behavior below the transition is 
available it may be easier to see how that expression 
can be treated. We still lack a clear picture of what 
happens in the few tenths of a degree on either side of 
the $\lambda$ point. 


INTERACTIONS BETWEEN EXCITATIONS 


Interactions between the excitations will lead to 
various irreversible processes, such as viscosity, attenua- 
tion of second sound, etc. These questions have been 
studied by Landau and Khalatnikov.¹³ The most 
important factor in the various mean free paths which 
are involved is the change in density of the phonons and 
rotons with temperature. The absolute cross sections 
depend on the details of the interactions. Interactions 


12 That is, the conventional free-energy expressions for arbitrary 
$v_s$ are, strictly speaking, not self-consistent. For any temperature 
there is always some value of $v_s$ for which the free energy is less 
than its value for $v_s=0$. 

13 L. Landau and I. Khalatnikov, J. Exptl. Theoret. Phys. 
(U.S.S.R.) 19, 637, 709 (1949). 

between phonons can be thought of as arising from a 
nonlinear equation of state. An interaction between a 
phonon and a roton would result if rotons have a 
different energy for different pressures. According to 
the theory presented here their energy is $\hbar^2k^2/2mS(k)$. 
If the liquid is compressed $k^2$ increases. On the other 
hand $S(k)$ probably increases even more rapidly from 
the increase in local order produced by squeezing the 
nearly impenetrable atoms into a smaller space. Therefore 
we expect $\Delta$ to decrease with pressure. This provides 
a mechanism for roton-phonon interaction. It also 
has other effects. The presence of a roton would cause 
a small increase in density in its neighborhood with 
the effect falling off inversely as the distance from the 
roton. This provides a mechanism of long-range inter- 
action between rotons in addition to that due to 
coupling with the general velocity field. The roton- 
roton interaction at short distances is a more difficult 
problem, which probably cannot be adequately solved 
until a more accurate wave function is available for 
the roton state. 

If the roton energy $\Delta$ decreases when the liquid 
density increases then we would expect that in equilibrium 
the liquid would shrink if the number of rotons is 
increased. The volume decrease as the $\lambda$ point is approached 
is probably a consequence of this effect. The 
fall of the $\lambda$ temperature with rising pressure is thermodynamically 
related. It may also be seen directly from 
(42) (supposing the $\lambda$ point to be $\rho_n=\rho_0$) considering 
that $\Delta$ decreases as the density rises. 


SUPERCONDUCTIVITY 


It has been suggested that superconductivity is 
analogous to superfluidity. What can we learn of the 
former from our study of the latter? It is interesting 
that if the He atoms were charged (and their net charge 
canceled by a uniform fixed background charge of 
opposite sign) the liquid would imitate many of the 
features of a superconductor. In a magnetic field at 
absolute zero the London equation, j=eA/mc, would 
hold. This is because stirring of atoms is equivalent to 
interchange so that in the lowest state the wave func- 
tion cannot vary if atoms are stirred, and the part of the 
current depending on the wave function gradient 
vanishes. Other states would take a finite energy to 
create, there would be states of permanent circulation 
in multiply connected rings, there would be a second- 
order transition, etc. How this close analogy is to be 
interpreted is not clear. The things which, it has been 
argued here, apply physically to helium cannot be 
justifiably taken over to the case of Fermi particles, 
or such particles in interaction with lattice waves, 
without a complete investigation of their validity in the 
new environment. For example, at present there is 
not, in the author’s opinion, justification for assuming 


14 F. London, Superfluids (John Wiley and Sons, Inc., New 
York, 1950), Vol. I. 

that the form (2) is a reasonable wave function for an 
assembly of Fermi particles, with $\phi$ the ground-state 
function for such particles. In fact, there are definite 
arguments against it. Possibly the close analogy should 
only be used to tell us what the problem of supercon- 
ductivity is. It is, from this point of view, the problem 
of showing that, in the metal, aside from phonons there 
are no (or only very few) states of very low energy just 
above the ground state. 


DISCUSSION 


We still have left unsolved at least three basic ques- 
tions. One is to find a clear description of the neighbor- 
hood of the transition. A second is to obtain a more 
perfect roton wave function. The third is to describe 
states for which the superfluid velocity is not vortex-free. 
So far we have $\nabla\times\mathbf{v}_s=0$. At high velocities when 
more energy is available, more complicated motions 
might be excited. The evidence of high resistance to 
capillary flow at the higher velocities indicates this. A 
new element must presumably be added to our picture. 
We hope to publish some views on this third problem 
at a later time. 

We have limited ourselves to a qualitative analysis 
of the more curious features of the behavior of liquid 
helium. The problem of obtaining $S(k)$ or the correlation 
function for the ground state quantitatively from 
first principles is beyond the scope of this work. 

It has been argued¹ that He³ atoms in low concentration 
in He⁴ would act as a gas of free particles, but 
with an effective mass $m'$ higher than that of one atom. 
This mass $m'$ is calculated in the appendix, where it is 
found to be about six atomic mass units. 

The author has profited from conversations with 
R. F. Christy and with Michael Cohen. 


APPENDIX 


According to II an impurity atom of He³ (at infinite 
dilution) should behave as an essentially free particle 
except that its effective mass $m'$ should exceed the 
true mass of He³ due to the inertia of the He⁴ atoms 
which must make way for it as it moves. We shall 
calculate this excess mass here. First we suppose the 
impurity atom had the same mass $m$ as the other He⁴ 
atoms (i.e., we neglect the difference in mass of He³ 
and He⁴). 

The wave function for such an atom (coordinates $\mathbf{r}_A$) 
moving with momentum $\hbar\mathbf{k}$ might be conjectured to be 
$\exp(i\mathbf{k}\cdot\mathbf{r}_A)\phi$, where $\phi$ is the ground state of the system 
(which is the same as if all the atoms were identical). 
This, as a trial function, gives the variational energy as 
$\hbar^2k^2/2m$ and shows no mass correction. It does not 
represent with sufficient accuracy the other atoms 
moving back when atom $A$ moves forward. This 
suggests the trial function 


$$\psi=\exp(i\mathbf{k}\cdot\mathbf{r}_A)\exp\Big[i\sum_i s(\mathbf{r}_i-\mathbf{r}_A)\Big]\phi, \tag{1-a}$$

where the velocity field $\nabla s(\mathbf{r}_i-\mathbf{r}_A)$ represents a backflow 
which depends only on the distance from the impurity. 
To omit the term $i=A$ in the sum we take $s(0)=0$. 
Substitution into the variational principle gives 


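$$\mathcal{E}=\frac{\hbar^2}{2m}\int\Big[k^2-2\mathbf{k}\cdot\sum_i\nabla s(\mathbf{r}_i-\mathbf{r}_A)+\sum_i\sum_j\nabla s(\mathbf{r}_i-\mathbf{r}_A)\cdot\nabla s(\mathbf{r}_j-\mathbf{r}_A)+\sum_i\nabla s(\mathbf{r}_i-\mathbf{r}_A)\cdot\nabla s(\mathbf{r}_i-\mathbf{r}_A)\Big]\phi^2\,d^N\mathbf{r}\Big/\mathcal{J}, \tag{2-a}$$


with $\mathcal{J}=1$, as $\psi$ is normalized. We write $\rho_2(\mathbf{r}_i,\mathbf{r}_A)=\rho_0p(\mathbf{r}_i-\mathbf{r}_A)$, 
and $\rho_3(\mathbf{r}_i,\mathbf{r}_j,\mathbf{r}_A)=\rho_0p_3(\mathbf{r}_i-\mathbf{r}_A,\mathbf{r}_j-\mathbf{r}_A)$, and 
measure all distances from the point $\mathbf{r}_A$, so that (2-a) 
becomes 


$$2m\hbar^{-2}\mathcal{E}=k^2-2\mathbf{k}\cdot\int\nabla s(\mathbf{r})p(\mathbf{r})\,d^3r+\iint\nabla s(\mathbf{r})\cdot\nabla s(\mathbf{r}')p_3(\mathbf{r},\mathbf{r}')\,d^3r\,d^3r'+\int\nabla s(\mathbf{r})\cdot\nabla s(\mathbf{r})p(\mathbf{r})\,d^3r. \tag{3-a}$$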

Then $s$ is to be chosen to minimize this expression. 
The function $p_3(\mathbf{r},\mathbf{r}')$ is the probability of finding one 
atom at $\mathbf{r}$ and another at $\mathbf{r}'$ if there is an atom at the 
origin ($p(\mathbf{r})$ is just the probability of finding one at 
$\mathbf{r}$ if one is at the origin). We do not know what this 
function $p_3$ is, but in this problem we can approximate 
it by $p(\mathbf{r})p(\mathbf{r}')$ (except at the origin). A given atom is 
surrounded by many (eight or ten?) nearest neighbors, 
and the distribution along one radius $\mathbf{r}$ and another $\mathbf{r}'$ 
must be nearly independent except for the relatively 
small solid angle where the atoms at $\mathbf{r}$ and $\mathbf{r}'$ are close 
together. Even here, if the functions $s(\mathbf{r})$, $s(\mathbf{r}')$ are 
smooth enough the average of $p_3$ over such angles may 
still give nearly the same result as $p(\mathbf{r})p(\mathbf{r}')$. With this 
substitution our problem is that of minimizing 


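$$2m\hbar^{-2}\mathcal{E}=\Big|\mathbf{k}-\int\nabla s(\mathbf{r})p(\mathbf{r})\,d^3r\Big|^2+\int\nabla s(\mathbf{r})\cdot\nabla s(\mathbf{r})p(\mathbf{r})\,d^3r. \tag{4-a}$$


The error made by this approximation is 


$$\Delta\mathcal{E}=(\hbar^2/2m)\iint\nabla s(\mathbf{r})\cdot\nabla s(\mathbf{r}')[p_3(\mathbf{r},\mathbf{r}')-p(\mathbf{r})p(\mathbf{r}')]\,d^3r\,d^3r'. \tag{5-a}$$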


The $s(\mathbf{r})$ which minimizes (4-a) behaves as a dipole 
field (as $\mathbf{k}\cdot\mathbf{r}/r^3$) at large distances and this produces 
some convergence difficulties in the first integral in 
(4-a). They may be easily straightened out as follows. 
The minimum energy is only very slightly altered if the 
function $s(\mathbf{r})$ is altered at very large distances in such 
a way that it falls off eventually more rapidly than 
$1/r^2$. For such a function the integral may be done by 
parts, the integrated part vanishing, so we may write 


$$2m\hbar^{-2}\mathcal{E}=\Big(\mathbf{k}+\int s(\mathbf{r})\nabla p(\mathbf{r})\,d^3r\Big)^2+\int\nabla s(\mathbf{r})\cdot\nabla s(\mathbf{r})p(\mathbf{r})\,d^3r. \tag{6-a}$$


But in this form all the integrals have a definite limit 
even if $s(\mathbf{r})$ has no convergence factor (and therefore 
varies as $1/r^2$). We may therefore use (6-a) and avoid 
ambiguities from conditionally convergent integrals. 

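The variational principle shows that $s$ must be a 
solution of 


$$\nabla\cdot[p(\mathbf{r})((1-\beta)\mathbf{k}-\nabla s(\mathbf{r}))]=0, \tag{7-a}$$


where we have set 


$$-\int s(\mathbf{r})\nabla p(\mathbf{r})\,d^3r=\beta\mathbf{k}. \tag{8-a}$$


Multiplication of (7-a) by $s(\mathbf{r})$ and integration, using 
(8-a), tells us further that 


$$\int\nabla s(\mathbf{r})\cdot\nabla s(\mathbf{r})p(\mathbf{r})\,d^3r=\beta(1-\beta)k^2,$$


so (6-a) says 


$$\mathcal{E}=\frac{\hbar^2k^2}{2m}[(1-\beta)^2+\beta(1-\beta)]=\frac{\hbar^2k^2}{2m}(1-\beta),$$


and the effective mass is $m/(1-\beta)$, an increase over $m$ of 

$$\Delta m=\beta m/(1-\beta). \tag{9-a}$$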


If the direction of $\mathbf{k}$ is taken as the $z$ axis, the solution 
of (7-a) may be written in the form, 


$$s(\mathbf{r})=(1-\beta)k(z-zv(r)/r), \tag{10-a}$$


where $v(r)$ is a function of radius $r=(\mathbf{r}\cdot\mathbf{r})^{1/2}$ only, 
satisfying 

$$\frac{d}{dr}\Big(r^2p(r)\frac{dv}{dr}\Big)=2p(r)v \tag{11-a}$$

and such that $v$ approaches $r$ at large distances. This 
is easily solved numerically. We used the values of 
$p(r)$ determined by Reekie and Hutchison.⁴ Starting 
for small $r$ where $p(r)=0$, so that $dv/dr=0$, we chose 
$v$ at some convenient value and integrated out to radii 
so large that $p(r)$ was effectively its asymptotic value 
$\rho_0$. Asymptotically $v$ has the form $c(r+B/r^2)$ where 
$B$, $c$ are constants. Since the equation is homogeneous 
we may divide the entire solution by $c$ to obtain one 
with the correct asymptotic form. By substitution of 
(10-a) into (8-a) one can show, using (11-a), that the 
expression for (9-a) can be written 

$$\Delta m/m=4\pi\rho_0B-1, \tag{12-a}$$

where we have used the fact that 

$$\int(\rho_0-p(r))4\pi r^2\,dr=1 \tag{13-a}$$

[the origin, where $p(r)$ has a $\delta$ function, being excluded 
in this integral]. Actually it is difficult to obtain 
accuracy with this method because the asymptotic 
form of $v$ is sensitive to the values of $p(r)$ used. Those 
of Reekie and Hutchison⁴ extended only up to $r=6$ Å 
and had to be taken from a graph, so that (13-a) was 
not accurately satisfied without some small arbitrary 
readjustments of the values. 
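
The numerical procedure just described can be illustrated on a toy distribution. The sketch below does not use the Reekie and Hutchison data. It assumes a hard-sphere step, $p(r)=0$ for $r<a$ and $p(r)=\rho_0$ for $r>a$, with $\rho_0$ fixed by (13-a) so that the excluded sphere holds exactly one atom. It also assumes that (11-a) reads $\frac{d}{dr}(r^2p\,dv/dr)=2pv$, which is what follows from inserting (10-a) into (7-a). For this toy case the exact answer is $B=a^3/2$, that is, $\Delta m/m=1/2$.

```python
import math


def integrate_v(a=1.0, r_max=10.0, steps=20000):
    """RK4 integration of d/dr(r^2 p v') = 2 p v for r > a, where p = rho0 (rho0 cancels).

    Variables: v and u = r^2 v'. At r = a the flux r^2 p v' must vanish because
    p = 0 inside the sphere, so u(a) = 0; v(a) is set to a convenient value.
    """
    h = (r_max - a) / steps
    r, v, u = a, 1.0, 0.0

    def deriv(r, v, u):
        return u / (r * r), 2.0 * v

    for _ in range(steps):
        k1v, k1u = deriv(r, v, u)
        k2v, k2u = deriv(r + h / 2, v + h / 2 * k1v, u + h / 2 * k1u)
        k3v, k3u = deriv(r + h / 2, v + h / 2 * k2v, u + h / 2 * k2u)
        k4v, k4u = deriv(r + h, v + h * k3v, u + h * k3u)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        r += h
    return r, v, u / (r * r)


def asymptotic_B(r, v, dv):
    """Extract B from v = c (r + B / r^2) and dv = c (1 - 2 B / r^3); c drops out."""
    return (v - r * dv) / (2.0 * v / r**3 + dv / r**2)


def hard_sphere_added_mass(a=1.0):
    """Return (B, delta_m / m) for the toy step distribution."""
    rho0 = 3.0 / (4.0 * math.pi * a**3)      # makes (13-a) hold for the step p(r)
    B = asymptotic_B(*integrate_v(a=a, r_max=10.0 * a))
    return B, 4.0 * math.pi * rho0 * B - 1.0  # (12-a)


if __name__ == "__main__":
    print(hard_sphere_added_mass())
```

The value $1/2$ is the classical added mass of a sphere moving through an ideal fluid, which is a useful consistency check on (11-a) and (12-a) before real data for $p(r)$ are substituted.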

On the other hand, the solution showed that $s(\mathbf{r})$ was 
nearly proportional to $z/r^3$. Since (6-a) is a variational 
principle we can therefore obtain a good value of the 
energy much more simply. We substitute the trial 
function, 

$$s(\mathbf{r})=Az/r^3, \tag{14-a}$$


directly into (6-a) and determine the parameter $A$ to 
minimize $\mathcal{E}$. This gives 


$$\Delta m/m=\tfrac{1}{2}\Big(\frac{4\pi\rho_0}{3}\Big)^2\Big[\int r^{-6}p(r)4\pi r^2\,dr\Big]^{-1}, \tag{15-a}$$


or, with the data of reference 4, $\Delta m=0.70m$, or 2.8 
atomic mass units. (The numerical solution of the 
differential equation gave the same result within its 
accuracy of about 10 percent.) 
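
A short sketch of the variational estimate follows. It assumes the result of minimizing (6-a) with the trial function $s=Az/r^3$ takes the form $\Delta m/m=\tfrac{1}{2}(4\pi\rho_0/3)^2[\int r^{-6}p(r)4\pi r^2dr]^{-1}$, and it evaluates that expression for a hypothetical hard-sphere step $p(r)$ rather than for the measured distribution of reference 4. For the step, normalized so that the excluded sphere holds one atom, the integral can be done by hand and gives $\Delta m/m=1/2$.

```python
import math


def delta_m_over_m(p, rho0, r_min, r_max, n=100000):
    """Evaluate 0.5 * (4 pi rho0 / 3)^2 / integral of r^-6 p(r) 4 pi r^2 dr.

    The integral is done with the midpoint rule on [r_min, r_max]; beyond r_max
    the distribution is assumed to equal rho0 and the tail is added analytically.
    """
    h = (r_max - r_min) / n
    total = 0.0
    for i in range(n):
        r = r_min + (i + 0.5) * h
        total += p(r) * 4.0 * math.pi * r**-4 * h
    total += 4.0 * math.pi * rho0 / (3.0 * r_max**3)   # tail with p = rho0
    return 0.5 * (4.0 * math.pi * rho0 / 3.0) ** 2 / total


def hard_sphere_estimate(a=1.0):
    """Toy check: step distribution with one atom excluded from the sphere r < a."""
    rho0 = 3.0 / (4.0 * math.pi * a**3)

    def p(r):
        return rho0 if r >= a else 0.0

    return delta_m_over_m(p, rho0, r_min=a, r_max=50.0 * a)


if __name__ == "__main__":
    print(hard_sphere_estimate())
```

Because the integrand is weighted by $r^{-4}$, the result is most sensitive to $p(r)$ at small radii, so a tabulated $p(r)$ would need to be accurate near the first rise of the distribution.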

It is difficult to evaluate the small correction arising 
from the term $\Delta\mathcal{E}$ of (5-a), for $p_3$ is unknown. If the 
atoms locally are nearly on a lattice, say face-centered, 
or body-centered, of cubic symmetry, $\Delta\mathcal{E}$ vanishes 
with the trial function (14-a). 

If the mass of the impurity atom is not four atomic 
units it is readily shown that $\Delta m$ is unchanged, provided 
that the distribution $p(r)$ of atoms around the 
impurity is assumed to be unchanged. This is also 
expected physically for the extra mass is due to the 
motion of the He atoms in the environment of the 
impurity. Therefore for a He³ atom the effective mass 
should be 5.8 atomic mass units. The higher zero-point 
motion of a lighter atom changes $p(r)$ by pushing the 
neighbors farther away, thereby raising $\Delta m$ a little 
(but this effect must be a fraction of a mass unit because 
the effective mass is almost certainly less than the 6.8 
which it would be if the He³ had the larger mass of 
4 units). 
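
The arithmetic behind these figures is simple enough to spell out. The sketch below takes $\Delta m=0.70m$ with $m$ the He⁴ mass of 4 atomic mass units, as quoted above, and adds it to the bare impurity mass. It also inverts (9-a) to recover $\beta$; the function names are hypothetical.

```python
HE4_MASS = 4.0   # atomic mass units, as used in the text
HE3_MASS = 3.0


def effective_mass(bare_mass, delta_m_ratio=0.70, host_mass=HE4_MASS):
    """Bare impurity mass plus the backflow mass delta_m = delta_m_ratio * host_mass."""
    return bare_mass + delta_m_ratio * host_mass


def beta_from_ratio(delta_m_ratio):
    """Invert (9-a), delta_m / m = beta / (1 - beta), for beta."""
    return delta_m_ratio / (1.0 + delta_m_ratio)


if __name__ == "__main__":
    print(effective_mass(HE3_MASS))   # He3 impurity
    print(effective_mass(HE4_MASS))   # impurity with the He4 mass
    print(beta_from_ratio(0.70))
```

The backflow contribution is the same 2.8 units in both cases because it depends only on the surrounding He⁴ atoms, under the stated assumption that $p(r)$ is unchanged.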

This result is not in good agreement with the determinations 
which have been made from experiment, as 
summarized by Daunt.¹⁸ These give values nearer 8 or 
9 atomic mass units. 


18 J. G. Daunt, Advances in Physics (Phil. Mag. Supplement) 
6, 209 (1952), particularly p. 258. 


1. Introduction 


Liquid helium exhibits quantum mechanical properties on a large 
scale in a manner somewhat different from that of other substances. No 
other substance remains liquid to a temperature low enough to exhibit 
the effects. These effects have long been a puzzle. It is supposed 
that they can all be ultimately understood in terms of the properties 
of Schrödinger’s equation. We cannot expect a rigorous exposition 
of how these properties arise. That could only come from complete 
solutions of the Schrödinger equation for the $10^{23}$ atoms in a sample 
of liquid. For helium, as for any other substance today, we must be 
satisfied with some approximate understanding of how, in principle, 
that equation could lead to solutions which indicate behavior similar 
to that observed. 

Since the discovery of liquid helium considerable progress has been 
made in understanding its behavior from first principles. Some of the 
properties are more easily understood than others. The most difficult 
of these concern the resistance to flow above critical velocity. If we 
permit some conjectures of Onsager¹, however, perhaps a start has 
been made in understanding even these. The aim of this article is to 
describe those physical ideas which have been suggested to explain 
the behavior of helium which can most easily be related to properties 
of the Schrödinger equation. 


We shall omit references to the phenomena involved in the Rollin 
film. It appears that the film can be understood as being maintained 
by van der Waals attraction to the wall. The flow properties of the 
film are interpreted as a special case of flow properties of helium in 
leaks in general. 

The article falls naturally into two main sections. First, there are 
phenomena in which the superfluid velocity is irrotational. Here we 
can give a fairly complete picture. The second part concerns the case 
in which vorticity of the superfluid exists. Our position here is less 
satisfactory and more uncertain. It is described here in considerable 
detail because of the interesting problems it presents. 


2. Summary of the Theoretical Viewpoint 


The first striking way that helium differs from other substances is 
that it is liquid even down to absolute zero. Classically at absolute 
zero all motion stops, but quantum mechanically this is not so. In 
fact the most mobile substance known is one at absolute zero, where 
on the older concepts we should expect hard crystals. Helium stays 
liquid, as London² has shown, because the inter-atomic forces are 
very weak and the quantum zero point motion is large enough, since 
the atomic mass is small, to keep it fluid even at absolute zero. In 
the other inert gases the mass is so much higher that the zero-point 
motion is insufficient to oppose the crystallizing effect of the attractive 
forces. In hydrogen the intermolecular forces are very much stronger, 
so it, too, is solid. In liquid ⁴He there is a further transition at 2.2°K, 
the $\lambda$-transition, between two liquid states of different properties. A 
transition is expected (at 3.2°K) for such atoms if the interatomic 
forces are neglected, as Einstein³ noticed. London⁴ has argued that 
the $\lambda$-transition corresponds to the transition which occurs even in 
the ideal Einstein-Bose gas. The inter-atomic forces alter the temperature 
and, in a way as yet only imperfectly understood,⁵,⁶ the order 
of the transition, but qualitatively the reason for the transition is 
understood. We will concern ourselves here only with the liquid He 
II, below the $\lambda$-point, and shall try to elucidate the qualitative reasons 
for some of its strange behavior. Also we explicitly limit our considerations 
to a liquid made purely of ⁴He atoms so that the wave function 
must be symmetric for interchange of the atoms. We do not mean 
to imply anything about liquid helium ³He, nor about superconductors, 
either by analogy or by contrast. That is, we shall use the fact 
that the wave function is symmetric in many arguments without 
stopping to inquire whether the symmetry is a necessary part of the argument. 

The central feature which dominates the properties of helium II 
is the scarcity of available low energy excited states in the Bose liquid.⁷,⁸ 
There do exist excited states of compression (i.e., phonons) 
but states involving stirring or other internal motions which do not 
change the density cannot be excited without expenditure of an appre- 
ciable excitation energy. This is because, for quantum energies to be 
low, long wave lengths or long distances are necessary. But the wave 
function cannot depend on large scale modifications of the liquid’s 
configuration. For a large scale motion, or stirring, which does not 
alter the density, only moves some atoms away to replace them by 
others *. It is essentially equivalent to a permutation of one atom for 
another, and the wave function must remain unchanged by a permutation 
of atoms, because ⁴He obeys the symmetrical statistics. The only 
wave functions available are those which change when atoms move 
in a way which is not reproducible by permutation, and therefore 
either, (1) movements accompanied by change in density (phonons), 
(2) movements over distances less than an atomic spacing, therefore 
of short wave length and high energy (rotons and more complex states), 
or (3) movements resulting in a change in the position of the con- 
taining walls (flow). We shall discuss these states in detail presently. 

The scarcity of low energy excited states is the seat of many of the 
phenomena in the liquid. This has been known since the work of Lan- 
dau who developed a theory of the liquid on the assumption of such 
scarcity. The specific heat is very low at low temperature and only 
rises rapidly above about 1°K when enough thermal energy is available 
to excite an appreciable number of the higher energy states (rotons). 
There are so few states excited that the excitations may be localized 
in the fluid like wave packets. These move about, collide with each 
other and the walls, and imitate the appearance that in the perfect 
background fluid there is another fluid or gas. This “gas” of excitations 
carries all the entropy of the liquid, may carry waves of number density 


* In the ideal gas the low excitations are those in which one or two atoms 
are excited to low states. These involve density changes and are more analogous 
to phonon states (but are even lower in energy than in the liquid because the 
ideal gas has infinite compressibility, and therefore vanishing sound velocity). 
The interatomic forces in the liquid make it more imperative that if atoms 
are moved away from one point others move in to take their place, if high repulsive 
energies between nearby atoms are to be avoided. 

(second sound, analogous to sound in an ordinary gas), finds it 
difficult to diffuse through long thin channels, tries to even up uneven 
velocity distributions among its roton “molecules” (viscosity), and 
acts in many ways as a normal fluid. Meanwhile the background in 
which the rotons travel, that is, the total body of fluid itself, can flow. 
It flows, at low velocity, without resistance through small cracks. 
The reason is that to have resistance, flow energy and momentum must 
go into heat, that is, internal excitations (e.g., rotons). The energy required 
to form a roton is not available (at the necessary momentum 
change) unless the fluid velocity is very high. 

Actually it appears likely that helium in flow doesn’t form rotons 
directly at all. Resistance sets in at a relatively low velocity (critical 
velocity) because apparently a kind of turbulence begins in the per- 
fect fluid *. This cannot occur at lower velocities because energy is 
needed to create vorticity. And, if we accept Onsager’s suggestion, 
the vorticity is quantized, the line integral of the momentum per 
atom (mass of atom times fluid velocity) around a closed circuit must 
be a multiple of $h$. Below the critical velocity not enough kinetic energy 
is available in the fluid to produce the minimum vortex lines. 

We shall discuss first the way that the scarcity of states accounts 
for many of the properties of the liquid. Here we are summarizing 
work of many others, particularly Landau. It is thought best to reem- 
phasize this viewpoint, since it is the one which is directly supported 
by quantum mechanics. Furthermore, in this way we are starting over 
the more familiar ground. Next we discuss the quantum mechanical 
view of the reason for the scarcity of states. Finally in the second part 
of the paper we discuss the quantized vortex lines proposed by Onsager. 


3. Landau’s Interpretation of the Two Fluid Model 


One of the most fruitful ideas in interpreting the behavior of the 
liquid is the two fluid model. It was developed by Tisza⁹ from analogy 
to the structure of an ideal Bose gas. It is often spoken of as 
a vague association of two penetrating fluids. Landau¹⁰ has interpreted 
it in a definite manner. We review his interpretation here, although 
an excellent review by Dingle¹¹ already exists. He has strongly 
emphasized the fact that one might picture the helium as a background 
fluid in which excitations move. At absolute zero one has a perfect 


* The author now considers his statement (reference 7) that the reason for 
flow resistance ‘cannot very well be a kind of turbulence’, to be in error. 

ideal fluid which may flow frictionlessly with potential flow. If heated, 
the heat energy excites the liquid. This it does by creating here and 
there within it excitations of some sort. These excitations can make 
their way from one place to another, collide with the walls and with 
each other, and give to helium some properties associated with the 
so-called normal fluid component, such as viscosity. Landau as a result 
of his study of quantum hydrodynamics was led to suppose 
the excitations to be of two kinds. Of lowest energy are the phonons, 
or quantized sound waves, whose energy $E$ equals $pc$ where $p$ is the 
momentum and $c$ the speed of sound. Above these separated by an 
energy gap $\Delta$ are those of another kind, called rotons. At first he supposed 
the energy of these to be given by $\Delta+p^2/2\mu$ if they have momentum 
$p$, where $\mu$ is an effective mass. Later he found that this did 
not agree with the experiments of Peshkov on second sound, and he 
proposed instead the formula 


$$E_{\mathrm{rot}} = \Delta + (p - p_0)^2/2\mu \qquad (1)$$


where $p_0$ is some constant. He went further and suggested that all
these excitations really are of the same class and differ only in momentum.
The energy $E(p)$ of those of momentum $p$ depends only on the
magnitude of $p$, rising at first linearly as $pc$, but later falling to a minimum
at $p_0$ and rising again, as in Fig. 1, solid curve. The curve in
the vicinity of the minimum is given by (1). At the low temperatures
encountered in He II only the states near $p = 0$ and those close to
the minimum are excited. Therefore we do not have to know the rest
of the curve accurately. Furthermore, the only important excitations
are of one of the two classes, phonons and rotons.

Fig. 1. The energy of excitations as a function of their momentum. Solid line,
as envisaged by Landau with parameters set to fit specific heat data; dotted line,
an approximate curve derived from quantum mechanics.
Excitations in the linear section at low momentum correspond to phonons. Those
near the minimum of the curve are called rotons.

Supposing the excitations to obey Bose statistics, the number, at
temperature $T$, with momentum in the range $d^3p$ is, according to statistical
mechanics,


$$n_p = (\exp \beta E - 1)^{-1}\, d^3p\,(2\pi\hbar)^{-3} \qquad (2)$$


with $\beta = (kT)^{-1}$ and $E = E(p)$. From this the average energy $E(p)$
and the specific heat can be calculated. In agreement with experiment
the specific heat begins at low temperature as $T^3$, as expected according to Debye,
since only phonons are excited. At higher temperatures the higher 
energy roton excitations become excited, and the specific heat rises 
much more rapidly. The thermodynamic properties are in excellent 
agreement with the theory if 1 


$c = 240$ meters/sec

$\Delta/k = 9.6$ °K

$p_0/\hbar = 2.0$ Å$^{-1}$

$\mu = 0.77\,m$


where $m$ is the atomic mass of helium.
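
The two branches of the spectrum and the Bose factor of equation (2) can be evaluated numerically from the constants just quoted. The following is a minimal sketch, assuming standard SI values for $\hbar$, $k$ and the helium-4 mass (these are not given in the text) and using hypothetical function names; energies are expressed in kelvin and wave numbers $p/\hbar$ in inverse ångströms.

```python
import math

# Constants quoted in the text (Landau's fit to the thermodynamic data).
C_SOUND = 240.0          # speed of sound, m/s
DELTA_K = 9.6            # roton gap Delta/k, kelvin
K0 = 2.0                 # p0/hbar, inverse angstroms
MU_OVER_M = 0.77         # roton effective mass in units of the helium mass

# Standard physical constants (SI); assumed here, not taken from the text.
HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J/K
M_HE4 = 6.6464731e-27    # kg
ANGSTROM = 1.0e-10       # m


def phonon_energy_kelvin(k):
    """E/k_B = hbar*c*k/k_B for wave number k in inverse angstroms."""
    return HBAR * C_SOUND * (k / ANGSTROM) / K_B


def roton_energy_kelvin(k):
    """Equation (1) in kelvin: Delta/k_B + hbar^2 (k - k0)^2 / (2 mu k_B)."""
    dk = (k - K0) / ANGSTROM
    return DELTA_K + HBAR**2 * dk**2 / (2.0 * MU_OVER_M * M_HE4 * K_B)


def bose_occupation(energy_kelvin, temperature):
    """Mean occupation (exp(E/kT) - 1)^-1 of a single state, as in (2)."""
    return 1.0 / math.expm1(energy_kelvin / temperature)


if __name__ == "__main__":
    for label, e in (("phonon, k = 0.1", phonon_energy_kelvin(0.1)),
                     ("roton,  k = 2.0", roton_energy_kelvin(2.0))):
        print(label, "E/k =", round(e, 3), "K; occupation at 1 K:",
              bose_occupation(e, 1.0))
```

At 1 °K the roton occupation is smaller than that of a long-wavelength phonon by several orders of magnitude, which is the sense in which only phonons matter at the lowest temperatures.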

The hydrodynamic equations of the two fluid model arise as follows.
Suppose the fluid at absolute zero has density $\rho_0$ and velocity $\mathbf{v}_s$.
In the first part of this paper we shall take $\mathbf{v}_s$ to be irrotational,
$\nabla \times \mathbf{v}_s = 0$. Later we discuss the problem of local circulation. The
mass current density is $\rho_0 \mathbf{v}_s$ and the kinetic energy is $\tfrac{1}{2}\rho_0 v_s^2$. Suppose
that as a result of a rise in temperature a limited number of excitations
are formed in the fluid. Landau has shown that the energy to form
excitations in a moving fluid is not $E(p)$ but is


$$E = E(p) + \mathbf{p}\cdot\mathbf{v}_s \qquad (3)$$


This results from simple considerations of the relations in moving and
still frames of reference. The mass current density equals the momentum
density of the fluid since all of the atoms have the same mass. It now is

$$\mathbf{j} = \rho_0 \mathbf{v}_s + \langle \mathbf{p} \rangle \qquad (4)$$

where $\langle \mathbf{p} \rangle$ is the mean momentum of the excitations per unit volume.
Now the mean $\langle \mathbf{p} \rangle$ depends on how the excitations drift. If they
are in equilibrium with the fixed walls of the vessel the mean $\mathbf{p}$ is
not zero. The energy is $E(p) + \mathbf{p}\cdot\mathbf{v}_s$. It is lower than $E(p)$ for those
excitations whose momentum is directed oppositely to $\mathbf{v}_s$. Therefore
in equilibrium more excitations align oppositely to $\mathbf{v}_s$ than parallel
to it. For this reason the mean $\mathbf{p}$ is directed oppositely to $\mathbf{v}_s$ and for
small $v_s$ is proportional to it, let us say $\langle \mathbf{p} \rangle = -\rho_n \mathbf{v}_s$. This defines
$\rho_n$. If $\rho_s$ is defined as $\rho_0 - \rho_n$ we have a total current $\rho_s \mathbf{v}_s$ in a situation
in which the excitations are in equilibrium with fixed walls. The equilibrium
is established by collisions of the excitations with the walls
and with each other.

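The number of excitations of momentum $\mathbf{p}$ is again determined by
(2) but now with $E$ given by (3), so that the average $\mathbf{p}$ is


$$\langle \mathbf{p} \rangle = \int \mathbf{p}\,[\exp \beta(E(p) + \mathbf{p}\cdot\mathbf{v}_s) - 1]^{-1}\, d^3p\,(2\pi\hbar)^{-3}$$


or, expanding to first order in $v_s$, we find $\langle \mathbf{p} \rangle = -\rho_n \mathbf{v}_s$ where

$$\rho_n = \frac{\beta}{3} \int p^2\,[\exp(\beta E(p)) - 1]^{-2} \exp(\beta E(p))\, d^3p\,(2\pi\hbar)^{-3} \qquad (5)$$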


The density $\rho_n$ determined from experiments in second sound is in
reasonable agreement with this expression (evaluated with the constants
given above it fits above 1°K, but below 1°K the values
$p_0/\hbar = 2.3$ Å$^{-1}$ and $\mu = 0.40\,m$ fit better, and do not alter the good fit to the
thermodynamic data). This explicitly shows that $\rho_n$ is a derived concept,
and does not represent the density of anything which has microscopic
meaning.

The excitations can drift also. The distribution for equilibrium in 
a drifting gas is, according to statistical mechanics, 


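$$n(E) = (\exp \beta(E - \mathbf{p}\cdot\mathbf{u}) - 1)^{-1} \qquad (6)$$

where $\mathbf{u}$ is a parameter. In this case the mean momentum is


$$\langle \mathbf{p} \rangle = -\rho_n(\mathbf{v}_s - \mathbf{u}).$$

If we write $\mathbf{u} = \mathbf{v}_n$ we have for the current

$$\mathbf{j} = \rho_0\mathbf{v}_s - \rho_n(\mathbf{v}_s - \mathbf{v}_n) = \rho_s\mathbf{v}_s + \rho_n\mathbf{v}_n \qquad (7)$$


This can be interpreted macroscopically as saying that the current is
like that in a mixture of two fluids, one of density $\rho_s$ moving at velocity
$\mathbf{v}_s$, the other of density $\rho_n$ and velocity $\mathbf{v}_n$.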
Actually (6) is not an equilibrium distribution unless the walls move
at velocity $\mathbf{u}$, and furthermore $\mathbf{u}$ is constant throughout the liquid.
It is generally taken as a good approximation in the case that $\mathbf{u}$, that
is, $\mathbf{v}_n$, is not constant. The lack of equilibrium in this case produces
irreversible effects, such as viscosity, which can be associated with the
“normal fluid component”. The distribution is in equilibrium even if
$\mathbf{v}_s$ is not constant.

The entropy of the system is that of the excitations. It is easily 
verified that the mean group velocity of the excitations (the mean 
of $\partial(E(p) + \mathbf{p}\cdot\mathbf{v}_s)/\partial\mathbf{p}$) is just $\mathbf{v}_n$. The entropy can therefore be considered
as flowing with the “normal fluid”.

It is also possible to work out the expected value of the energy of
the system. If one calculates the internal energy and subtracts the
internal energy the system would have at the same entropy but with
$\mathbf{v}_s = \mathbf{v}_n = 0$, the excess expanded to the second order in the velocities
can be written $\tfrac{1}{2}\rho_s v_s^2 + \tfrac{1}{2}\rho_n v_n^2$. This is just what the two fluid
model would expect. 

Therefore Landau shows that a liquid system with excitations as 
described will behave in many ways like a mixture of two fluids. 

Furthermore, considerable progress has been made by Landau and 
Khalatnikov 3 in the interpretation of many irreversible phenomena, 
such as viscosity, attenuation of second sound, etc. from the kinetic 
theory implied by such excitations. It is not possible as yet to find the 
cross section for collision, say between two rotons, from first principles.
But if a few such quantities are considered as unknown parameters,
a great deal can be said. The number of rotons varies very rapidly with
temperature, in the manner given by (2). For this reason the mean
free path for collision and the viscosity resulting from roton-roton
collisions have a known temperature dependence. In a similar
way the contribution of collisions between rotons and phonons or
between phonons can be worked out. There are also collisions in which
the number of excitations changes. The results are often in excellent
agreement with experiment. 

There is, therefore, little doubt that in liquid helium there are such
excitations, with the energy spectrum that Landau suggests, and that
this picture supplies the complete interpretation of the two fluid model
for helium II.


4. The Reason for the Scarcity of Low Energy States


The next question that concerns us is to try to see from first prin- 
ciples why the excitations of the helium fluid have these characteristics. 

Landau has, in fact, tried to obtain some justification for the spec- 
trum from a study of quantum hydrodynamics. This is not a complete- 
ly detailed atomic approach. One attempts to describe the liquid by 
a few quantities such as density and current, or velocity. Then one 
makes these quantities operators with reasonable commutation re- 
lations, and tries to find the excitation energies of the fluid. The pro- 
blem has not been analyzed in sufficient detail to establish the energy 
spectrum (1). Such an approach cannot give us an ultimate detailed 
understanding for two reasons. First, the numerical values of $\Delta$, $p_0$, $\mu$
show these quantities to be characteristic of the atomic structure of
the liquid. A theory which describes the fluid simply by average variables
and which therefore cannot represent the fact that the liquid
does in fact have atomic structure cannot lead to definite values for
excitation energy. A more serious problem is this. It is necessary to
show not only that the excitations $E(p)$ exist, but that there are not
a host of other possible excitations lying lower. If we describe the
liquid with average variables we have no assurance that there are no
excitations at a level below the coarseness of our averages. Possibly
excitations exist which represent no gross density variation and no
mean current. If many other lower excitations exist they dominate
the specific heat curve and the properties of the fluid. (Perhaps in
³He we have an example of a system capable of excitations at an atomic
level which are not describable by the variables used in quantum
hydrodynamics.)

However it is possible from first principles to see why there are 
no other excitations but those supposed by Landau and why the 
energy spectrum of these excitations has, qualitatively, the form 
which he supposed.⁷,⁸

In order to do so, we should, rigorously, have to solve the Schrödinger
equation for the system,

$$H\psi = -\frac{\hbar^2}{2m} \sum_i \nabla_i^2 \psi + \sum_{i<j} V(R_{ij})\,\psi = E\psi$$

where $m$ is the atomic mass, $V(R_{ij})$ is the mutual potential of two
atoms separated by the distance $\mathbf{R}_{ij} = \mathbf{R}_i - \mathbf{R}_j$, and $\nabla_i^2$ represents
the Laplacian with respect to the coordinates $\mathbf{R}_i$ of the $i$th atom.
The sums must be taken over all of the $N$ atoms in the liquid. We
cannot solve this equation directly but we can make surprising headway
in guessing the characteristics of the wave functions $\psi$ which
satisfy it.

We shall have to picture the wave function $\psi$. It differs from one
state to another. But we will consider its value for only one state at
a time. Then it is a definite but complicated function $\psi(\mathbf{R}_1, \mathbf{R}_2, \ldots \mathbf{R}_N)$
of the $3N$ variables $\mathbf{R}_i$. To picture it we must have a scheme by which
we clearly represent it in our minds. Now such a function is a number
associated with every set $R^N$ of values $\mathbf{R}_i$, or, as we shall say, with
every configuration of the atoms. We can represent a configuration
$R^N$ by imagining each of the $N$ atoms in the vessel containing the
liquid to be located with its center at one of the $\mathbf{R}_i$. That is, each configuration
is represented, as classically, as a particular definite location
for each of the atoms. Then $\psi(R^N)$ is a number associated with each
such arrangement of the atoms. We can call it the amplitude of the
configuration. For a given state, this amplitude for some atomic arrangements
is large (these arrangements then have large probability),
for others small and the configuration is unlikely. When we wish to
speak of how the amplitude changes as the values of $\mathbf{R}_i$ change, we
shall use the more vivid language of asking how the amplitude changes
as the atoms are “moved” about. Such motions are not directly related
to any real classical motions, of course. In fact we cannot describe
classical motions directly. All such classical ideas must be interpreted
in terms of the mathematical behavior of $\psi$, if we are to be consistent
with quantum mechanical principles. Most of our task, therefore, is
trying to describe the $\psi$ functions which correspond to the various
kinds of states of energy, or motion, of which the liquid is capable.

Start by considering the ground state wave function which we shall
call $\Phi$. We use the intuition which we have acquired from knowing
the solutions of the Schrödinger equations for simpler systems. For
stationary states, $\psi$ can be taken to be a real number. The lowest state
always has no nodes (except for the exclusion principle, which does
not operate here). Therefore $\Phi$ is everywhere positive. It is symmetrical,
that is, $\Phi$ depends only on where atoms are, not on which is
which. The energy $V(R)$ of interaction of two helium atoms, as worked
out by Slater and Kirkwood, for example, consists of a very weak
attraction at large distances, but a powerful repulsion inside of 2.7 Å
(see Fig. 2). The atoms in liquid helium at the normal density have
a volume of 45 cubic Angstroms each so they are not tightly squeezed
together. If one wishes a rough approximation, consider the atoms
as impenetrable spheres of 2.7 Å diameter, and forget the attraction,
whose effect is, after all, mainly just to hold the liquid together at
the normal density even if the external pressure falls to zero. Then
configurations of atoms in which some overlap each other, that is,
are closer than about 2.7 Å, are of very small amplitude. In the most
likely configurations the atoms are well spaced. As for a particle in
a box whose wave function bows highest in the center and falls gradually
to zero at the walls, we may imagine the amplitude highest
for good separation and falling toward zero if a pair of atoms approach
too closely. Our structure is a liquid, as a consequence of the zero-point
energy, so that no particular lattice arrangement is strongly
preferred. All configurations for which the spacing is ample have high
probability. We can get from one arrangement to another without
ever crossing a forbidden configuration of overlapping atoms because
of the large spacing (cube root of atomic volume is 3.6 Å). Although
the liquid is not crystalline, there is a little local order induced by the tendency of
atoms to stay apart, so that X-ray or neutron scattering experiments
show a structure very similar to that of other simple liquids like liquid
argon.

Fig. 2. The potential of interaction of two helium atoms as function of their
separation as worked out from quantum mechanics by Slater and Kirkwood.
(Vertical axis: potential in °K.)

For the configurations of high amplitude the density is fairly uniform,
at least until we look over such small volumes that we can see
the fine grain atomic structure. If the density in a region is raised the
atoms come closer together so that the “bow” on the wave function
which occurs as one atom is moved from contact with a neighbour on
one side to one of the other side, is confined to a smaller space. The 
increased curvature represents increased kinetic energy and it is not 
as likely to find a configuration in which such an energy barrier is 
penetrated. As a matter of fact, this feature is easily analyzed quanti- 
tatively. Long range density fluctuations are sound waves. The rise 
of energy on compression is described by the compressibility coeffi- 
cient, or equivalently by the speed of sound. Classically, standing 
density waves oscillating as a normal mode behave as an harmonic 
oscillator. Likewise, in quantum mechanics these are quantum os- 
cillators and have zero point motions, although the most likely con- 
figuration is that of uniform constant density. The wave function for 
the zero-point motion of an oscillator is a gaussian so that the amplitude
$\Phi$ for a given kind of density fluctuation falls off exponentially
with the square of the fluctuation. To summarize, the ground state
function is large for any configuration in which the atoms are well
spaced from one another at nearly constant average density. It falls
off if these conditions are violated.

Next we turn to the excited states. Right away one obvious excitation
is that of the standing sound wave. If the classical frequency
is $\omega$ the quantum excitation energy of such a mode is $\hbar\omega$. Usually
one prefers by linear combinations to make states of running waves,
or phonons. If the wave number is $k$, the energy is $\hbar kc$ if $c$ is the
sound velocity.

We may readily obtain the wave function for such a phonon excitation.
If the density is $\rho(\mathbf{R})$ the classical normal coordinate going
with such a mode is


$$q_k = \int \rho(\mathbf{R}) \exp(i\mathbf{k}\cdot\mathbf{R})\, d^3R \qquad (8)$$


Quantum mechanically for an oscillator the wave function for the
ground state is a gaussian, and the first excited state is just the coordinate
times this gaussian (the first Hermite polynomial $H_1(x)$ is just
$x$). Hence the wave function is


$$\psi_{\mathrm{phonon}} = q_k \Phi \qquad (9)$$


if $\Phi$ is the ground state wave function of the system, which we have
described in the preceding paragraphs. We have not bothered to normalize
our function. The liquid consists of many atoms so if $\mathbf{R}_i$ is
the position of the $i$th, the density in any configuration is

$$\rho(\mathbf{R}) = \sum_i \delta(\mathbf{R} - \mathbf{R}_i) \qquad (10)$$


the sum extending over all the atoms. Putting this in (8) and then (9)
we find


$$\psi_{\mathrm{phonon}} = \Big(\sum_i \exp i\mathbf{k}\cdot\mathbf{R}_i\Big) \Phi \qquad (11)$$


This is valid if the wave length $(2\pi/k)$ is much larger than the atomic
spacing, for then our description by compressional waves is adequate.
The state energy is $\hbar kc$. Since $\hbar k$ is the momentum $p$ of the state, this
means $E = pc$. Since the wave length can be very long this energy
can be exceedingly low.
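
The factor multiplying $\Phi$ in (11) depends only on where the atoms are, not on which atom is which. The sketch below illustrates this for a made-up configuration: the positions, the wave vector and the function name `density_mode` are all hypothetical choices for illustration, not anything specified in the text.

```python
import cmath
import random


def density_mode(positions, k):
    """Sum over atoms of exp(i k.R_i), the factor multiplying Phi in (11)."""
    total = 0j
    for r in positions:
        phase = sum(kc * rc for kc, rc in zip(k, r))
        total += cmath.exp(1j * phase)
    return total


# Hypothetical configuration: 50 atoms placed at random in a cube of side 10.
rng = random.Random(0)
atoms = [tuple(rng.uniform(0.0, 10.0) for _ in range(3)) for _ in range(50)]
k_vec = (0.3, 0.0, 0.0)

# Relabel the atoms without moving any of them.
shuffled = atoms[:]
rng.shuffle(shuffled)

q_original = density_mode(atoms, k_vec)
q_relabelled = density_mode(shuffled, k_vec)

if __name__ == "__main__":
    # The two values agree up to floating-point rounding.
    print(abs(q_original - q_relabelled))
```

Because the sum is symmetric in the atoms, the phonon amplitude is automatically compatible with Bose statistics; it changes only when the density distribution itself changes.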

The central problem is to see why no states other than these phonons
can have such low energies. We try to construct the wave function
$\psi$ of an excitation which should be as low in energy as possible
and yet not represent a phonon. We must associate a number which
may now be positive or negative with each configuration. In fact,
since $\psi$ must be orthogonal to the ground state $\Phi$ which is everywhere
plus, $\psi$ must be plus for half the configurations and minus for the
other half. Furthermore, $\psi$ must be orthogonal to all the phonon states.
This simply means that $\psi$ must vary from plus to minus for changes
in the configurations which do not appreciably alter the large scale
density. Configurations can alter without variation of mean density
by simply stirring the atoms about. Of course, since $\psi$ must represent
as low an energy as possible we must give low amplitude to configurations
in which atoms seriously overlap, just as in the ground state $\Phi$.

The function $\psi$ takes on its maximum positive value for some configuration
of the atoms. Let us call this configuration A, and the particular
locations of the atoms α-positions. We said that the α-positions
must be well spaced so that the atoms do not overlap, and further
that they are, on a large scale, at roughly uniform density. Equally,
call configuration B, with atomic positions β, that for which $\psi$ has
its largest negative value. Now we want B to be as different as possible
from A. We want it to require as much readjustment over as
long distances as possible to change A to B. Otherwise $\psi$ changes
too rapidly and easily from plus to minus, our wave function has a
high gradient, and the energy of the state is not as low as possible.

Try to arrange things so that A requires a large displacement to
be turned into B. At first you might suppose it is easy. For example
(see Fig. 3) in A take some atom in the left side of the box containing
the liquid and move it way over to the other side of the vessel, and
call the resulting configuration B. One objection to this is that an atom
is moved from one side to another, so a hole remains at the left and an
extra atom is at the right. This represents a density variation. To avoid
this we may imagine that another atom has been moved at the same
time from right to left, and the various holes and tight squeezes have
been ironed out by some minor adjustments of several of the neighbouring
atoms. This movement of two atoms each a distance of the
size of the vessel, one from left to right and the other from right to
left, is certainly a long displacement, so B and A are very different. But
they are not.

Fig. 3. Two configurations (solid [...]) can actually be accomplished
by much smaller adjustments (short arrows) because of
the identity of the atoms.

The atoms must be considered as identical; the amplitude must not
depend on which atom is which. One cannot allow $\psi$ to change if one
simply permutes atoms. The long displacements can be accomplished
in two steps. In the first step permute the atoms you wish to move
to those α-positions closest to
the ultimate position they are to occupy in the final configuration
B. This step does not change $\psi$ because all the atoms are still
in the same configuration of α-positions. Then the change to
the B configuration is made by small readjustments, no atom moving
more than half the atomic separation. In this minor motion $\psi$
must change quickly from plus to minus and the energy cannot be
low. For the reason that the wave function is unchanged by permutation
of the atoms it is impossible to get a B configuration very far
from the A configuration. No very low energy excitations can appear
(other than phonons) at all.
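
The two-step argument can be made concrete with a toy model. The sketch below is only an illustration under simplifying assumptions: the atoms sit on a line rather than in three dimensions, the spacing is taken as the unit of length, and the configurations and function names are hypothetical. It compares the largest displacement needed when each atom keeps its label with the largest displacement needed when the atoms may first be permuted.

```python
def labelled_displacement(config_a, config_b):
    """Largest move when atom i of A must become atom i of B."""
    return max(abs(b - a) for a, b in zip(config_a, config_b))


def relabelled_displacement(config_a, config_b):
    """Largest move when the atoms may first be permuted.

    On a line, pairing the sorted positions of A with the sorted positions
    of B is one valid relabelling, and it keeps every move short here.
    """
    return max(abs(b - a) for a, b in zip(sorted(config_a), sorted(config_b)))


# Hypothetical configuration A: ten atoms on a line with unit spacing.
config_a = [float(i) for i in range(10)]

# Configuration B: atom 0 is carried to the far right and atom 9 to the far
# left, each ending 0.4 of a spacing away from an alpha-position.
config_b = config_a[:]
config_b[0] = 8.6
config_b[9] = 0.4

if __name__ == "__main__":
    print("with labels kept:  ", labelled_displacement(config_a, config_b))
    print("after permutation: ", relabelled_displacement(config_a, config_b))
```

With labels kept, the two atoms travel nearly the length of the vessel; once identical atoms are allowed to exchange roles, no atom moves more than a fraction of the spacing, which is the point of the argument above.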

In the phonon case we consider configurations in which, as $\psi$ changes
sign, the density distribution changes. A change in density cannot be
accomplished by permuting atoms. That is why the Bose statistics
does not affect phonon states. But it leaves them isolated as the lowest
states of the system, so the specific heat approaches zero as $T$ approaches
zero according to Debye's $T^3$ law. This is the key argument for
the understanding of the properties of liquid helium. It is given in
somewhat more detail in reference 7. Since it is a negative argument,
attempting to prove that a low energy state does not exist, it is difficult
to convey conviction in a few words. The reader should try to invent
wave functions of low energy for himself. After a few attempts he will
see much more clearly what we have tried to explain here.


5. Rotons 


The qualitative argument is complete in itself. Nevertheless it is 
gratifying that it may be pushed even further to produce a quantita- 
tive estimate of the energy of these other excited states. We give only 
a summary of the considerations here (see reference 8 for details). 
We try to clarify our picture of the wave function $\psi$, until we can
write a mathematical expression for it. This expression put into the
energy integral $\int \psi^* H \psi\, d^N V / \int \psi^* \psi\, d^N V$ will give us an estimate for the
energy.

As we said, in order to get the energy as low as possible we wish
the gradients of $\psi$ to be small. Therefore the configuration B (where
$\psi$ is maximum negative) must be as far as possible from configuration A.
Yet we noted that no β-site is more than half the atomic spacing from
an α-site. The two configurations are generally nearly the same. They are
furthest from each other if as many atoms as possible must be moved.
That is accomplished when, as illustrated in Fig. 4, all the β-sites are
between α-sites, so every atom must move. To complete [the description of $\psi$], of
course, we must give its value for all configurations, not only for A and B
when all atoms are on α-sites, or all on β-sites.

Fig. 4. The excited state wave function must be positive for one
configuration, solid circles (α-positions) [...]. They are separated as [...] as possible
if the negative configuration leaves no atom unmoved, dotted
circles (β-positions).

The lowest energy results if the transition from plus to
minus (hence A to B) is as gradual as possible. First for configurations
in which each of the atoms is either on an α or on a β-position, this
is most naturally accomplished if $\psi$ is proportional to the number
on α-sites minus the number on β-sites. This difference passes smoothly
from plus to minus. It can be expressed mathematically this way:
Consider a function, $f(\mathbf{R})$, of position, which is $+1$ if $\mathbf{R}$ is at an α-site
and $-1$ if $\mathbf{R}$ is at a β-site. Then $\sum_i f(\mathbf{R}_i)$ summed on all the atoms
is just the desired number on α-sites minus number on β-sites. For
intermediate positions $\psi$ will vary as smoothly as possible if $f(\mathbf{R})$
is taken to vary in some smooth way between its extreme values of
$+1$ and $-1$, which it takes on at α and β-sites. This suggests that
we take $\psi$ to be of the form


$$\psi = \sum_i f(\mathbf{R}_i).$$


But this is incomplete for we tacitly assumed that in all the configurations
the atoms did not overlap, the mean density did not vary
very much and so on, just as in the ground state. This feature can be
taken into account if we take instead


$$\psi_{\mathrm{roton}} = \sum_i f(\mathbf{R}_i)\, \Phi \qquad (12)$$

where $\Phi$ is the ground state function. Then $\psi$ will fall rapidly if the
atoms overlap, etc. We actually do not know what the function $f(\mathbf{R})$
is but we expect it to vary rapidly, so that if expanded in a Fourier
integral the dominant wave lengths would be the atomic spacing.
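
A one-dimensional toy version shows how a smooth $f$ with wavelength equal to the atomic spacing counts α-sites minus β-sites. This is a minimal sketch under assumptions not made in the text: the α-sites are taken to be the integers on a line, the β-sites the half-integers, and a cosine is chosen as the smooth interpolant. All names are hypothetical.

```python
import math

SPACING = 1.0  # hypothetical distance between neighbouring alpha-sites


def f(x):
    """Smooth interpolant: +1 on alpha-sites (integers), -1 on beta-sites."""
    return math.cos(2.0 * math.pi * x / SPACING)


def psi_factor(positions):
    """Sum of f over the atoms, the factor multiplying Phi in (12)."""
    return sum(f(x) for x in positions)


n_atoms = 8
all_alpha = [i * SPACING for i in range(n_atoms)]
all_beta = [(i + 0.5) * SPACING for i in range(n_atoms)]
half_and_half = all_alpha[:4] + all_beta[4:]

if __name__ == "__main__":
    for name, cfg in (("all alpha", all_alpha),
                      ("half and half", half_and_half),
                      ("all beta", all_beta)):
        print(name, round(psi_factor(cfg), 6))
```

The factor runs from $+N$ through zero to $-N$ as atoms are shifted from α-sites to β-sites, and the dominant Fourier component of this $f$ has a wavelength equal to the spacing, as the paragraph above anticipates.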

According to the variational principle the best wave function is 
that which minimizes the energy integral. In this way, by variation 
of $f(\mathbf{R})$ it is readily found (see reference 8) that the minimum results
if the function is


$$f(\mathbf{R}) = \exp(i\mathbf{k}\cdot\mathbf{R}) \qquad (13)$$

and that the corresponding energy is

$$E(k) = \hbar^2 k^2 / 2mS(k) \qquad (14)$$


where $m$ is the atomic mass. The function $S(k)$ is the form factor for
the scattering of neutrons from the liquid. That is, it is the Fourier
transform of the function $p(R)$ which gives the probability per unit
volume of finding an atom at a distance $R$ from a given atom in the
liquid in the ground state.

The local partial order of the liquid in the ground state shows up 
as in other liquids as a ring in the diffraction pattern (of neutrons, 
or X-rays). That is to say, there is a maximum in the function $S(k)$,
which occurs when $k$ represents a wavelength near the nearest neighbour
spacing. The maximum in $S(k)$ represents a minimum in $E(k)$
here. This confirms the expectation that the low excitation would 
have wave numbers in this vicinity. 

The state (12) and (13) has the momentum $\mathbf{p} = \hbar\mathbf{k}$. Ordinarily not
every value of a parameter in a wave function has significance in the
variational method. But states of different momenta are orthogonal,
and the energies (14) are significant not only for $k$ near the minimum,
but also in the neighbourhood of this value. The range of values for
which (14) is useful is limited only by the range for which (12) can be
expected to be a good wave function. For small $k$, (12) is identical
to the wave function (11) representing phonon excitation, and (14)
can be shown to give $\hbar kc$ in that region. Therefore the expression
should be reasonable not only for $k$ near the reciprocal atomic spacing,
but for low $k$ as well. It predicts a spectrum at first linear in $p$ ($= \hbar k$)
then falling to a minimum, just as anticipated by Landau, and in
agreement with experiment.

The curve S(k) taken from neutron data of Henshaw and Hurst 3, 
or from the X-ray scattering data of Reekie 1” agree. The E(k) which 
results is shown in Fig. 1 by the dashed line. The general behavior
and minimum are clearly shown.

The actual value of the energy at the minimum is twice too high 
to agree with the experimental value (solid line) for $\Delta$. The theoretical
value lies above the true value, as it should according to the variation- 
al principle. 
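
The size of the discrepancy can be checked from equation (14) with round numbers. The sketch below is an estimate under stated assumptions: it takes the peak value $S = 1.3$ mentioned later in the text and places it at $k = 2.0$ Å$^{-1}$, the value of $p_0/\hbar$ fitted for the roton minimum; the position of the peak is an assumption, and the SI constants are standard values not given in the text.

```python
HBAR = 1.054571817e-34   # J s (standard value, assumed)
K_B = 1.380649e-23       # J/K (standard value, assumed)
M_HE4 = 6.6464731e-27    # kg (standard value, assumed)
ANGSTROM = 1.0e-10       # m


def feynman_energy_kelvin(k, s_of_k):
    """Equation (14) divided by k_B, for k in inverse angstroms."""
    k_si = k / ANGSTROM
    return HBAR**2 * k_si**2 / (2.0 * M_HE4 * s_of_k * K_B)


# Assumed inputs: S = 1.3 at its maximum, located at k = 2.0 inverse angstroms.
estimate = feynman_energy_kelvin(2.0, 1.3)
landau_gap = 9.6  # Delta/k in kelvin, from the fitted constants
ratio = estimate / landau_gap

if __name__ == "__main__":
    print("variational estimate:", round(estimate, 2), "K")
    print("ratio to fitted gap: ", round(ratio, 2))
```

With these inputs the estimate comes out between 18 and 19 °K, roughly twice the fitted gap of 9.6 °K, in line with the statement above.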

The inaccuracy of the wave function (12) prevents us from giving 
a complete description of what the roton wave function must look 
like. The function (12) does not satisfy the conservation of current. 
It appears as though a more accurate function would represent a 
current distribution large and unidirectional in one region, with a 
field of return currents surrounding it, somewhat in the nature of
a smoke ring. These and other arguments suggest a trial function of
the type


$$\psi = \sum_i \exp(i\mathbf{k}\cdot\mathbf{R}_i)\, \exp\Big(i \sum_j g(\mathbf{R}_j - \mathbf{R}_i)\Big)\, \Phi \qquad (15)$$


with $g$, representing the back flow, to be determined. It is very hard
to perform the integrals required in the variation problem with (15),
so it has not been verified whether (15) represents a substantial improvement.

One way to understand the low energy for k near the reciprocal 
atomic spacing is this. One might consider these as sound waves of 
very short wave length. To obtain a density variation of long wave 
length is hard. To make the compression work must be done against
opposing forces. For wave lengths closer to atomic spacing, however,
such density variations are easier to arrange. In fact, one can create
variations of wave length equal to the atomic spacing simply by arranging
the atoms, doing no appreciable work against repulsions, the
energy being purely kinetic, $\hbar^2k^2/2m$. Actually the energy is even lower
($S(k)$ at maximum is 1.3) for there is a positive tendency in the liquid
to have such variations; if some atoms are correctly arranged the
others are more likely to be also satisfactory because of the local
order. Therefore the energy does not continue to rise as $\hbar kc$ but falls
lower for wave lengths near the atomic separation.

It is easy to verify that these excitations behave in just the way 
that has been assumed in developing the statistical mechanics and the 
two fluid model. To represent a state with two excitations, say with 
momenta $\mathbf{k}_1$ and $\mathbf{k}_2$ one has the approximate wave function


$$\psi = \Big(\sum_i \exp i\mathbf{k}_1\cdot\mathbf{R}_i\Big)\Big(\sum_j \exp i\mathbf{k}_2\cdot\mathbf{R}_j\Big)\, \Phi$$


and so on. Since the order of the factors is irrelevant this is the same
state if $\mathbf{k}_1$ and $\mathbf{k}_2$ are reversed. The excitations obey the Bose statistics.
In moving fluid the energy of the excitations can be shown to be (3).


6. Irrotational Superfluid Flow 


So far we have only described the wave function for states repre- 
senting internal excitation. We turn next to a description of the wave 
function which represents the state of the fluid when macroscopically 
we say it is flowing. We will assume that the flow velocity does not 
vary appreciably over distances of the order of an atomic spacing. 

It is not difficult to represent by wave functions states which represent
the motion of the superfluid. Suppose the system is at absolute
zero so there are no excitations. If the entire system moves forward
as a body, since the center of gravity coordinate can be separated
out from Schrödinger’s equation, the wave function is


$$\psi = \exp\Big(i\mathbf{k}\cdot\sum_i \mathbf{R}_i\Big)\, \Phi$$


where $N\hbar\mathbf{k}$ is the momentum of the system, if there are $N$ atoms. In
case the velocity is not uniform we can construct a wave function 
somewhat as follows: If the velocity varies only slowly from place 
to place, those atoms temporarily in a macroscopic region where the 
velocity is, say, $\mathbf{v}$ must surely have a wave function very much
the same as though the liquid in the region were isolated and moving
at a uniform velocity. This suggests that the phase contains a term
$\hbar^{-1} m\mathbf{v}\cdot\sum_i \mathbf{R}_i$, the sum being taken only over those atoms in the region.
Other regions where $\mathbf{v}$ differs make similar contributions to the phase
so the total phase is $\hbar^{-1} m\sum_i \mathbf{v}(\mathbf{R}_i)\cdot\mathbf{R}_i$ where $\mathbf{v}(\mathbf{R})$ is the velocity at $\mathbf{R}$.
This suggests a wave function of the form


$$\psi_{\mathrm{flow}} = \exp\Big[i\sum_i s(\mathbf{R}_i)\Big]\, \Phi \qquad (16)$$


where $s(\mathbf{R})$ is a function which varies only very little over distances as
small as the atomic spacing. We have suggested that it is $\hbar^{-1} m\mathbf{v}(\mathbf{R})\cdot\mathbf{R}$,
but as is usual for waves whose wave length varies with position,
the momentum is the gradient of the phase, not the coefficient of $\mathbf{R}$.
Thus (16) does represent the helium flowing, but the velocity is given by


$$\mathbf{v} = \hbar m^{-1} \nabla s. \qquad (17)$$


It is readily verified that the current density is $\rho_0\mathbf{v}$, and the energy
(from the variational integral) is $\tfrac{1}{2}\rho_0 v^2$, as expected classically. There
is no change in density, as in (16) we have not allowed these small
effects to be represented.

If excitations exist in the moving fluid the wave function is (16) 
multiplied by the factor »,f(R,) in (12). The excitation enetgy turns 
out to be (3) as expected, interpreting v as the superfluid velocity 
Vy. 

Equation (17) implies that the motion is irrotational, that is,
$\nabla\times\mathbf{v}_s = 0$. In a simply connected region this has only one solution
for given motion of the boundaries. For fixed boundaries it is $\mathbf{v}_s = 0$.
In a multiply connected region the situation is different. Since
$\nabla\times\mathbf{v}_s = 0$, the circulation about any closed curve which can be shrunk
to a point is zero. On the other hand, in the case of a toroidal region,
if the curve encloses the hole the circulation need not vanish. Although
the wave function must be single valued, $s$ may be of the nature of
the azimuthal angle, increasing by $2\pi$, or a multiple thereof, if one goes
around the hole. That is, for a circuit enclosing a hole (into which
liquid may not freely flow) the circulation must be an integral multiple
$n$ of a quantized unit $2\pi\hbar/m$,


$\oint\mathbf{v}_s\cdot d\mathbf{s} = 2\pi\hbar m^{-1}\cdot n = 2\pi n\cdot 1.5\times10^{-4}$ cm$^2$/sec (18)


These states do not influence the previous statistical mechanical
argument. There are too few of them. The velocity may be considered
as a macroscopic variable, such as density. For a macroscopic torus
even the lowest of the states given by (18) is very much higher than
a roton energy $\Delta$. Thus if the torus cross-sectional area is $A$, radius $R$, the mass
moving is $mA\cdot(2\pi R)/d^3$ where $d^3$ is the atomic volume. It moves
at velocity given by $v_s\cdot 2\pi R = 2\pi\hbar m^{-1}$, from (18), so the kinetic
energy is $(\hbar^2/2md^2)\cdot(2\pi A/Rd)$. The factor $\hbar^2/2md^2$ is an energy of the
order of a roton, but the second factor is very large, being the torus
dimension over the atomic spacing. Incidentally the total angular
momentum is $\hbar$ per atom.
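
The torus estimate above can be checked numerically. The following is a minimal sketch, assuming CGS values for $\hbar$ and the helium-4 atomic mass and an illustrative torus ($A$ = 0.1 cm$^2$, $R$ = 1 cm, $d$ = 3.6 Å); none of these numbers appear in the text and the function names are hypothetical.

```python
import math

HBAR = 1.0546e-27   # erg sec (assumed CGS value)
M_HE4 = 6.646e-24   # g, mass of one helium-4 atom (assumed value)


def circulation_quantum():
    """Unit of circulation 2*pi*hbar/m from (18), in cm^2/sec."""
    return 2.0 * math.pi * HBAR / M_HE4


def torus_kinetic_energy(area, radius, d):
    """Kinetic energy (ergs) of the n = 1 circulating state in a torus.

    area   : cross-sectional area A of the torus, cm^2
    radius : torus radius R, cm
    d      : atomic spacing, cm (d**3 is the atomic volume)
    """
    mass = M_HE4 * area * (2.0 * math.pi * radius) / d**3
    speed = HBAR / (M_HE4 * radius)   # from v_s * 2*pi*R = 2*pi*hbar/m
    return 0.5 * mass * speed**2


def roton_scale_energy(d):
    """The factor hbar^2 / (2 m d^2), an energy of the order of a roton."""
    return HBAR**2 / (2.0 * M_HE4 * d**2)


if __name__ == "__main__":
    A, R, d = 0.1, 1.0, 3.6e-8        # illustrative values only
    ratio = torus_kinetic_energy(A, R, d) / roton_scale_energy(d)
    print(f"hbar/m = {HBAR / M_HE4:.3e} cm^2/sec")
    print(f"kinetic energy / (hbar^2/2md^2) = {ratio:.3e}")
```

The ratio printed is the "second factor" $2\pi A/Rd$, which for laboratory dimensions is of order $10^7$ or more.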

If the fluid must flow irrotationally, at first sight, it cannot lose
energy, unless it is moving very rapidly. This has been pointed out
by Landau. If a body of fluid is moving at velocity $v$, and loses a small
energy $\delta E$, it must do so (to keep the flow irrotational) by the entire
fluid changing its velocity. Let the change in $v$ be $\delta v$. If $M$ is the effective
fluid mass the momentum change $\delta p$ is $M\delta v$ and $\delta E = Mv\,\delta v = v\,\delta p$.
Now this energy loss must go into heat; that is, into internal
excitations of rotons. But if the momentum transferred to excitations is
$\delta p$ the energy cannot be small. It must be at least about $(\delta p/p_0)\Delta$
where $\Delta$ and $p_0$ are the energy and momentum of an individual roton.
That is, $\delta E$ must be at least $(\Delta/p_0)\delta p$ and energy cannot be lost unless
$v$ exceeds $\Delta/p_0$, about 70 meters per second. (More accurately $v$ must
be high enough that a line drawn from the origin at slope $v$ can cut
the $E(p)$ vs $p$ curve.) This suggests the reason for the frictionless flow
of superfluid. But we have proved too much, for in actuality the
resistance sets in at velocities a few hundred times smaller.
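
The bound $v > \Delta/p_0$ can be evaluated as a rough check. This is a minimal sketch, assuming representative roton parameters ($\Delta/k \approx 8.65$ K, $p_0/\hbar \approx 1.92$ Å$^{-1}$) that are not given in the text; with these assumed inputs the result comes out near 60 meters per second, the same order as the figure quoted above.

```python
K_B = 1.380649e-16   # erg/K
HBAR = 1.0546e-27    # erg sec (assumed CGS value)

# Assumed roton parameters; they are NOT taken from the text.
ROTON_GAP_K = 8.65          # Delta / k_B, in kelvin
ROTON_K0_PER_CM = 1.92e8    # p_0 / hbar, in cm^-1 (1.92 per angstrom)


def landau_velocity(gap_kelvin=ROTON_GAP_K, k0_per_cm=ROTON_K0_PER_CM):
    """Return Delta / p_0 in cm/sec."""
    delta = gap_kelvin * K_B      # roton energy, ergs
    p0 = k0_per_cm * HBAR         # roton momentum, g cm/sec
    return delta / p0


def can_create_roton(v_cm_per_sec):
    """True if v * dp can pay for a roton, i.e. v >= Delta / p_0."""
    return v_cm_per_sec >= landau_velocity()


if __name__ == "__main__":
    print(f"Delta/p_0 = {landau_velocity() / 100.0:.0f} m/sec")
```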

The only way that gross slowing down can occur for lower velocities 
is for small parts of the fluid to stop or slow down without the entire 
fluid having to slow down at once. That is, energy loss must be accom- 
panied by flow which is not irrotational; that is, flow which involves 
local circulation. To understand such effects we must add a new element 
to our picture of phonons, rotons and potential flow. These are the 
quantized vortex lines suggested by Onsager.' We proceed to des- 
cribe them. 


7. Rotation of the Superfluid 


The problem which now faces us is to extend (16) so that we can
also represent states for which $\nabla\times\mathbf{v}$ does not vanish, or at least
where there is circulation in the superfluid. We analyze the situation
at absolute zero for simplicity. We must pose ourselves a problem
in which such circulation is necessary and try to find the lowest energy
state. The situation first considered by the author was the slip-stream
between two regions of fluid moving at different velocity, but it is
easier to arrive at the result by considering the problem of helium with
high angular momentum in a cylindrical vessel. Suppose, for example,
the helium at absolute zero is initially under such pressure that it is
solid and is set into rotation, then the pressure is released so that it
liquefies. What is the final state of the helium? We ask then for the
lowest state of a quantity of helium which has a definite, macroscopically
high, total angular momentum.

For a system of given angular momentum the kinetic energy is 
least if the angular velocity $\omega$ is a constant throughout the liquid.
This motion is not rotation free for $\nabla\times\mathbf{v} = 2\omega$. But it is very difficult
for helium to manage a state of local circulation. In fact, without 
high excitation energy, local circulation is impossible. At first one 
might find it hard to see why the liquid cannot simply rotate as a 
rigid body. The energy is then low. But a liquid is not a rigid body. 
A part of it can turn independently of the whole. In a rough way of 
speaking the liquid may be thought of as made up of many quasi- 
independent units of nearly atomic dimensions. Any motion of the 
body can be compounded of motions of the tiny parts. But to set any 
small part into a rotational state requires a high energy because the 
moment of inertia is so small. If only a limited energy is available 
nearly all the “parts” must be frozen out in their ground states. That
is, nearly everywhere the local angular momentum is zero, i.e.,
$\nabla\times\mathbf{v}_s = 0$. It takes energy to create circulation and, furthermore,
we can expect this circulation not to be distributed uniformly throughout
the fluid. The rigid body type of rotation where $\nabla\times\mathbf{v}_s \neq 0$
everywhere is not possible, or if at all, only with an enormous expen- 
diture of energy, an expenditure far higher than that gained by the 
uniform distribution of angular velocity. 

Another possibility that suggests itself is that the liquid, if the an- 
gular momentum is high, is not free of excitations like rotons and 
phonons even though the temperature is at absolute zero. These ex- 
citations could carry the angular momentum. That is, in the language 
of the two fluid model, perhaps there is at $T = 0$ a mixture of superfluid
and normal fluid, with the superfluid component not rotating, and
with the normal fluid carrying all of the angular momentum. The
energy to maintain the normal fluid is sustained by the fact that
if less normal fluid were present, for given angular momentum the
kinetic energy would have to be larger. This turns out, for vessels of
centimeter dimensions turning at about one radian per second, to
be a state of nearly $10^4$ times the energy of a rigid body rotating at
the same angular velocity. Surely nature can find some lower state 
for the helium. 

We know (see 18) that if there is a hole in the liquid, circulation 
can exist. Therefore another solution suggests itself. The liquid cir- 
culates around a hole with constant circulation as in a free vortex 
(familiar from rotation of water around an emptying drain). The velo- 
city varies inversely as the radius, rising to such heights near the cen- 
ter as to be able to maintain the hole free of liquid by centrifugal force. 
Such a solution would be easy to verify in a striking manner by looking 
at the surface of the liquid. Instead of the usual parabola it would be 
the curve of the surface of a free vortex. The energy is still quite a bit 
higher than the rigid body case, because the velocity instead of being 
distributed proportionally to the radius, actually falls as the radius 
increases. Nevertheless it is orders of magnitude below the mixture 
of normal fluid suggested above. 

However, this is still not the lowest possible energy state, and the 
striking experiment will not succeed. To show this we construct a 
lower state. Suppose that the liquid has not only one vortex at the 
center, but several vortices. For example, suppose beside the central 
one there were a number distributed about the circle of radius R/2, 
half that of the vessel R, and all turning the same way. Viewed grossly 
this is like a vortex sheet so the tangential velocity can jump as we 
pass from inside R/2 to outside. Then the velocity can be arranged a 
little more like the linear curve by two sections, each of which is a
$1/r$ curve. The gain in energy resulting from this improved distribution
may more than compensate the energy needed to make the additional 
holes (and, further, the central vortex need not now be so large and 
energetic). 

Continuing in this way with ever more vortices it soon becomes ap- 
parent that the energy can always be reduced if more vortices form. 
However there is a limit. Due to the quantization (18) of the vortex
strength the smallest vortex has circulation $2\pi\hbar m^{-1}$. The lowest
energy results if a large number of minimum strength vortex lines
(which we shall call unit lines) form throughout the fluid at nearly
uniform density. The lines are all parallel to the axis of rotation. Since
the curl of the velocity is the circulation per unit area, and the curl
is $2\omega$, there will be

$2\omega m/2\pi\hbar = 2.1\times10^{3}\,\omega$ lines per cm$^2$ (19)

with $\omega$ in radians per second. For $\omega = 1$ rad per second the lines are
about 0.2 mm apart so that the velocity distribution is practically
uniform.
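
The line density (19) and the resulting spacing follow directly from dividing the curl $2\omega$ by the unit of circulation. The following is a minimal sketch, assuming CGS values for $\hbar$ and the helium-4 mass (not given in the text); the function names are hypothetical.

```python
import math

HBAR = 1.0546e-27   # erg sec (assumed CGS value)
M_HE4 = 6.646e-24   # g, mass of one helium-4 atom (assumed value)


def line_density(omega):
    """Unit vortex lines per cm^2 for angular velocity omega (rad/sec).

    Curl of the velocity (2*omega) divided by the circulation of one
    unit line (2*pi*hbar/m), as in (19).
    """
    return 2.0 * omega / (2.0 * math.pi * HBAR / M_HE4)


def line_spacing(omega):
    """Mean distance between lines in cm, taking one line per b^2."""
    return 1.0 / math.sqrt(line_density(omega))


if __name__ == "__main__":
    print(f"{line_density(1.0):.3g} lines per cm^2 at 1 rad/sec")
    print(f"spacing = {10.0 * line_spacing(1.0):.2f} mm")
```

The small difference from the coefficient in (19) comes only from the rounded value of $\hbar/m$ used in (18).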

Such weak lines will not form actual macroscopic holes. In fact,
if one neglects atomic structure and assumes a classical continuous
liquid with surface tension, a unit line makes a hole opposed by surface
tension which figures out to be only 0.4 Å in radius. That means
that there is no real hole in the liquid. Around such a unit line, for
example a straight one along the z-axis, the wave function off the axis
is roughly

$\psi = \exp\left(i\sum_i\varphi_i\right)\Phi$ (20)

where $\varphi_i$ is the angle of the $i$th atom about the z-axis. This does not hold close to the
axis. On the axis $\exp i\varphi$ is meaningless, and close to it has enormous
gradients. A particle on the axis cannot have angular momentum, yet
(20) implies that each atom has angular momentum $\hbar$, nor can there
be exceptions because the Bose statistics implies that they are equivalent.
Therefore a more accurate expression than (20) would be this
expression multiplied by a factor which is unity except if any one of 
the atoms comes very close to the axis, in which case it falls rapidly 
to zero. The density of fluid falls to zero on the axis. This is the rem- 
nant of the classical hole. Actually quantum mechanically the line 
will not remain perfectly straight in one spot but will have some zero 
point motion of wandering and waving to and fro. 

It is not hard to get a reasonable estimate of the energy contained
in these lines. First consider an isolated unit line along the axis of a
cylinder of length $L$, radius $b$. The velocity at radius $r$ is $\hbar/mr$ and if
$\rho_0$ is the fluid density in atoms per cc ($\rho_0 = 1/45$ Å$^{3}$) the kinetic energy
is the integral

K.E. $= \int\tfrac{1}{2}\rho_0 m(\hbar/mr)^2\cdot 2\pi r\,dr\cdot L$.

The upper limit of the integral is $b$. It diverges at the lower limit, but
within about the atom spacing the velocity formula is meaningless.
Furthermore, inside this radius some of the energy is potential,
required to keep the density down near the axis (that is, to make the
partial “hole”). Therefore the energy needed to form such a line, per unit
length, is

Line energy per unit length $= \rho_0\pi\hbar^2 m^{-1}\ln(b/a)$
$= 10^{-8}\ln(b/a)$ ergs/cm. (21)

Here $a$ is a length of order of the atomic spacing. Its exact determination
would require solving the difficult quantum mechanical problem.
In almost all applications the ratio b/a will be very large, and the 
logarithm large enough to be insensitive to the exact value of a. 
For this reason we will not attempt a detailed evaluation, but simply 
choose $a$ to be close to the atomic spacing. We arbitrarily take $a =
4.0$ Å. In more complicated geometrical situations the lower limit
will be the same, but the upper limit $b$ will be some other characteristic
dimension of the apparatus, or more usually the spacing between
vortex lines, etc. It can be found by integrating the velocity
distribution as determined for the given distribution of singular vortex
lines.
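
The coefficient in (21), and the insensitivity of the result to the choice of $a$, can be checked with a short calculation. This is a minimal sketch, assuming CGS values for $\hbar$ and the helium-4 mass and using $\rho_0$ = 1 atom per 45 Å$^3$ and $a$ = 4.0 Å from the text; the function names are hypothetical.

```python
import math

HBAR = 1.0546e-27        # erg sec (assumed CGS value)
M_HE4 = 6.646e-24        # g, mass of one helium-4 atom (assumed value)
RHO_0 = 1.0 / 45.0e-24   # atoms per cc, i.e. one atom per 45 cubic angstroms
A_CORE = 4.0e-8          # cm, the lower cutoff a = 4.0 angstroms


def line_energy_prefactor():
    """rho_0 * pi * hbar^2 / m in ergs/cm, the coefficient in (21)."""
    return RHO_0 * math.pi * HBAR**2 / M_HE4


def line_energy_per_length(b, a=A_CORE):
    """Energy per unit length (ergs/cm) of a unit line, cutoffs a and b in cm."""
    return line_energy_prefactor() * math.log(b / a)


if __name__ == "__main__":
    print(f"prefactor = {line_energy_prefactor():.2e} ergs/cm")
    for a in (2.0e-8, 4.0e-8, 8.0e-8):
        e = line_energy_per_length(0.02, a)
        print(f"a = {a:.0e} cm  ->  {e:.2e} ergs/cm")
```

Doubling or halving $a$ changes the energy by only about five percent when $b$ is 0.02 cm, which is the sense in which the logarithm is insensitive to the exact cutoff.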

For a cylinder of liquid rotating at angular velocity $\omega = 1$ rad/sec
the vortices are about 0.02 cm apart. This is $0.5\times10^{6}$ times $a$ if
$a = 4$ Å, so we can take the $\ln(b/a)$ in this case to be about $\ln(0.5\times10^{6})$
or 14. Neglecting the variation of this logarithm with $\omega$ we find for the
energy of all of the lines:

Total line energy per unit volume $= \rho_0\omega\hbar\cdot\ln(b/a)$

where we have estimated $\ln(b/a)$ as 14. The ratio of this to the kinetic
energy for a rigid body is $4\hbar m^{-1}R^{-2}\omega^{-1}\ln(b/a)$ if the cylinder radius is $R$.
For $R = 1$ cm, $\omega = 1$ rad/sec this ratio is $10^{-2}$. For macroscopic
laboratory dimensions the excess energy to form the lines is small. They
would form if rotating solid helium is melted by releasing the pres- 
sure, the angular velocity distribution would differ imperceptibly from 
uniformity, and the surface should appear parabolic. 
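
The ratio of line energy to rigid-body kinetic energy quoted above can be reproduced as follows. This is a minimal sketch, assuming CGS values for $\hbar$ and the helium-4 mass, the text's $\rho_0$ and its estimate $\ln(b/a) = 14$; the rigid-body energy is averaged over the cylinder, giving $\tfrac{1}{4}\rho_0 m\omega^2R^2$ per unit volume. Function names are hypothetical.

```python
import math

HBAR = 1.0546e-27        # erg sec (assumed CGS value)
M_HE4 = 6.646e-24        # g, mass of one helium-4 atom (assumed value)
RHO_0 = 1.0 / 45.0e-24   # atoms per cc, one atom per 45 cubic angstroms


def line_energy_per_volume(omega, log_factor=14.0):
    """rho_0 * omega * hbar * ln(b/a), ergs per cc."""
    return RHO_0 * omega * HBAR * log_factor


def rigid_body_energy_per_volume(omega, radius):
    """Mean kinetic energy density of rigid rotation in a cylinder, ergs per cc."""
    return 0.25 * RHO_0 * M_HE4 * omega**2 * radius**2


def energy_ratio(omega, radius, log_factor=14.0):
    """4 * hbar * ln(b/a) / (m * R^2 * omega), the ratio given in the text."""
    return 4.0 * HBAR * log_factor / (M_HE4 * radius**2 * omega)


if __name__ == "__main__":
    print(f"ratio for R = 1 cm, omega = 1 rad/sec: {energy_ratio(1.0, 1.0):.1e}")
```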

It is not self-evident that there is no state of appreciably lower 
energy, and that the energy of the rotating liquid is correctly estimated. 
This subject has not yet been analyzed any more deeply than is re- 
ported here. Therefore this part of the paper is not on as firm a foun- 
dation as the rest. We must therefore still consider it conjectural 
whether the considerations on rotational flow reported here are ac- 
tually correct. It is interesting that all the conclusions were arrived 
at independently by the author without knowledge of Onsager’s pre- 
vious work (with which they are in exact concordance). 


8. Properties of Vortex Lines 


In a situation more general than uniform rotation, in which the curl
of the velocity is not constant, we can imagine a similar picture. At
any instant there are many vortex lines. Some are
closed on themselves in rings, and others terminate with their ends
on the fluid boundaries. Viewed from a continuum approximation 
in which atomic structure is neglected, a velocity v, can be defined 
at every point. The curl of this is zero everywhere, except at one of 
the vortex lines where it is infinite. These lines are real quantized 
vortex lines. The circulation around a small circuit surrounding only
one line is $2\pi\hbar m^{-1}$. The lines have a sense depending on the direction
of rotation. The circulation about any curve whatsoever is given by


$\oint\mathbf{v}_s\cdot d\mathbf{s} = 2\pi\hbar m^{-1}n$


where $n$ is always an integer, being the net number of lines linked by
the circuit, account being taken of the sign of each.

If $\nabla\times\mathbf{v}_s$ is averaged over a large enough region that many lines
are included, the number of lines per cm$^2$ must be at least $\langle\nabla\times\mathbf{v}_s\rangle\,
m/2\pi\hbar$ and the energy of these lines per unit volume is at least


$\tfrac{1}{2}\,|\langle\nabla\times\mathbf{v}_s\rangle|\,\rho_0\hbar\cdot\ln(b/a)$ (22)


where $b$ is the spacing between lines, $1/b^2 = \langle\nabla\times\mathbf{v}_s\rangle\,m/2\pi\hbar$. This
shows that in our liquid it takes energy to create circulation. Actually
in real, complex situations the energy might exceed greatly the value
in (22). There may be great complex activity with many lines twisting
and turning so that several lines of opposite senses are close together.
In this case, the case of developed turbulence, the number of lines
present may be bigger than the average $\nabla\times\mathbf{v}_s$ would indicate.
(Probably in such a case it would be hard to define the average $\nabla\times\mathbf{v}_s$
because the result may depend on the size of the region over which
the average is taken.)

The discussion of the rotating cylinder of liquid with which we in- 
troduced the lines is rather special. We shall try to give a more com- 
plete and general description of the state of the superfluid with cir- 
culation. We continue to study the case at absolute zero. Let us try 
to characterize the state of a fluid in which we desire two things (which,
it will turn out, are mutually incompatible). We want (a) the liquid
to be flowing with a velocity $\mathbf{v}_s$ which is a smooth function of position
without singularities (on a scale of distances large compared to atomic
dimensions) and (b) we want $\nabla\times\mathbf{v}_s$ not to vanish.

Suppose the liquid in an element of volume $\Delta V$ (large compared to
the atomic volume) is moving at velocity $\mathbf{v}$. Then as we have seen the
wave function should depend on the position of the atoms, if they are
within $\Delta V$, as $\exp\left(i\hbar^{-1}m\mathbf{v}\cdot\sum_i\mathbf{R}_i\right)\Phi$. That is, if a number of atoms in the
region are displaced, each by $\Delta\mathbf{R}_i$, from one allowed (by $\Phi$)
configuration to another allowed one, the main effect is that the wave function
must change phase by


$\hbar^{-1}\sum_i m\mathbf{v}\cdot\Delta\mathbf{R}_i$ (23)


This can also be seen in another way. If a region of fluid can be
considered to have a velocity $\mathbf{v}$ it has a momentum density $\rho_0 m\mathbf{v}$. It is
characteristic of momentum in quantum mechanics, that if the center
of mass is changed the wave function changes phase by an amount
proportional to the momentum and to the displacement of the center
of mass. Now if the atoms are displaced by $\Delta\mathbf{R}_i$ the center of mass
moves so the phase change (23) results. This is true at least if the
displacement makes no other important change in the wave function.
We will suppose that both before and after the displacement the atoms
are well spaced and there are no gross density fluctuations, etc. such
that in case the liquid were not in motion both configurations would
have essentially the same amplitude.

The same argument goes for atoms in other regions, etc. so the phase
shifts accumulate to a sum in (23) over displacements of atoms all over
the liquid, if $\mathbf{v}$ is now considered as a function of $\mathbf{R}_i$. The displacements
$\Delta\mathbf{R}$ must be small compared to the distances over which $\mathbf{v}$ varies. We
shall apply the formula in a case in which $\Delta\mathbf{R}$ is the separation between
atoms.

[Fig. 5. The wave function must ... permutation. If all the atoms are
displaced ... of $2\pi$. (Caption only partly legible in the source.)]

Select, in a given configuration, a very long closed chain of atoms 
each of which is a nearest neighbor of the next in line (see Fig. 5). 
The last should have the first as nearest neighbor. The chain may 
consist of very large numbers of atoms and may even be so long that 
it passes through regions of varying velocity. Consider a displacement 
of each atom to its nearest neighbor next in line. The wave function 
cannot change, for it is simply a special permutation of the atoms. 
Further we will suppose that if all the displacements are made together
a little at a time, each intermediate configuration is allowed. This
sliding of the chain along itself is not prevented by potential barriers, 
especially if we allow small temporary displacement of other atoms 
adjacent to the ring to permit passage in tight places. In the final 
configuration all atoms have returned to their original positions,
except those of the ring which have moved one over. We suppose,
because of the ease with which the displacement can be made, that we can
assume the wave function does not vanish for any intermediate
position during the displacement. Then its phase shift is given by (23),
but this must represent no change in the wave function. It is therefore
necessarily an integral multiple of $2\pi$. We conclude that


$\oint\mathbf{v}_s\cdot d\mathbf{s} = 2\pi\hbar m^{-1}n$ (24)


where $n$ is an integer and the integral is taken over any path which
goes from one atom to the next neighbor, etc. If $\mathbf{v}_s$ is now assumed
continuous at an atomic scale, the path can be smoothed out to any
continuous curve. Of course, if $\mathbf{v}_s$ is continuous and free of singularities,
it is impossible that (24) holds for all continuous paths with $n$ an
integer (depending on the path) unless $n = 0$ (in a simply connected
region). This is because any path can be deformed continuously into an
infinitesimal path, the left side changing continuously to zero. The
right side cannot change continuously so it must be zero for all paths.
Likewise for a toroidal region $n$ must be the same for all paths which
surround the hole.

We see therefore that $\mathbf{v}_s$ cannot be continuous if we are to have
circulation. There must be places where $\mathbf{v}_s$ is discontinuous, and places
in the fluid where a displacement of an atom to its neighbor may not
be possible without passing through a node in the wave function. In
the neighborhood of such a node the probability of finding an atom 
is reduced. This decrease in density requires energy to maintain it. 
We shall therefore try to arrange conditions so that such places are 
as infrequent as possible. Under those conditions, for nearly every 
conceivable ring of atoms the atoms can be moved over to the next 
adjacent atom without the wave function vanishing. Its phase change
must be a multiple of $2\pi$. If two adjacent rings have a phase change
which is different, differing by $2\pi$ say, then between them somewhere
must lie a very small ring of three or four atoms for which the
circulation is $2\pi\hbar/m$. For example, suppose for a certain ring A the phase
is zero, but for a nearby ring B it is $2\pi$. Then shift ring B by a few
atoms at a time until it gets as close to ring A as possible, but still
has phase shift $2\pi$. Likewise shift A until it is as close to B as possible
but so that it still has shift 0. Then A and B will contain many atoms
in common and only differ by a few, as illustrated in Fig. 6. Then


Fig. 6. A displacement along ring B followed by a reverse displacement along
an adjacent ring A with many atoms in common is equivalent to a displacement
around the ring C, indicated by black circles (except for an inconsequential
permutation a-b).


consider a permutation consisting of shifting B forward, then shifting 
A backward. It is readily verified that this change is the same as a 
shift of atoms around the very small ring C consisting of those parts 
of A and B which are not common, plus one of the common atoms. 
But the change in phase is $2\pi$ when B shifts and 0 when A shifts back,
so that it must be $2\pi$ for the very small ring C. * This represents a
highly concentrated angular momentum. Somewhere in the middle 
of ring C is a nodal point. It is readily appreciated geometrically that 
these nodal points must essentially form lines through the fluid. They 
are quantized vortex lines. It must be admitted that this argument is 
far from complete. We should consider states in which the location 
of the vortex line is uncertain, that is, a superposition of states with 
various locations for the line. Such a state would have a lower energy. 
Possibly we make a serious error in imagining that the velocity can
be defined right up to atomic distance from the axis, if this axis itself
does not have a definite location. Onsager has remarked, in private
communication, on the possibility that these quantum effects might
lower the energy to such an extent that the logarithm in (21) should
be absent. At any rate, although our energy estimates may be incorrect,
quantized vortex lines probably exist. We continue our discussion
of the consequences of this assumption.


* The change in phase cannot be determined only from the initial and final
configuration, but requires a description of the amplitude for intermediate
configurations as well. Therefore this argument is not complete unless it is also
assumed that partial rotations of C consisting of displacements of less than one
atom spacing can also be roughly imitated by partial displacements of B and A.

On a large scale according to the theorem of Helmholtz, vorticity 
moves with the fluid in such a way that the strength of a vortex fila- 
ment remains constant. This means that if the fluid drifts the lines 
drift with it, maintaining their quantized strength. This is true, at 
least, if no forces act directly on the vortex line. In general the force
per unit length on a vortex line equals the density, $\rho_0 m$, times the
vector cross product of the circulation, $2\pi\hbar m^{-1}$, and the velocity of
fluid where the vortex is.


9. Critical Velocity and Flow Resistance 


We next turn to the role such vortex lines may play in the resistance 
to flow found at sufficiently high velocities. 
We have suggested that this resistance 
cannot be understood in terms of a direct 
creation of rotons, the superfluid other- 
wise being in perfect flow. Let us consider 
what would happen if liquid is flowing out 
of an orifice, or tube, into a reservoir of 
fluid at rest. In Fig. 7 is illustrated the 
distribution of flow for irrotational motion. 
A very high velocity develops near the 
corners and large accelerations develop 
there. An ordinary fluid, such as water, 
flows in a complicated manner such as 
illustrated in Fig. 8 (a few moments after flow starts). The water 
shoots out straight into the nearly still fluid in the reservoir, forming 
a vortex sheet, which is unstable and curls around, eventually in an 
extremely complex manner. Let us see how helium might try to imi- 
tate some of the features of the type of flow illustrated in Fig. 8. 
Just for rough orientation and estimate suppose the fluid tries to go 
out in a jet, let us say at first of the same width and velocity as in 
the tube. Take the case that the tube is a long slot perpendicular to 
the paper, and the flow is roughly two dimensional. 

Fig. 7. Ideal potential flow from an orifice.

Then circulation is implied, for the velocity is $v$ in the jet and 0
outside. This requires the formation of vortex lines, perhaps as
illustrated in Fig. 9. The spacing is $x$ and if this is small compared to $d$,
the slot width, the velocity distribution is roughly uniform inside the
jet. Taking a line integral along the jet for unit distance, and
returning outside the jet, the circulation is $v$ so the number of lines per
centimeter is


$1/x = v/2\pi\hbar m^{-1}$.

It takes energy to form these lines. If there is not enough kinetic
energy in the fluid to supply the energy to make the lines, no resistance
will appear. Once the lines can be formed they are, in a manner we shall
soon discuss, ultimately dissipated as heat and a resistance appears. Let
us see what order of critical velocity we would estimate in this way. The
lines move out at the velocity of the fluid at their own location, which
is $v/2$. Another way to see the necessity for this is to realize that as
the fluid passes from inside to outside the pipe vorticity is created, so
new lines must continually come rolling out of the ends of the orifice.
In our case $v/x$ lines are created per second. The energy needed to
create these is (per unit length of slot)


$\dfrac{v^2}{2\pi\hbar m^{-1}}\,\rho_0\pi\hbar^2 m^{-1}\ln(d/a)$


where the argument in the logarithm is only approximate. The total
kinetic energy available per cc of fluid is $\tfrac{1}{2}m\rho_0 v^2$ so that per
second $\tfrac{1}{2}m\rho_0 v^3 d$ is available. If we define $v_c$ as that velocity for
which the energy available is just large enough to create the vortices
we find


$v_c d = \hbar m^{-1}\ln(d/a)$.


[Fig. 8. Real flow from an orifice for ordinary liquids, producing ...
vortex ... (Caption only partly legible in the source.)]

Fig. 9. Idealization of supposed vortex rings formed when superfluid
helium issues at high speed from an orifice.


For example, for a slit of width $d = 10^{-5}$ cm (which is about three
times the width of a Rollin film at a height of 1 cm) this gives $v_c =
100$ cm/sec, if $\ln(d/a)$ is taken as 6. This is somewhat higher than the
critical velocities observed. The calculation is only meant as an esti- 
mate because the actual situation must be complicated. For one thing, 
near the critical condition x comes out about 3d so our picture of a 
uniform jet is poor. Further, the velocity in the jet must of course 
be reduced as a result of the energy needed to form the vortex line. 
Actually probably the situation near the critical point must be very 
complicated and irregular. The flow for short momentary periods may 
be much like Fig. 7 but irregularly vortex lines peel off of the edges 
of the slit, probably starting at one point along the slit and progressing 
to other places, or perhaps if the hole is circular, one or two vortex
lines are fed out continuously in a form roughly like a helix. It is
predicted that very close to the critical velocity when loss just begins,
the resistance will be irregular and show fluctuations. These fluctua- 
tions are very small however and would be hard to detect. Possibly 
some sound may be generated by the flow irregularities. It is difficult 
to estimate its intensity. When helium is driven, just above critical 
velocity, through an emery powder superleak, some noise should be 
generated as the various vortex lines suddenly form and pass into the 
stream. The irregularities are a result of the unpredictable quantum 
transitions between states of no vortex line and one with a section 
of line. 
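
The critical velocity estimate $v_c d = \hbar m^{-1}\ln(d/a)$ can be evaluated for the slit discussed above. This is a minimal sketch, assuming CGS values for $\hbar$ and the helium-4 mass (not given in the text) and the cutoff $a$ = 4.0 Å chosen earlier; the function name is hypothetical.

```python
import math

HBAR = 1.0546e-27   # erg sec (assumed CGS value)
M_HE4 = 6.646e-24   # g, mass of one helium-4 atom (assumed value)
A_CORE = 4.0e-8     # cm, the cutoff a = 4.0 angstroms


def critical_velocity(d, a=A_CORE, log_factor=None):
    """Estimate v_c in cm/sec from v_c * d = (hbar/m) * ln(d/a).

    d          : slit width in cm
    log_factor : optional fixed value for ln(d/a); computed from d and a
                 when omitted
    """
    if log_factor is None:
        log_factor = math.log(d / a)
    return HBAR * log_factor / (M_HE4 * d)


if __name__ == "__main__":
    print(f"d = 1e-5 cm, ln = 6: v_c = {critical_velocity(1e-5, log_factor=6.0):.0f} cm/sec")
    for d in (1e-6, 1e-5, 1e-4):
        print(f"d = {d:.0e} cm: v_c = {critical_velocity(d):.0f} cm/sec")
```

Because $v_c$ varies roughly as $1/d$, with only a slowly changing logarithm, narrower slits give higher estimated critical velocities.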

Another possible source of vortex lines is the contact between 
flowing liquid and the walls. It is not necessary that all the loss occurs 
at the exit end of the tube. The walls of the pipe are irregular. Vortex 
lines may be created inside the pipe also. 

It is difficult to go beyond this order of magnitude calculation in 
describing the conditions controlling the production of vortex lines. 
For example, if one studies the example given there are serious diffi- 
culties. As a particular vortex line leaves the end of the tube there are 
very great forces trying to pull it back resulting from its image in the 
tube wall. Let us imagine a line a distance $b$ above the wall in a tube
in which the velocity of flow is $v_s$. It is readily shown that the
forces acting on the line are these. First a force pulling away from the
surface of strength $2\pi\hbar\rho_0 v_s$. Second, from the image, an attraction to
the wall of strength $\pi\hbar^2\rho_0/mb$. A vortex line responds to forces by
moving through the liquid to reduce the net force to zero. In this case
it would drift upstream if the attraction is highest. But a vortex line
will interact with the wall, especially at its ends which go into the wall
surface. Suppose this results in a frictional force which keeps the line
from moving upstream. Then the response is to move closer to the
wall. The vortex only moves away from the wall if $2\pi\hbar\rho_0 v$ exceeds
$\pi\hbar^2\rho_0/mb$. Even if $v$ is 100 cm/sec this requires $b$ to exceed $10^{-6}$ cm
or 20 atomic spacings. We might expect a vortex line to fluctuate
away from the surface by a few atomic diameters. But how can we 
expect to penetrate the enormous potential barrier, to create a line 
so far away from the surface that the flow velocity can pull it further 
out and create eventual vorticity and energy loss? 
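
The balance between the two forces can be made explicit: setting $2\pi\hbar\rho_0 v = \pi\hbar^2\rho_0/mb$ gives a minimum distance $b = \hbar/2mv$. The following is a minimal sketch, assuming CGS values for $\hbar$ and the helium-4 mass, the text's $\rho_0$, and an atomic spacing of about 4 Å for the conversion to spacings; function names are hypothetical.

```python
import math

HBAR = 1.0546e-27        # erg sec (assumed CGS value)
M_HE4 = 6.646e-24        # g, mass of one helium-4 atom (assumed value)
RHO_0 = 1.0 / 45.0e-24   # atoms per cc, one atom per 45 cubic angstroms
SPACING = 4.0e-8         # cm, assumed atomic spacing of about 4 angstroms


def outward_force(v):
    """Force per unit length pulling the line from the wall: 2*pi*hbar*rho_0*v."""
    return 2.0 * math.pi * HBAR * RHO_0 * v


def image_force(b):
    """Attraction per unit length toward the wall: pi*hbar^2*rho_0/(m*b)."""
    return math.pi * HBAR**2 * RHO_0 / (M_HE4 * b)


def escape_distance(v):
    """Distance b (cm) beyond which the outward force exceeds the attraction."""
    return HBAR / (2.0 * M_HE4 * v)


if __name__ == "__main__":
    b = escape_distance(100.0)
    print(f"b = {b:.1e} cm, about {b / SPACING:.0f} atomic spacings")
```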

More likely a line gets started somehow and has its ends tied on the 
wall. Then the forces of the fluid on the rest of the line cause it to 
wander about in such a way that more and more vortex line is fed 
out. It is not necessary to create bodily at one instant a complete 
section of line. For example, for the case of liquid issuing from a tube 
perhaps the vortex lines are helices with contact points at the edge 
of the hole which turn round and round while the helix moves out- 
ward. Similar things could happen inside tubes. If the tubes are very 
narrow the line will hit the other surface easily and be attracted by 
the walls. It can never get very far from a wall. Even if started
somehow it will fall back into the tube walls unless the velocity $v_s$
suffices to keep it in the stream. Therefore the smallest tubes have
the highest critical velocities.


10. Turbulence 


The patterns of vortex lines which we have studied are well known 
to be unstable. In the case of the rotating cylinder this is not true if 
the cylindrical vessel containing the helium rotates also. But if the 
container is stopped the situation is altered. There are forces between 
the wall and vortex lines. (This is because the fluid density is altered 
near the line axis, so the interaction with the wall is not the same as 
the average for the rest of the helium). The lines at the outside drag 
past the stationary wall and as a result get distorted from their ori- 
ginal vertical line position. This twists others, etc. Lines fall into the 
wall and others twist about each other in a complex way. It would 
be interesting to study this experimentally, to see how fast, and in 
what manner, the liquid eventually slows down. 

In ordinary fluids flowing rapidly and with very low viscosity the
phenomenon of turbulence sets in. A motion involving vorticity is
unstable. The vortex lines twist about in an ever more complex
fashion, increasing their length at the expense of the kinetic energy of
the main stream. That is, if a liquid is flowing at a uniform velocity 
and a vortex line is started somewhere upstream, this line is twisted 
into a long complex tangle further down stream. To the uniform velo- 
city is added a complex irregular velocity field. The energy for this 
is supplied by pressure head. 

We may imagine that similar things happen in the helium. Except 
for distances of a few angstroms from the core of the vortex, the 
laws obeyed are those of classical hydrodynamics. A single line playing 
out from points in the wall upstream (both ends of the line terminate 
on the wall, of course) can soon fill the tube with a tangle of line. The 
energy needed to form the extra length of line is supplied by a pressure 
head. (The force that the pressure head exerts on the lines acts even- 
tually on the walls through the interaction of the lines with the walls). 
The resistance to flow somewhat above critical velocity must be the 
analogue in superfluid helium of turbulence, and a close analogue at 
that. 

There are some ways, however, in which the two cases differ. In 
a classical fluid there is a thin boundary layer near the wall of the pipe 
in which viscosity controls the situation. In this boundary layer there 
is a large vorticity, but it escapes into the stream to be amplified, only 
from the edge of the layer. Inside it is damped by viscosity. As the 
stream velocity falls the boundary layer thickens, for the amplification 
is less and the damping overpowers it ever further from the wall. 
Below a critical velocity the turbulence ceases altogether and the 
flow is laminar, but with vorticity, the viscosity keeping the vorticity 
from amplifying itself. That is, viscosity is the mechanism which deter- 
mines whether vorticity will be amplified or not, and therefore whether 
turbulence is produced. If the viscosity goes to zero as a limit (and no 
other physical phenomena are added) a classical ideal liquid would 
exhibit turbulence at any velocity, no matter how small. 

Superfluid helium is an ideal fluid of zero viscosity. It does not
exhibit turbulence at low velocity because of another, quantum
mechanical, effect. The vorticity is quantized and cannot begin with as
low an amount as desired. One must supply energy enough to get the
first one or two vortex lines started before the amplification process
of turbulence can take over. There will not be a boundary layer with a
structure analogous to that in classical flow (although near the walls
the flow will be somewhat different because of the dragging forces
between the moving vortex lines and the wall).

In a classical fluid, if the turbulent stream empties into a reservoir,
the turbulent motion continues for a while, but as a result of the vis- 
cosity, it gradually slows up and dies out, the energy appearing even- 
tually as heat. 

What happens to a turbulent mass of superfluid left to itself? If 
there is normal fluid present the rotons and phonons will collide with
the vortex lines and take energy from them, gradually turning this 
energy gain into more rotons and phonons (as a result of collisions among 
rotons the number of these may change). But an interesting question 
arises if the experiment is imagined at absolute zero. What can even- 
tually become of the kinetic energy of the vortex lines? 


Fig. 10. A vortex ring (a) can break up into smaller rings if the
transition between states (b) and (c) is allowed when the separation of
vortex lines becomes of atomic dimensions. The eventual small rings (d)
may be identical to rotons.


One possibility that suggests itself is this. Consider a large distorted 
ring vortex (Fig. 10a). If, in a place, two oppositely directed sections
of line approach closely, the situation is unstable, and the lines twist 
about each other in a complicated fashion, eventually coming very 
close; in places, nearly within an atomic spacing. Consider two such 
lines (Fig. 10b). With a small rearrangement, the lines (which are under 
tension) may snap together and join connections a new way to form 
two loops (Fig. 10c). Energy released this way goes into further
twisting and winding of the new loops. This continues until the single
loop has become chopped into a very large number of small loops
(Fig. 10d). The smallest ring vortex that can exist must have a radius
about half the atomic spacing. Let us guess that this is in fact a
roton. Then all the energy of the vortex will eventually end by forming
large numbers of rotons, that is, heat. Perhaps eventually it will be
easier to understand the details of the complete transformation of
organized flow energy into disorganized heat energy for liquid helium
than for other substances.


11. Rotons as Ring Vortices 


It is not unreasonable to guess that these smallest vortices are
rotons. The velocity distribution around a roton, which is found by
analytic means (ref. 8), is similar to that around a vortex ring. It is
quite reasonable that a vortex ring can be only so small. To increase
the curvature of a vortex line beyond that of radius roughly $a$ may
take energy. Let us imagine a roton to be the circular quantized vortex
of lowest energy. A large circular vortex has (from (21)) energy

$$E = 2\pi^2 \rho_s \frac{\hbar^2}{m} R \ln(R/a).$$

It carries momentum $p = \pi R^2 \cdot 2\pi\hbar\rho_s$. This momentum
is that of a roton, $p_0$, if $R = 2.2$ Å. The energy is the right
order (it corresponds to replacing $\ln(R/a)$ by 1.6).
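The radius quoted above follows from equating the classical ring momentum, $2\pi^2\hbar\rho R^2$ (circulation $2\pi\hbar/m$ times mass density times ring area), to the roton momentum $\hbar k_0$, so that $\hbar$ cancels. The sketch below does the arithmetic; the number density and the value of $k_0$ are assumptions chosen for illustration, since this section does not state which values were used.

```python
import math

# Assumed illustrative inputs, not values stated in the text.
RHO_PER_A3 = 0.0218   # helium number density, atoms per cubic angstrom
K0_PER_A = 2.0        # roton wave number p0/hbar, inverse angstroms


def ring_radius_angstrom(k0_per_a, rho_per_a3):
    """Solve hbar*k0 = 2*pi**2*hbar*rho*R**2 for R (hbar cancels)."""
    if k0_per_a <= 0 or rho_per_a3 <= 0:
        raise ValueError("inputs must be positive")
    return math.sqrt(k0_per_a / (2.0 * math.pi**2 * rho_per_a3))


R = ring_radius_angstrom(K0_PER_A, RHO_PER_A3)
print(f"R = {R:.2f} angstrom")
```

The result is close to the 2.2 Å of the text; since $R$ varies only as the square root of both inputs, the exact choice matters little.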

One might object that such a vortex drifts through the fluid, at
velocity $v = (\hbar/2mR)\ln(R/a)$, so one would expect rotons not to
have a zero group velocity. Actually this drift, of a large vortex, has
its seat in the force tending to shrink the vortex to decrease the
energy of the line. The response to the radially directed force is a
perpendicular motion. It is analogous to the ornery response of a
gyroscope. In fact, if a vortex line were a thin flexible mechanical
tube with inertia, and were started with zero forward motion, it would
first fall in a bit and then move forward in a halting fashion, like
the nutation of a gyroscope, or the motion of an electron in crossed
magnetic and electric fields. In a roton we imagine that the forces
tending to contract the ring are already opposed by a kind of stiffness
of the ring. It is already as small as possible. No drift motion
results. In fact forward drift would expand it and raise the energy,
while reverse drift would try to compress it to smaller size, again
raising the energy. The lowest energy is at zero drift velocity. We may
notice in passing that they can only drift in a direction perpendicular
to their plane, that is, along, or opposite, the direction of the
momentum. This agrees with a property derived for rotons from their
energy-momentum relation (1), that the group velocity
$\partial E(p)/\partial p$ is in the direction of the momentum (or
opposite).

Having travelled so far making one unverified conjecture upon
another we may have strayed very far from the truth. However
imprudent it may be, there is one further observation we would like to
make. A detailed picture is not available which describes physically
just what goes on as the transition is approached from below. The free
energy expression arising from (6) does not of itself describe the
transition. The transition occurs when the number of rotons is very
large. Some sort of interaction may occur between them, or there may be
some limitation to the degrees of freedom.¹⁸ There is no doubt that
it is the analogue of the transition in the ideal gas, but it would be
nice if we could get a less mathematical and formal description of
the events. Of the following I am not sure, but it does seem to be an
interesting possibility.

If rotons are the smallest ring vortices, and those of lowest energy,
$\Delta$, then there are states of higher energy corresponding to
larger rings. For example, a ring of twice the diameter may have twice
the energy more or less. The relative number of these will be expected
to be very low, however. Since $\Delta/k$ is 9.6°K, at the transition
$\exp(-\Delta/kT)$ is $10^{-2}$, so very few larger vortices will be
expected in equilibrium. Certainly none whose length is $10^2$ or
$10^3$ atoms! This neglects an important feature, however. For a long
line there are an enormous number of shapes and orientations available.
Such a line is not infinitely flexible, of course, for the curvature
cannot well exceed $a^{-1}$. It may be likened to a chain of a finite
number of links. Adding one link requires an energy $\varepsilon$, say
of order $\Delta$, but increases the number of orientations by some
factor, asymptotically, say $s$. In equilibrium then, the number of
chains of $n+1$ links is a factor $s\exp(-\varepsilon/kT)$ times the
number with $n$ links. For low temperatures this is less than unity. No
long chains are important. The excitations consist of rotons and a few
other rings of slightly larger size. As the temperature rises, however,
there comes a time when the factor $s\exp(-\varepsilon/kT)$ exceeds
unity. Then suddenly the rings of very largest length are of
importance. The state with one vortex line (or a very few) which winds
and winds throughout the liquid like a near approximation to a Jordan
curve, is no longer of negligible weight. The superfluid is pierced
through and through with vortex line. We are describing the disorder of
Helium I. At first the curve doesn't make full use of all of its
orientations and higher entropy. But as the temperature rises a little
more it squeezes into the last corners and pockets of superfluid until
it has no more degrees of flexibility available. The specific heat
curve drops off from the transition to a smooth curve and the memory of
the possibility that helium can exhibit quantum properties in a unique
way is lost in the profusion of states and in disorder, as it is for
more usual liquids.
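The chain argument fixes a transition temperature once $s$ and $\varepsilon$ are chosen: the growth factor $s\exp(-\varepsilon/kT)$ crosses unity at $kT = \varepsilon/\ln s$. The sketch below is an illustration only. It takes $\varepsilon/k = 9.6$°K (the text says only "of order $\Delta$") and a transition near 2.19°K, then asks what orientation factor $s$ those two assumptions would imply.

```python
import math


def growth_factor(s, eps_over_k, temperature):
    """Ratio of the number of (n+1)-link chains to n-link chains."""
    return s * math.exp(-eps_over_k / temperature)


def transition_temperature(s, eps_over_k):
    """Temperature at which the growth factor equals one."""
    if s <= 1:
        raise ValueError("s must exceed 1 for a finite transition temperature")
    return eps_over_k / math.log(s)


def implied_orientation_factor(eps_over_k, t_transition):
    """Value of s that puts the transition at the given temperature."""
    return math.exp(eps_over_k / t_transition)


# Assumed inputs: eps/k = 9.6 K and a transition near 2.19 K.
s = implied_orientation_factor(9.6, 2.19)
print(f"implied s = {s:.0f}")
print(f"growth factor at 1.5 K: {growth_factor(s, 9.6, 1.5):.2f}")
print(f"growth factor at 2.5 K: {growth_factor(s, 9.6, 2.5):.2f}")
```

The point of the exercise is the sharpness of the crossover: because the factor is raised to the power of the chain length, long chains pass from negligible to dominant over a narrow temperature range.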

A wave function previously used to represent an excitation (phonon or roton) in liquid helium, inserted
into a variational principle for the energy, gave an energy-momentum curve having the qualitative shape
suggested by Landau; but the value computed for the minimum energy $\Delta$ of a roton was 19.1°K, while
thermodynamic data require $\Delta = 9.6$°K. A new wave function is proposed here. The new value computed for
$\Delta$ is 11.5°K. Qualitatively, the wave function suggests that the roton is a kind of quantum-mechanical
analog of a microscopic vortex ring, of diameter about equal to the atomic spacing. A forward motion of
single atoms through the center of the ring is accompanied by a dipole distribution of returning flow far
from the ring.


In the computation both the two-atom and three-atom correlation functions appear. The former is known 
from x-rays, while for the latter the Kirkwood approximation of a product of three two-atom correlation 
functions is used. A method is developed to estimate and correct for most of the error caused by this 
approximation, so that the residual uncertainty due to this source is negligible. 


1. INTRODUCTION


LIQUID helium undergoes a thermodynamic transition at 2.19°K. Below
this temperature, many of the properties of the liquid are explained by
Tisza's phenomenological two-fluid model. Landau realized that the
macroscopic properties of the liquid would resemble those of a mixture
of two fluids, provided that a certain form is assumed for the
energy-momentum curve of the elementary excitations in the liquid.
Starting from first principles, one of the authors has recently
computed an energy-momentum curve which is based on certain ideas about
the nature of the wave functions representing the excitations.¹ The
shape of the curve is in qualitative agreement with Landau's, but some
serious quantitative discrepancies exist. The ideas of III are pursued
further here, and a more complicated wave function is constructed to
represent an excitation. The energy-momentum curve computed with this
wave function will prove to be in better agreement with Landau's. In
addition to the actual computations, we discuss some approximate
methods which may be useful in other work of this sort.

* This paper is based on a Ph.D. dissertation submitted to
California Institute of Technology in November, 1955. A preliminary
report of this work has appeared [R. P. Feynman and Michael Cohen,
Progr. Theoret. Phys. Japan 14, 261 (1955)].

¹ R. P. Feynman, Phys. Rev. 94, 262 (1954), henceforth referred
to as III.


2. LANDAU’S SPECTRUM 


The energy-momentum curve proposed by Landau² rises linearly for
small $p$, passes through a maximum, falls to a minimum, and rises
steeply for large $p$. (See Fig. 6.) The excitations in the linear
region are quantized sound waves (phonons); their energy, measured
relative to the ground-state energy, is

$$E(p) = cp, \tag{1}$$

where $c$ is the velocity of sound (240 m/sec). Near its minimum, the
spectrum can be approximated by a parabola,

$$E(p) = \Delta + (p - p_0)^2/2\mu. \tag{2}$$


Landau believed that excitations in this region represent 
some kind of rotation of the fluid, and called them 
“rotons.” In the present paper we are led to the picture 
of a roton as the closest quantum-mechanical analog 
of a smoke ring. The remaining portions of the spectrum 
are not excited at low temperatures. For $T<2$°K the phonons and
rotons are present in sufficiently small numbers to allow them to be
treated for thermodynamic purposes as noninteracting. The thermodynamic
functions can then be computed; Landau fitted the available (1947) data
on specific heat and second-sound velocity with the values

$$\Delta/\kappa = 9.6^\circ\mathrm{K}, \quad p_0/\hbar = 1.95\ \text{Å}^{-1}, \quad \mu = 0.77\, m_{\mathrm{He}}. \tag{3}$$

More recent measurements⁴ of the velocity of second sound down to
$T = 0.015$°K suggest the values

$$\Delta/\kappa = 9.6^\circ\mathrm{K}, \quad p_0/\hbar = 2.30\ \text{Å}^{-1}, \quad \mu = 0.40\, m_{\mathrm{He}}, \tag{4}$$

although the values (3) also fit the data quite well. The value of
$\Delta/\kappa$ is quite well determined⁵ by the thermodynamic data,
since it enters formulas in the form $\exp(-\Delta/\kappa T)$. The
differences between (3) and (4) are probably a fair measure of the
uncertainty in our knowledge of $p_0$ and $\mu$.

² L. Landau, J. Phys. (U.S.S.R.) 5, 71 (1941).
³ L. Landau, J. Phys. (U.S.S.R.) 11, 91 (1947).

Fig. 1. The liquid structure factor $S(k)$, based on the x-ray
scattering data of Reekie and Hutchison. The principal maximum
corresponds to a wavelength equal to the nearest neighbor distance in
helium. Appendix A describes modifications we have made in the data.

The reasoning which led Landau to the general form of the spectrum,
and his method of deducing the two-fluid picture from the spectrum,
will not be reviewed here. He did not attempt to compute the values of
$\Delta$, $p_0$, and $\mu$ from first principles.

3. A SIMPLE WAVE FUNCTION FOR THE EXCITATIONS


In III a wave function of the form $\psi = \phi \sum_i f(\mathbf{r}_i)$
is proposed to represent an excitation. The physical reasons for this
wave function will not be reviewed here. The sum runs over all the
atoms in the liquid, and $\phi$ is the wave function for the liquid in
its ground state. The requirement that $\psi$ be an eigenfunction of
the total momentum operator⁶ $\mathbf{P} = -i\hbar \sum_i \nabla_i$
corresponding to the eigenvalue $\hbar\mathbf{k}$ implies that
$f(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}$, and thus

$$\psi = \phi \sum_i \exp(i\mathbf{k}\cdot\mathbf{r}_i). \tag{5}$$

Substitution of (5) into the variational principle

$$E = \mathcal{E}/\mathcal{I}, \tag{6}$$

$$\mathcal{E} = \int \psi^* H \psi \, d^N r \tag{7}$$

and

$$\mathcal{I} = \int \psi^* \psi \, d^N r, \tag{8}$$

gives an upper limit⁷ for the energy of the lowest excitation of
momentum $\hbar\mathbf{k}$. The result is

$$E(k) = \hbar^2 k^2 / 2mS(k), \tag{9}$$

where $S(k)$ is the Fourier transform of the zero-temperature two-atom
correlation function $p(r)$,

$$S(k) = \int e^{i\mathbf{k}\cdot\mathbf{r}} p(r)\, d\mathbf{r}. \tag{10}$$

The data which we have used for $S(k)$ are given in Fig. 1 and are
essentially those obtained from x-ray diffraction by Reekie and
Hutchison.⁸,⁹ Figure 2 is the corresponding curve for $p(r)$. $S(k)$
exhibits a sharp maximum near $k = 2$ Å⁻¹, which corresponds to a
wavelength equal to the nearest neighbor distance in the liquid.
Accordingly, the spectrum (9) exhibits a minimum at approximately the
correct wave number (see curve B of Fig. 6). It is shown in III that
the wave function (5) is exact for phonons (small $k$) and that the
limiting form of (9) is $E(k) = \hbar ck$. The occurrence of a minimum
at $k = 2$ Å⁻¹ is in qualitative agreement with Landau's predictions,
but the value of $\Delta/\kappa$ computed from (9) is 19.1°K, which is
twice the value given by experiment.

⁴ de Klerk, Hudson, and Pellam, Phys. Rev. 93, 28 (1954).

⁵ Dr. J. R. Pellam (private communication) estimates the uncertainty
in $\Delta/\kappa$ to be less than 0.2°.

⁶ If the liquid were confined to a box of side $L$, with fixed walls,
then the walls could absorb momentum and the energy eigenstates would
not be momentum eigenstates. Instead, we control the density by
requiring the wave function to be periodic in all variables with period
$L$. With this boundary condition, $\mathbf{P}$ commutes with $H$ and
the energy eigenstates can also be taken as momentum eigenstates.

⁷ Eigenfunctions of $\mathbf{P}$ belonging to different eigenvalues
$\hbar\mathbf{k}$ are orthogonal. Hence, for different $\mathbf{k}$,
the trial functions (5) are orthogonal to each other and also to the
true wave functions which minimize (6). The entire spectrum $E(k)$
therefore lies above the true spectrum. In footnote 3 of III it is
mentioned that the wave function
$\phi \exp(iN^{-1}\mathbf{k}\cdot\sum_i \mathbf{r}_i)$, which
represents translational motion of the whole liquid, has momentum
$\hbar\mathbf{k}$ and energy $\hbar^2k^2/2mN$, which is certainly lower
than any energy we shall compute from (5). The periodic boundary
condition, however, rules out such states unless $k$ is as large as
$N^{2/3}$.

⁸ J. Reekie and T. S. Hutchison, Phys. Rev. 92, 827 (1953). Their
paper contains a curve for $r^2p(r)$, but does not include their data
on $S(k)$. We are indebted to Dr. Reekie for sending us the data, which
are now generally available in reference 9. Appendix A contains a
discussion of some changes which we have made in the data.

⁹ L. Goldstein and J. Reekie, Phys. Rev. 98, 857 (1955).


4. ARGUMENTS FOR A NEW WAVE FUNCTION


The excitation (5) can be localized in a definite region by the
formation of a wave packet. If $h(\mathbf{r})$ is a function, like a
Gaussian, which is peaked about some location in the liquid and falls
off smoothly in a distance large compared with $2\pi/k$ but small
compared with the size of the box, then the wave function

$$\psi = \sum_i h(\mathbf{r}_i) \exp(i\mathbf{k}\cdot\mathbf{r}_i)\,\phi \tag{11}$$

represents a localized excitation. The packet will spread in time, and
will drift with velocity $\hbar^{-1}\nabla_{\mathbf{k}}E(k)$. The
current and density associated with (11) were computed in III. The
number density is very close to the average density $\rho_0$, even in
the region of the packet, and the current at a point $\mathbf{a}$ is
$\mathbf{j}(\mathbf{a}) = \hbar\mathbf{k}m^{-1}|h(\mathbf{a})|^2$. The
wave function (11) therefore leads to the picture of a total current
$\hbar\mathbf{k}m^{-1}$ (assume
$\int |h(\mathbf{a})|^2 d\mathbf{a} = 1$) distributed over a small
region and having everywhere the same direction, with no appreciable
change in the number density anywhere. Such a picture clearly cannot
represent anything like a stationary state, since in a stationary state
the current is divergence-free and there would necessarily be a return
flow directed oppositely to $\mathbf{k}$.

Fig. 2. The radial distribution function $p(r)$, based on the data of
Reekie and Hutchison. (Horizontal axis: $r$ in angstroms.)

One way to incorporate such a backflow into (11) is to multiply the
wave function by $\exp[i\sum_i g(\mathbf{r}_i)]$, obtaining

$$\psi = \phi \exp\Big[i\sum_i g(\mathbf{r}_i)\Big] \sum_j h(\mathbf{r}_j)\exp(i\mathbf{k}\cdot\mathbf{r}_j). \tag{12}$$

Application of the velocity operator $-i\hbar m^{-1}\nabla_i$ shows
that, in addition to whatever velocity it had in (11), the $i$th atom
now has an extra velocity $\hbar m^{-1}\nabla g(\mathbf{r}_i)$.
Substitution of (12) into (6) shows that the energy is minimized if
$g(\mathbf{r})$ satisfies

$$\nabla\cdot(\mathbf{j} + \hbar\rho_0 m^{-1}\nabla g) = 0, \tag{13}$$

where $\mathbf{j}$ is the current computed from the old wave function
(11). Furthermore, the current arising from (12) is
$\mathbf{J} = \mathbf{j} + \hbar\rho_0 m^{-1}\nabla g$, so that (13)
states that the best backflow $g$ is that which conserves current.
Equation (13), with the physically reasonable boundary condition that
$g \to 0$ as $r \to \infty$, completely determines $g$. At large
distances $g$ has the form of the velocity potential for dipole flow,
namely $\boldsymbol{\mu}\cdot\mathbf{r}/r^3$; the dipole moment is

$$\boldsymbol{\mu} = m(4\pi\hbar\rho_0)^{-1}\int \mathbf{a}\,[\nabla\cdot\mathbf{j}(\mathbf{a})]\,d\mathbf{a} = -m(4\pi\hbar\rho_0)^{-1}\int \mathbf{j}(\mathbf{a})\,d\mathbf{a} = -(4\pi\rho_0)^{-1}\mathbf{k}. \tag{14}$$

The negative sign of $\boldsymbol{\mu}$ indicates that the direction of
the backflow is opposite to that of $\mathbf{k}$, as expected. We shall
refer to the value of $\boldsymbol{\mu}$ given by (14) as the
"classical value," since it is derivable from the equation of
conservation of current plus the assumption that the momentum density
is equal to the current density times the mass. The energy of (12) is
only slightly lower than that of (11), the difference being of the
order of the reciprocal of the volume of the packet. The important
point to be learned from this calculation is that the energy is lowered
if the wave function conserves current.
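The step from (13) to (14) can be filled in with the usual multipole expansion of the solution of Poisson's equation. The following is a sketch under the assumption that $\mathbf{j}$ is confined to the packet, so that surface terms vanish; the Green's-function form is standard potential theory rather than something stated in the paper.

```latex
\begin{align*}
% (13) rewritten as a Poisson equation for g, and its decaying solution
\nabla^2 g(\mathbf{r}) &= -\frac{m}{\hbar\rho_0}\,\nabla\cdot\mathbf{j}(\mathbf{r}),
&
g(\mathbf{r}) &= \frac{m}{4\pi\hbar\rho_0}\int
   \frac{\nabla\cdot\mathbf{j}(\mathbf{a})}{|\mathbf{r}-\mathbf{a}|}\,d\mathbf{a}. \\
% far from the packet, expand the kernel to first order in a/r
\frac{1}{|\mathbf{r}-\mathbf{a}|} &\simeq \frac{1}{r} + \frac{\mathbf{a}\cdot\mathbf{r}}{r^3},
&
\int \nabla\cdot\mathbf{j}\,d\mathbf{a} &= 0 \quad\text{(no monopole term)}. \\
% the surviving dipole term, integrated by parts
g(\mathbf{r}) &\simeq \frac{\boldsymbol{\mu}\cdot\mathbf{r}}{r^3},
&
\boldsymbol{\mu} &= \frac{m}{4\pi\hbar\rho_0}\int \mathbf{a}\,[\nabla\cdot\mathbf{j}(\mathbf{a})]\,d\mathbf{a}
  = -\frac{m}{4\pi\hbar\rho_0}\int \mathbf{j}(\mathbf{a})\,d\mathbf{a}. \\
% insert the total current hbar k / m of the normalized packet
\boldsymbol{\mu} &= -\frac{m}{4\pi\hbar\rho_0}\cdot\frac{\hbar\mathbf{k}}{m}
  = -\frac{\mathbf{k}}{4\pi\rho_0}.
\end{align*}
```

The integration by parts uses $\int a_i\,\partial_l j_l\,d\mathbf{a} = -\int j_i\,d\mathbf{a}$, and the last line uses the normalization $\int |h(\mathbf{a})|^2 d\mathbf{a} = 1$ assumed in the text.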

The solution of a somewhat different problem tends to support the same
idea. Suppose we want to find the energy of a state in which a foreign
atom moves through the liquid with momentum $\hbar\mathbf{k}$. The
foreign atom is assumed to have the same mass as He atoms, and also to
experience the same forces, but it is not subject to Bose statistics.
The energy of this situation was computed in III. The simplest trial
wave function is

$$\psi = \phi \exp(i\mathbf{k}\cdot\mathbf{r}_A); \tag{15}$$

$\mathbf{r}_A$ is the coordinate of the foreign atom, and $\phi$ is the
wave function for the ground state of the entire system (which is the
same as if all the atoms obeyed Bose statistics). With this wave
function, Eq. (6) gives
$E = \hbar^2k^2/2m$. A possible way of lowering the energy
would be to let the neighbors of the moving atom 
execute some pattern of flow around it, leaving space 
in front of it and filling in the hole behind it. Some such 
pattern is already contained in (15), since the ground- 
state wave function ¢ prohibits atoms from overlapping. 
But in the ground state, readjustments are made by 
pushing a few immediate neighbors of the foreign atom 
out of the way; these neighbors are crowded into less 
than their usual volumes, causing (15) to have a high 
kinetic energy. If, instead, room could be made for the 
moving atom by the simultaneous motion of many 
atoms, each being crowded only slightly, the kinetic 
energy of the state would be lower. In fact, there is no 
reason why the crowding cannot be eliminated entirely, 
since the amount of matter in the system remains 
constant. Roughly speaking, the requirement of no 
crowding means that the current is divergence-free, 
and the no-crowding argument shows physically why it 
is energetically advantageous to conserve current. The 
argument is vague, however, and the exact form of the 
backflow will be determined by more accurate methods. 

A wave function of momentum $\hbar\mathbf{k}$ which includes a pattern
of backflow around the foreign atom is

$$\psi = \phi \exp(i\mathbf{k}\cdot\mathbf{r}_A)\exp\Big[i\sum_{i\neq A} g(\mathbf{r}_i - \mathbf{r}_A)\Big]. \tag{16}$$

When (16) is substituted into the expression (6) for the energy,
minimization of $E$ leads to a differential equation which determines
$g(\mathbf{r})$. The solution at large $r$ is proportional to
$\mathbf{k}\cdot\mathbf{r}/r^3$. Accurate numerical solution is simple,
but uncertain because of uncertainty in the values of $p(r)$; since (6)
is a variational principle, we may take
$g(\mathbf{r}) = A\mathbf{k}\cdot\mathbf{r}/r^3$. Substitution of (16)
into (6) gives

$$E = (2m)^{-1}\hbar^2k^2[1 + I_1A + (I_4 + I_{5a})A^2], \tag{17}$$

where $I_1$ and $I_4$ are integrals defined by Eq. (25) and $I_{5a}$ is
an integral defined by (57). The integrals are evaluated further on;
only the answer interests us here. Equation (17) becomes

$$E = (2m)^{-1}\hbar^2k^2(1 + 0.186A + 0.0246A^2), \tag{18}$$

with $A$ measured in Å³. The energy is minimum when $A = -3.8$ Å³; the
"classical" value predicted by (14) is
$A = -(4\pi\rho_0)^{-1} = -3.6$ Å³. The close agreement of the two
values seems to indicate that the reduction in energy is due to the
physical effects we have mentioned, and is not simply the result of
allowing an extra degree of freedom in the wave function. The improved
value for the energy is¹⁰

$$E = 0.648\,\hbar^2k^2/2m. \tag{19}$$
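The numbers in (18) and (19) can be reproduced by minimizing the quadratic $1 + 0.186A + 0.0246A^2$ directly. The sketch below does so and also evaluates the classical amplitude $-(4\pi\rho_0)^{-1}$; the number density used for the latter is an assumed value, since the text quotes only the result.

```python
import math


def minimize_quadratic(a, b):
    """Minimum of 1 + a*A + b*A**2 for b > 0; returns (A_min, value)."""
    if b <= 0:
        raise ValueError("b must be positive for a minimum to exist")
    a_min = -a / (2.0 * b)
    return a_min, 1.0 - a * a / (4.0 * b)


# Coefficients read from Eq. (18); A is in cubic angstroms.
A_min, factor = minimize_quadratic(0.186, 0.0246)

# Assumed helium number density in atoms per cubic angstrom.
RHO0 = 0.0218
A_classical = -1.0 / (4.0 * math.pi * RHO0)

print(f"A_min = {A_min:.2f}, energy factor = {factor:.3f}")
print(f"classical A = {A_classical:.2f}")
```

The minimizing amplitude and the classical one differ by a few percent, which is the agreement the text points to.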


Since the wave function (5) for a phonon or roton is just what would
result from symmetrizing (15), one might hope to lower the energy of
(5) by adding terms to represent a backflow around each moving atom.
The resulting wave function would be the symmetrization of (16), i.e.,

$$\psi = \phi \sum_i \exp(i\mathbf{k}\cdot\mathbf{r}_i)\exp\Big[i\sum_{j\neq i} g(\mathbf{r}_{ji})\Big]. \tag{20}$$

For large $k$, when this wave function is substituted into the energy
and normalization integrals, there is little interference between terms
with different $i$; the energy is therefore given by (19) and is a
definite improvement over (9). For small $k$, (20) cannot lead to a
lower energy than (5), because (5) is exact for phonons. At
intermediate $k$, one might thus expect to lower the energy by a factor
between 1.00 and 0.65. In fact, we do better than this.

The attempt to find the function $g(\mathbf{r})$ which gives the lowest
energy when (20) is substituted into (6) leads to an intractable
equation. We therefore take
$g(\mathbf{r}) = A\mathbf{k}\cdot\mathbf{r}/r^3$, where $A$ will be
chosen to minimize the energy. The difficulty of handling integrals
which involve $e^{i\sum g}$ leads one to consider the possibility of
replacing $\exp(i\sum g)$ by $1 + i\sum g$. The average value of
$\sum_j g(\mathbf{r}_{ji})$ is $\int p(r)g(\mathbf{r})\,d\mathbf{r}$,
which is zero because $g(\mathbf{r})$ is an odd function. The mean
square value of $\sum_j g(\mathbf{r}_{ji})$ is $k^2A^2I_3$, where the
integral $I_3$ is defined and evaluated further on. With the classical
value for $A$ (which is close to the optimal value throughout the
interesting range of $k$) the root-mean-square value of $\sum g$ turns
out to be $0.252k$, where $k$ is measured in inverse angstroms. Even
with $k = 2$ Å⁻¹, replacement of $\exp(i\sum g)$ by $1 + i\sum g$ is
not unreasonable,¹¹ and we shall work with the wave function

$$\psi = \phi \sum_i \exp(i\mathbf{k}\cdot\mathbf{r}_i)\Big[1 + i\sum_{j\neq i} g(\mathbf{r}_{ji})\Big], \tag{21}$$

where $g(\mathbf{r}) = A\mathbf{k}\cdot\mathbf{r}/r^3$. This wave
function is still an eigenfunction of the total momentum operator
$\mathbf{P}$, with eigenvalue $\hbar\mathbf{k}$.

¹⁰ This is somewhat higher than the value obtained in III, where a
rather inaccurate approximation was used for $I_4$. With the new value
for $I_4$ we find that the effective mass of a He³ atom moving through
He⁴ is 5.0 atomic mass units, instead of 5.8. In the calculation it is
assumed that the distribution of atoms around the He³ atom is the same
as that around an He⁴ atom. The higher zero-point motion of the lighter
atom actually pushes its neighbors further away. This effect will
increase the mass, but probably by only a small fraction of a mass
unit.

The roton state represented by the function (21) can 
be described roughly classically as a vortex ring of such 
small radius that only one atom can pass through the 
center. Outside the ring there is a slow drift of atoms 
returning for another passage through the ring. There 
are at least three ways that the classical picture is 
modified. (1) The momentum of atoms passing through 
the center cannot be made smaller because the wave 
function must return to its original value when, after 
one moves through, another stands in its old place. 
The wavelength must be the atomic spacing. (2) The ring does not drift
forward as a large smoke ring does, because as it is as small as
possible there is no force tending to shrink it; such a force in a
classical ring is balanced as a consequence of the forward drift.
(3) The location of the ring is not definable. In typical
quantum-mechanical fashion the lowest energy state corresponds to
superposition of amplitude to find the ring anywhere in the liquid. The
energy is less than the kinetic energy $\hbar^2k_0^2/2m$ of one atom
with momentum $\hbar k_0$ because there is a correlated motion of many
atoms moving together so the effective inertia is higher (the energy
$\Delta/\kappa$ corresponds to 2.5 atoms moving together at total
momentum $\hbar k_0$).
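The figure of 2.5 atoms can be recovered by comparing the free-atom kinetic energy $\hbar^2k_0^2/2m$ with the roton energy: a body of mass $nm$ carrying the same momentum has energy smaller by the factor $n$. The sketch below assumes $k_0 = 2$ Å⁻¹ and $\Delta/\kappa = 9.6$°K, with standard CGS constants supplied as assumptions.

```python
# Assumed standard CGS constants (not given in the text).
HBAR = 1.054572e-27    # erg*s
KAPPA = 1.380649e-16   # erg/K
M_HE = 6.6465e-24      # g, helium-4 atomic mass
ANGSTROM = 1.0e-8      # cm


def free_atom_energy_K(k_per_a):
    """hbar**2 * k**2 / (2*m*kappa) for k in inverse angstroms."""
    k_cgs = k_per_a / ANGSTROM
    return HBAR**2 * k_cgs**2 / (2.0 * M_HE * KAPPA)


def effective_atom_count(k0_per_a, delta_K):
    """Number n of atoms such that hbar**2*k0**2 / (2*n*m) equals Delta."""
    return free_atom_energy_K(k0_per_a) / delta_K


n_eff = effective_atom_count(2.0, 9.6)
print(f"free-atom energy = {free_atom_energy_K(2.0):.1f} K, n = {n_eff:.2f}")
```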


¹¹ Of course, since a trial function is a free choice, it would be
mathematically legitimate to insert $1 + i\sum g$ into the variational
principle (6) even if $\sum g$ were not small, but there would be
little physical reason to expect a good answer. If
$\exp[i\sum g(\mathbf{r}_i - \mathbf{r}_A)]$ is replaced by
$1 + i\sum g(\mathbf{r}_i - \mathbf{r}_A)$ in the foreign atom problem,
the resulting integrals are among the ones defined and evaluated
further on. The energy is given by

$$E = \frac{\hbar^2k^2}{2m}\,\frac{1 + 0.186A + (0.0217 + 0.0049k^2)A^2}{1 + 0.0049k^2A^2}.$$

When $k = 2$ Å⁻¹, the fraction has the minimum value 0.689, which is
6% higher than the value given by (19). The associated value of $A$ is
$-3.4$ Å³. When $k = 2.5$ Å⁻¹, the fraction is 0.716, corresponding to
$A = -3.1$ Å³. We conclude that for $k<2$ Å⁻¹, replacement of
$\exp(i\sum g)$ by $1 + i\sum g$ does not seriously raise the energy.
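The minima quoted in this footnote can be checked by minimizing the fraction in closed form. The sketch below assumes the fraction reads $[1 + 0.186A + (0.0217 + 0.0049k^2)A^2]/[1 + 0.0049k^2A^2]$, which is a reading of a damaged formula, and it takes the coefficient 0.0049 to be the integral called $I_3$ in the text (an inference from the quoted root-mean-square value $0.252k$).

```python
import math

# Coefficients as read from the footnote (assumed reading of a damaged formula).
A1 = 0.186    # linear coefficient
C4 = 0.0217   # k-independent quadratic coefficient
C3 = 0.0049   # coefficient of k**2 * A**2, taken here to be I_3


def energy_fraction(amplitude, k):
    """Ratio of the energy to hbar**2 * k**2 / 2m."""
    num = 1.0 + A1 * amplitude + (C4 + C3 * k * k) * amplitude**2
    den = 1.0 + C3 * k * k * amplitude**2
    return num / den


def best_amplitude(k):
    """Negative root of A1*c*A**2 - 2*C4*A - A1 = 0, where c = C3*k**2.

    This is the stationarity condition of energy_fraction; the negative
    root is the minimum.
    """
    c = C3 * k * k
    return (C4 - math.sqrt(C4 * C4 + A1 * A1 * c)) / (A1 * c)


def rms_phase(k, amplitude):
    """Root-mean-square value of sum g, i.e. k * |A| * sqrt(I_3)."""
    return k * abs(amplitude) * math.sqrt(C3)


for k in (2.0, 2.5):
    a_best = best_amplitude(k)
    print(k, round(a_best, 2), round(energy_fraction(a_best, k), 3))
print(round(rms_phase(1.0, -3.6), 3))
```

The minimum is shallow, so amplitudes within a tenth or so of the optimum change the fraction only in the fourth decimal place; small differences from the quoted $A$ values are therefore not significant.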
5. COMPUTATIONS WITH THE NEW WAVE FUNCTION


(a) Definitions 


If the wave function for an excited state is $\psi = F\phi$, it is
easily shown (see III) that

$$\mathcal{E} = \int \psi^* H \psi\, d^N r = (\hbar^2/2m)\sum_i \int \nabla_i F\cdot\nabla_i F^*\, \phi^2\, d^N r. \tag{22}$$

The only memory of the potentials is in the ground-state wave function
$\phi$; the information which we need about $\phi$ will be taken from
experiment, since our main interest here is to test some ideas about
the nature of excited states and not to develop a detailed theory of
the ground state.¹² Substitution of (21) into (22) and (8) gives
$E = \mathcal{E}/\mathcal{I}$ where

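$$2m\mathcal{E}/N\hbar^2 = k^2[1 + A(I_1 + I_2) + A^2(k^2I_3 + I_4 + I_5 + kI_6 + I_7)], \tag{23}$$

$$\mathcal{I}/N = I_8 + AkI_9 + A^2k^2I_{10}, \tag{24}$$

and

$$I_1 = -2\hat{\mathbf{k}}\cdot\rho_0^{-1}\int \nabla g_1(\mathbf{r}_{21})\,\rho_2(1,2)\,d\mathbf{r}_{21},$$

$$I_2 = 2\rho_0^{-1}\int \exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,\hat{\mathbf{k}}\cdot\nabla g_1(\mathbf{r}_{12})\,\rho_2(1,2)\,d\mathbf{r}_{21},$$

$$I_3 = \rho_0^{-1}\int g_1(\mathbf{r}_{21})\,g_1(\mathbf{r}_{31})\,\rho_3(1,2,3)\,d\mathbf{r}_{21}\,d\mathbf{r}_{31},$$

$$I_4 = \rho_0^{-1}\int \nabla g_1(\mathbf{r}_{21})\cdot\nabla g_1(\mathbf{r}_{31})\,\rho_3(1,2,3)\,d\mathbf{r}_{21}\,d\mathbf{r}_{31},$$

$$I_5 = \rho_0^{-1}\int \exp(i\mathbf{k}\cdot\mathbf{r}_{23})\,\nabla g_1(\mathbf{r}_{12})\cdot\nabla g_1(\mathbf{r}_{13})\,\rho_3(1,2,3)\,d\mathbf{r}_{21}\,d\mathbf{r}_{31},$$

$$I_6 = 2i\rho_0^{-1}\int \exp(i\mathbf{k}\cdot\mathbf{r}_{13})\,g_1(\mathbf{r}_{21})\,\hat{\mathbf{k}}\cdot\nabla g_1(\mathbf{r}_{13})\,\rho_3(1,2,3)\,d\mathbf{r}_{21}\,d\mathbf{r}_{31},$$

$$I_7 = -2\rho_0^{-1}\int \exp(i\mathbf{k}\cdot\mathbf{r}_{13})\,\nabla g_1(\mathbf{r}_{21})\cdot\nabla g_1(\mathbf{r}_{13})\,\rho_3(1,2,3)\,d\mathbf{r}_{21}\,d\mathbf{r}_{31},$$

$$I_8 = \int e^{i\mathbf{k}\cdot\mathbf{r}}p(r)\,d\mathbf{r} = S(k) \quad [\text{see Eq. (10)}],$$

$$I_9 = -2i\rho_0^{-1}\int \exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{32})\,\rho_3(1,2,3)\,d\mathbf{r}_{21}\,d\mathbf{r}_{31},$$

$$I_{10} = \rho_0^{-1}\int \exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{31})\,g_1(\mathbf{r}_{42})\,\rho_4(1,2,3,4)\,d\mathbf{r}_{21}\,d\mathbf{r}_{31}\,d\mathbf{r}_{41}. \tag{25}$$

Here $\hat{\mathbf{k}} = \mathbf{k}/k$.

¹² R. M. Mazo and J. G. Kirkwood [Proc. Natl. Acad. Sci. 41, 204
(1955)] have computed $p(r)$ theoretically by solving an approximate
integral equation.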
We have written $g(\mathbf{r})=Akg_1(\mathbf{r})$. The mean density of
atoms is $\rho_0=N/V$. The probability in the ground state
that atoms are located at $\mathbf{r}_1$ and $\mathbf{r}_2$ is $\rho_2(\mathbf{r}_1,\mathbf{r}_2)\,d\mathbf{r}_1d\mathbf{r}_2$.
Except in the negligible region near the surface of the
liquid, we have $\rho_2(\mathbf{r}_1,\mathbf{r}_2)=\rho_0p(r_{12})$. In writing (25) we
have made use of the fact that certain integrals like
$\int g(\mathbf{r}_{12})\rho_2(\mathbf{r}_1,\mathbf{r}_2)\,d\mathbf{r}_1d\mathbf{r}_2$ vanish because $g$ is odd. A term
$\rho_0\delta(\mathbf{r}_{12})$ is contained in $\rho_2(\mathbf{r}_1,\mathbf{r}_2)$, and hence $p(r)$ contains
a $\delta$-function at the origin. We define $p_1$ and $p_2$ by


$$p(r)=\delta(\mathbf{r})+p_1(r), \tag{26}$$

$$p_1(r)=\rho_0[1+p_2(r)]; \tag{27}$$


these functions have no singularities and $p_2(r)\to0$ as
$r\to\infty$. Strictly speaking, in the definition of $I_1$ we
should replace $\rho_2(\mathbf{r}_1,\mathbf{r}_2)$ by $\rho_2(\mathbf{r}_1,\mathbf{r}_2)-\rho_0\delta(\mathbf{r}_{12})$ since $g$ is
always a function of the relative coordinates of two
distinct atoms. To avoid unnecessary confusion, however,
it is easier to think of $g(\mathbf{r})$ as becoming zero for
sufficiently small $r$. Similar remarks apply to $\rho_3$ and $\rho_4$
when they occur in $I_3,\cdots,I_{10}$. If one does not wish to
think of $g(\mathbf{r})$ as being modified near the origin, then the
$\rho$'s should be understood as containing delta functions
of all coordinate differences except those which appear
as arguments of $g$ in the same integral. The probability
in the ground state that atoms are at $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$ is
$\rho_3(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3)\,d\mathbf{r}_1d\mathbf{r}_2d\mathbf{r}_3$. The nonsingular part of $\rho_3$, which
we call $\rho_3'$, is defined by


$$\rho_3(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3)=\rho_3'(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3)+\rho_0p_1(r_{12})\delta(\mathbf{r}_{23})+\rho_0p_1(r_{13})\delta(\mathbf{r}_{12})+\rho_0p_1(r_{23})\delta(\mathbf{r}_{13})+\rho_0\delta(\mathbf{r}_{12})\delta(\mathbf{r}_{23}). \tag{28}$$


No experimental data for $\rho_3'$ are available. If any of the
mutual distances, say $r_{12}$, is large, then


$$\rho_3'(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3)=\rho_0p_1(r_{13})p_1(r_{23}).$$
If any of the interatomic distances becomes less than
2.4 Å, then $\rho_3'=0$. The approximation¹³


$$\rho_3'(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3)=p_1(r_{12})p_1(r_{13})p_1(r_{23}) \tag{29}$$


has these correct limiting features. Much has been
written about the validity of this approximation; for
some, but not all, purposes (29) is quite sufficient. We
shall see that our answer is only slightly sensitive to the
difference between the right and left sides of (29).
Furthermore, we shall be able to estimate the magnitude
and sign of the errors due to (29).

$I_1$, $I_3$, and $I_4$ are independent of $k$. In the other
integrals it will prove possible to extract most of the 
k-dependence rather simply in the roton region, the 
remaining complicated terms being very small. This 
means that the computation of the entire roton 
spectrum will not be much more difficult than the 
computation of one point on it. We now discuss the 
evaluation of the various integrals. 


¹³ This is sometimes called the Kirkwood approximation or the
superposition approximation. 
(b) Evaluation of $I_1$, $I_2$, and $I_4$


$I_1$ can be done exactly. We integrate by parts over
a volume bounded by two concentric surfaces, one
lying inside the radius where $p_1(r)=0$ and the other very
far from the origin. The inner surface contributes
nothing, but the integrand $g_1(\mathbf{r})p_1(r)$ falls off only as
$r^{-2}$, with the result that the outer surface makes a
finite contribution, which is easily computed to be
$-(8\pi\rho_0/3)$. We eliminate this contribution by redefining
$g(\mathbf{r})$ to have a decay factor, say $e^{-\epsilon r}$, with very
small $\epsilon$, which makes surface terms vanish at infinity.
This procedure is mathematically legitimate, since we
are free to use any wave function we want in the
variational principle, and is in accord with the physical
idea that all the momentum of the backflow should be
contained in a finite volume. It will generally not be
necessary to represent $\epsilon$ explicitly; the convergence
factor will be used only to justify certain operations.
After the integration by parts, there remains


$$I_1=(2\mathbf{k}/k)\cdot\int g_1(\mathbf{r})\nabla p_1(r)\,d\mathbf{r}=(8\pi/3)\int_0^\infty[dp_1(r)/dr]\,dr=8\pi\rho_0/3. \tag{30}$$


In the last integral, the integrand should really be
$e^{-\epsilon r}(dp_1/dr)$, but if $\epsilon$ is small enough the convergence
factor will be unity out to radii where $dp_1/dr$ becomes
negligible.

After performing the angular integrations in $I_2$, we
find


$$I_2=16\pi\rho_0\int_{r_0}^\infty r^{-1}j_2(kr)[1+p_2(r)]\,dr=16\pi\rho_0[(kr_0)^{-1}j_1(kr_0)+F(k)]=16\pi\rho_0I_{2a}, \tag{31}$$


where $r_0$ is any radius inside the region where $p_1(r)=0$
[we take $r_0=2.4$ Å, the radius where $p_1(r)$ first becomes
positive] and


$$F(k)=\int_{r_0}^\infty r^{-1}j_2(kr)p_2(r)\,dr. \tag{32}$$


In order to do integrals like $I_4$ and $I_5$, we need to know
the value of $I_{2a}(k)$ for all $k$. Using tabulated values for
the spherical Bessel function $j_2$, $F(k)$ was evaluated
by numerical integration for 23 values of $k$ between 0
and 7 Å⁻¹. Figure 3 gives the results for $I_{2a}(k)$. For
$k<1.5$ Å⁻¹, $F(k)$ is negligible compared with $(kr_0)^{-1}j_1(kr_0)$.

One might expect from (25) that $I_2\to-I_1$ as
$k\to0$. As $k$ approaches zero, $F(k)$ approaches zero and
$j_1(kr_0)/kr_0$ approaches 1/3. Comparison of (30) and (31)
thus shows that $I_2$ approaches $2I_1$ instead of $-I_1$. The
reason for the discrepancy is that (31) is wrong when
$k$ is very small, of the same order of magnitude as $\epsilon$;
in this case we must take account of the term $e^{-\epsilon r}$ in $g$,
and there will be a correction term which will cause $I_2$
to change from $16\pi\rho_0/3$ to $-8\pi\rho_0/3$ as $k$ decreases from
$\epsilon$ to 0.

$I_2$ can also be evaluated in momentum space, using
data for $S(k)$ rather than $p(r)$. In momentum space the
integrals converge best for small $k$ rather than large $k$.
The results are not very important because (31) is
useful down to $k=0$; but they do provide a check of
our numerical work and also of the consistency of the
data for $p(r)$ with that for $S(k)$. $S(k)$ was defined by
(10) as the Fourier transform of $p(r)$, where $p(r)$
includes a delta function at the origin and a constant
term $\rho_0$ at infinity. Therefore $S(k)\to1$ as $k\to\infty$ and
$S(k)$ includes a term $(2\pi)^3\rho_0\delta(\mathbf{k})$. We define


$$S_1(k)=S(k)-1-(2\pi)^3\rho_0\delta(\mathbf{k}). \tag{33}$$

It follows that

$$\rho_0p_2(r)=(2\pi)^{-3}\int e^{-i\mathbf{k}\cdot\mathbf{r}}S_1(k)\,d\mathbf{k}. \tag{34}$$


Taking the Fourier transform of $\nabla g_1(\mathbf{r})$, we obtain
after the angular integrations


$$I_2=16\pi\rho_0/3+(2/\pi)\int_0^\infty S_1(k_1)k_1^2\,b(k_1/k)\,dk_1, \tag{35}$$


where


$$b(x)=\frac56-\frac{x^2}{2}-\frac{1}{4x}(1-x^2)^2\log\left|\frac{1-x}{1+x}\right|.$$

The numerical integral in (35) was evaluated for
$k=0.5$, 1.0, 1.5, 2.0 Å⁻¹. Convergence is good, and the
values are accurate to within a few percent. Nevertheless,
(35) does not give accurate values of $I_2$ when
$k>1.5$ Å⁻¹, because for large $k$ the cancellation between
the two terms of (35) is almost complete (as it must be
because $I_2\to0$ as $k\to\infty$), and hence a 3% error
in the numerical integral may cause a 30% error
in $I_2$. We take the volume per atom of liquid helium
as 45 Å³.¹⁴ The following table compares the values of
$I_2(k)$ obtained from (31) with those obtained from (35).

k (Å⁻¹)                    0       0.5     1.0     1.5     2.0      ∞

I2(k) (Å⁻³) from (31)     0.372   0.322   0.194   0.054   −0.036   0
            from (35)     0.372   0.326   0.200   0.050   −0.060   0


The discrepancy at $k=2$ Å⁻¹ is not serious, for the
reasons just mentioned, and the agreement elsewhere
is sufficiently close for our purposes. The values derived
from (31) are used throughout our work.

FIG. 3. $I_{2a}(k)$ is the Fourier transform of $p(r)$ times the velocity
distribution in the pattern of backflow around a moving atom. In
the most important region ($k<1.5$ Å⁻¹), $I_{2a}(k)$ would be unchanged
if we took $p(r)=0$ for $r<2.4$ Å and $p(r)=\rho_0$ for $r>2.4$ Å.
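The closed-form pieces of (30) and (31) are easy to evaluate numerically. The sketch below is illustrative only: it assumes $\rho_0=1/45$ Å⁻³ and $r_0=2.4$ Å as stated in the text, and it drops the numerically integrated correction $F(k)$, for which no data are reproduced here. The function names are hypothetical.

```python
import math

RHO0 = 1.0 / 45.0   # atoms per cubic angstrom (volume per atom 45 A^3, from the text)
R0 = 2.4            # angstrom; p1(r) is taken to vanish inside this radius


def j1(x):
    """Spherical Bessel function j1(x), using the series limit near x = 0."""
    if abs(x) < 1e-4:
        return x / 3.0
    return math.sin(x) / x**2 - math.cos(x) / x


def i1():
    """I1 = 8*pi*rho0/3, Eq. (30)."""
    return 8.0 * math.pi * RHO0 / 3.0


def i2_leading(k):
    """16*pi*rho0 * j1(k r0)/(k r0): Eq. (31) with F(k) set to zero (an assumption)."""
    x = k * R0
    ratio = 1.0 / 3.0 if x < 1e-4 else j1(x) / x
    return 16.0 * math.pi * RHO0 * ratio


if __name__ == "__main__":
    print(f"I1 = {i1():.3f} A^-3")
    for k in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f"k = {k:.1f} A^-1: leading term of I2 = {i2_leading(k):.3f} A^-3")
```

For $k\le1$ Å⁻¹ the leading term alone reproduces the tabulated values from (31) to within a few thousandths, in line with the statement that $F(k)$ is negligible there. At $k=1.5$ and 2.0 Å⁻¹ it gives roughly 0.07 and −0.01 instead of 0.054 and −0.036, which shows where $F(k)$ starts to matter.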

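$I_4$ presents no problem if $I_{2a}(k)$ is known for all $k$.
Using (28) and the approximation (29) for $\rho_3'$, and
Fourier analyzing $p_2(r_{23})$ with (34) we obtain


$$I_4=\int p_1(r)[\nabla g_1(\mathbf{r})]^2\,d\mathbf{r}+\left[\int p_1(r)\nabla g_1(\mathbf{r})\,d\mathbf{r}\right]^2+(2\pi)^{-3}(\rho_0)^{-1}\int d\mathbf{k}_1S_1(k_1)\left|\int\exp(i\mathbf{k}_1\cdot\mathbf{r})\,p_1(r)\nabla g_1(\mathbf{r})\,d\mathbf{r}\right|^2. \tag{36}$$


The integral in square brackets is a generalization of
$I_2$ to the case where the $\mathbf{k}$ in the exponential has a different
direction from the $\mathbf{k}$ in $g_1$. The angular integrations
are easily performed, yielding


$$I_4=8\pi\rho_0\int_{r_0}^\infty[1+p_2(r)]r^{-4}\,dr+(4\pi\rho_0/3)^2+16\rho_0\int_0^\infty k_1^2S_1(k_1)[I_{2a}(k_1)]^2\,dk_1$$

$$=0.01190+0.00867-0.00790=0.01267\ \text{Å}^{-6}. \tag{37}$$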
The value obtained for $I_4$ may be in error because of
uncertainties in the values of $p(r)$ and $S(k)$, and also
because the approximation (29) is not exact. Discussion
of the error due to (29) is postponed until the evaluation
of $I_3$. The uncertainty in $p_2(r)$ is unimportant because
the magnitude of $\int_{r_0}^\infty p_2(r)r^{-4}\,dr$ is only 1/10 that of
$\int_{r_0}^\infty r^{-4}\,dr$. Similarly, 90% of the contribution to
$\int_0^\infty k_1^2S_1(k_1)[I_{2a}(k_1)]^2\,dk_1$ comes from the region
$k_1<1.2$ Å⁻¹. In this region $I_{2a}(k_1)$ is the same as would
result if we took $p_1(r)=0$ for $r<2.4$ Å and $p_1(r)=\rho_0$
for $r>2.4$ Å, and $S_1(k_1)$ is largely determined by its
value and slope at the origin, both of which are known
theoretically. The important point to be learned from
this discussion is that the values of $I_4$ and of the other
integrals which contribute significantly to the coefficient
of $A^2$ in (23), depend mainly on the gross


¹⁴ The atomic volume of liquid He under its own saturated vapor
pressure at 0°K is 46 Å³, but 45 Å³ is closer to the value at 2.06°K,
where the structure factor data was taken. Internal inconsistencies
would develop if $\rho_0$ and $S(k)$ were taken at different temperatures.
One might ask where the theory takes account of the external
pressure. The pressure determines the values of $\rho_0$ and, more
important, $S(k)$. An increase in pressure is expected to sharpen
the maxima and minima of $p(r)$ and $S(k)$.
features of $p(r)$ (i.e., its delta function at the origin,
vanishing for $r<2.4$ Å, and its quick approach to the
asymptotic value $\rho_0$) and not on the details of its
behavior. The coefficients of $A$ are more sensitive to the
detailed behavior of $p(r)$; it is the detailed behavior
which determines the location of the minimum in the
energy spectrum. The insensitivity of the quadratic
coefficients to the exact form of $p(r)$ can be similarly
verified in the computations which follow, and will not
be pointed out explicitly.


(c) Approximate Methods 


The value of $I_4$ and the size of the various terms which
contribute to it can be understood fairly well in terms
of some simple approximations for integrals involving
the coordinates of three atoms. With the help of these
approximations we can understand the sizes of all the
remaining integrals; if we know that an integral is
small, it will not be necessary to waste time in evaluating
it very accurately.

Suppose we want to do an integral of the form


$$\int f(\mathbf{r}_{21})f(\mathbf{r}_{31})\rho_3(1,2,3)\,d\mathbf{r}_{21}d\mathbf{r}_{31}$$


(in this integral we shall understand $\rho_3$ to include a
delta function on coordinates 2 and 3, but not on any
other pairs). If the positions of 1 and 2 are fixed and 3
is not too close to 2, then $\rho_3(1,2,3)$ can be approximated
very closely by $\rho_0p_1(r_{21})p_1(r_{31})$. We write


$$\rho_3(1,2,3)\to\rho_0p_1(r_{21})p_1(r_{31}). \tag{38}$$


When 3 approaches 2, this is wrong because $\rho_3$ goes to
zero but $\rho_0p_1(r_{21})p_1(r_{31})$ keeps a finite value (assuming,
of course, that $r_{21}>2.4$ Å; otherwise both expressions
are zero). When 2 and 3 coincide, however, $\rho_3$ exhibits
a delta function and far exceeds $\rho_0p_1(r_{21})p_1(r_{31})$. The
strength of the delta function is such that if we integrate
the difference between the two sides of (38) over the
positions of 3, the result is exactly zero, i.e.,


$$\int[\rho_3(1,2,3)-\rho_0p_1(r_{21})p_1(r_{31})]\,d\mathbf{r}_3=0. \tag{39}$$


We believe this equation not to be a relation among
distribution functions in general, but to hold for the
distribution functions for the liquid at absolute zero. We
do not have a rigorous proof, but shall discuss our
reasons for believing it in Appendix B.

If $f(\mathbf{r})$ is a slowly varying function, i.e., $f(\mathbf{r})$ does not
change much when $r$ changes by 2.4 Å, then for a fixed
value of $\mathbf{r}_{12}$ the value of $f(\mathbf{r}_{31})$ is almost constant over
the region where the two sides of (38) differ appreciably.
Using (39), we see that the integral


$$\int f(\mathbf{r}_{31})[\rho_3(1,2,3)-\rho_0p_1(r_{21})p_1(r_{31})]\,d\mathbf{r}_3$$

is very close to zero. We therefore find


$$\int f(\mathbf{r}_{31})\rho_3(1,2,3)\,d\mathbf{r}_3\approx\rho_0p_1(r_{21})\int f(\mathbf{r})p_1(r)\,d\mathbf{r},$$


and finally,


$$\int f(\mathbf{r}_{21})f(\mathbf{r}_{31})\rho_3(1,2,3)\,d\mathbf{r}_{21}d\mathbf{r}_{31}\approx\rho_0\left[\int f(\mathbf{r})p_1(r)\,d\mathbf{r}\right]^2. \tag{40}$$

Similarly, if $f$ or $g$ is slowly varying,


$$\int f(\mathbf{r}_{21})g(\mathbf{r}_{31})\rho_3(1,2,3)\,d\mathbf{r}_{21}d\mathbf{r}_{31}\approx\rho_0\left(\int f(\mathbf{r})p_1(r)\,d\mathbf{r}\right)\left(\int g(\mathbf{r})p_1(r)\,d\mathbf{r}\right). \tag{41}$$


Actually, our criterion for a slowly varying function is
too stringent. The behavior of $f(\mathbf{r})$ for $r<2.4$ Å is of
no importance, since $p_1(r)$ is zero in that range; hence
$f$ may be singular at the origin. The important question
is, how much does $f(\mathbf{r}_1+\mathbf{r}_2)$ differ from $f(\mathbf{r}_1)$ when $\mathbf{r}_1$
and $\mathbf{r}_2$ are any two vectors of length 2.4 Å? And even if
the difference is large compared with $f(\mathbf{r}_1)$, (40) is
still good if $f(\mathbf{r})$ is such that the major contribution to
$\int f(\mathbf{r})p_1(r)\,d\mathbf{r}$ comes from $r>3$ or 4 Å.
Another type of integral which interests us is


$$\int f(\mathbf{r}_{21})g(\mathbf{r}_{31})h(\mathbf{r}_{23})\rho_3(1,2,3)\,d\mathbf{r}_{21}d\mathbf{r}_{31},$$


where $f$ and $g$ are smooth and $h(\mathbf{r})$ oscillates so rapidly
that it produces almost complete cancellation when
integrated against $p_1(r)$. $\rho_3$ is still understood to contain
a delta function on 2 and 3, and on no other pair. In
this case, if 1 and 2 are held fixed and 3 is allowed to
vary, the oscillations in $h(\mathbf{r}_{23})$ make the contribution to
the integral small. The major contribution comes when
3 and 2 are tied together by the delta function and we
find


$$\int f(\mathbf{r}_{21})g(\mathbf{r}_{31})h(\mathbf{r}_{23})\rho_3(1,2,3)\,d\mathbf{r}_{21}d\mathbf{r}_{31}\approx\rho_0h(0)\int f(\mathbf{r})g(\mathbf{r})p_1(r)\,d\mathbf{r}. \tag{42}$$


If $\nabla g_1(\mathbf{r})$ is sufficiently smooth, (40) can be used to
estimate $I_4$. The answer thus obtained is $(4\pi\rho_0/3)^2=0.00867$ Å⁻⁶,
which is the middle term of (37); if
$\nabla g_1$ were very smooth, the first and third terms would
cancel completely. The first term (0.01190 Å⁻⁶) is larger
than the third term (−0.00790 Å⁻⁶) because $\nabla g_1(\mathbf{r})$ is
proportional to $r^{-3}$ and therefore quite strongly peaked
for small $r$; hence the delta function more than compensates
for the "hole" in $\rho_3$. The answer given by (40)
is 2/3 the correct answer.
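The arithmetic behind the comparison of (40) with (37) can be checked directly. The sketch below assumes $\rho_0=1/45$ Å⁻³ and $r_0=2.4$ Å from the text; the "hard-core" value of the first term, obtained by setting $p_2(r)=0$, is an illustration added here and is not a number quoted by the authors.

```python
import math

RHO0 = 1.0 / 45.0   # A^-3, from the text
R0 = 2.4            # A, from the text

# Estimate of I4 from approximation (40): the square of the integral of p1 * grad g1.
estimate_40 = (4.0 * math.pi * RHO0 / 3.0) ** 2

# The three terms of Eq. (37) as quoted in the text (A^-6).
i4_terms = (0.01190, 0.00867, -0.00790)
i4_total = sum(i4_terms)

# First term of (37) with p2(r) = 0, i.e. 8*pi*rho0 * integral_{r0}^{inf} r^-4 dr (assumption).
first_term_hard_core = 8.0 * math.pi * RHO0 / (3.0 * R0**3)

if __name__ == "__main__":
    print(f"(4 pi rho0 / 3)^2        = {estimate_40:.5f} A^-6")
    print(f"I4 from (37)             = {i4_total:.5f} A^-6")
    print(f"estimate (40) / I4       = {estimate_40 / i4_total:.2f}")
    print(f"first term with p2 = 0   = {first_term_hard_core:.5f} A^-6")
```

The ratio of the estimate to the full value comes out close to 2/3, and the hard-core first term exceeds the quoted 0.01190 Å⁻⁶ by roughly a tenth, consistent with the remark that the $p_2$ contribution is only about 1/10 of the total.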

With the aid of (40)-(42) we can discuss the remaining
integrals more intelligently. If Landau's energy
spectrum is even qualitatively correct, then the most
important points to compute are those in the neighborhood
of the roton minimum. The phonon spectrum is
guaranteed to be correct; and when the temperature is
high enough to excite the portion of the spectrum
lying appreciably above the roton minimum, then the
picture of the liquid as a gas of independent excitations
has broken down. Thus, even if we knew the exact
form of the high part of the spectrum, we would not
know how to do the thermodynamics. Furthermore, the
high-momentum end of the spectrum computed with
(20) or (21) is certainly wrong, since the slope $dE/dp$
exceeds the velocity of sound when $k>2.2$ Å⁻¹; whenever
$|dE(p)/dp|>c$, there obviously exist states with
two excitations, one of which is a phonon, which have
total momentum $p$ but energy less than $E(p)$. We shall
therefore compute the energy at several points in the
region 1.6 Å⁻¹ $<k<$ 2.4 Å⁻¹, and also at $k=1.2$ Å⁻¹ in
order to estimate the height of the hump between the
phonon and roton regions.


(d) Evaluation of $I_3$ and Correction to the
Kirkwood Approximation


Since $g_1$ is smoother than $\nabla g_1$, $I_3$ is a good candidate
for the approximation (40), which predicts $I_3=0$
because $\int p_1(r)g_1(\mathbf{r})\,d\mathbf{r}=0$. We infer that $I_3$ is small;
but it is important to know how small, because the
factor $k^2$ which multiplies $I_3$ in (23) is fairly large. The
exact value of $I_3$ [i.e., no approximations beyond (29)]
can be computed by the method used for $I_4$. The result
is


$$I_3=I_{3a}+I_{3b}$$

$$=\int p_1(r)g_1^2(\mathbf{r})\,d\mathbf{r}+\int g_1(\mathbf{r}_{21})g_1(\mathbf{r}_{31})p_1(r_{21})p_1(r_{31})p_2(r_{23})\,d\mathbf{r}_{21}d\mathbf{r}_{31}$$

$$=\rho_0\left\{(4\pi/3)\int_{r_0}^\infty[1+p_2(r)]r^{-2}\,dr+(8/3)\int_0^\infty S_1(k)[I_{9a}(k)]^2\,dk\right\}$$

$$=(1/45)(1.707-1.470)=0.0053\ \text{Å}^{-4}.$$


The integral $I_{9a}(k)$ is defined by Eq. (49). The approximation
(40) is based on the idea that $I_{3a}$ and $I_{3b}$ should
cancel each other. Since $I_{3a}+I_{3b}$ is only 14% of
$I_{3a}$, the idea behind (40) is good, but (40) tells us
nothing about the size of $I_3$ because $\int g_1(\mathbf{r})p_1(r)\,d\mathbf{r}=0$.

If $I_3=0.0053$ Å⁻⁴, then in the roton region the term
$k^2I_3$ contributes about half of the total coefficient of $A^2$
in (23). Any possibility of serious error in $I_3$ ought
therefore to be investigated carefully. The idea that $I_3$
is almost zero is based on the approximation (40),
which in turn is based on the identity (39). Actually,
the approximate form which we have used for $\rho_3$ does
not satisfy (39) exactly. Slight departures from (39)
ordinarily would not affect the validity of (40), were it
not for the fact that $\int p_1(r)g_1(\mathbf{r})\,d\mathbf{r}=0$. In this case
the question arises: how much of the failure of $I_3$ to
vanish is real, and how much is due to the fact that the
approximate $\rho_3$ does not satisfy (39)? An exact expression
for $I_3$ is


$$I_3=0.0053-\rho_0^{-1}\int g_1(\mathbf{r}_{21})g_1(\mathbf{r}_{31})[p_1(r_{21})p_1(r_{31})p_1(r_{23})-\rho_3'(1,2,3)]\,d\mathbf{r}_{21}d\mathbf{r}_{31}$$

$$=0.0053-I_{3c}. \tag{43}$$


If any one of the mutual distances is less than 2.4 Å or
more than about 4 Å, then $p_1(r_{21})p_1(r_{31})p_1(r_{23})-\rho_3'(1,2,3)$
is very close to zero. Consequently, the integrand of $I_{3c}$
is appreciable only if the three atoms are at the corners
of a triangle, each of whose legs may vary in length
from 2.4 to 4 Å. Therefore, if the spatial variation of
$g_1(\mathbf{r})$ were slow, the replacement of $g_1(\mathbf{r}_{21})g_1(\mathbf{r}_{31})$ by
$[g_1(\mathbf{r}_{21})]^2$ would not greatly alter the value of $I_{3c}$. The
resulting integral is then easily evaluated. We can,
however, find an even better approximation to $I_{3c}$ by
taking the angular variation of $g_1(\mathbf{r})$ into account.
Since $I_{3c}$ is independent of the direction of $\mathbf{k}$, we can
average the integrand over the directions of $\mathbf{k}$. The
average of $(\mathbf{k}\cdot\mathbf{r}_{21})(\mathbf{k}\cdot\mathbf{r}_{31})$ is $\tfrac13k^2\,\mathbf{r}_{21}\cdot\mathbf{r}_{31}$, and in the
important configurations the three atoms almost form
an equilateral triangle; therefore, the average over
these configurations of the cosine of the angle between
$\mathbf{r}_{21}$ and $\mathbf{r}_{31}$ is very close to 1/2. Most of the angular
dependence of the integrand is therefore correctly
accounted for if we replace $\mathbf{r}_{21}\cdot\mathbf{r}_{31}$ by $\tfrac12r_{21}r_{31}$; at this
stage we note that the radii $r_{21}$ and $r_{31}$ are almost
equal in the important region, and we take $r_{21}$ and $r_{31}$
to be the same in the integrand. This approximation
differs from the preceding one through the presence of
the factor 1/2. We obtain


$$I_{3c}\approx I_{3d},$$

where

$$I_{3d}=(6\rho_0)^{-1}\int r_{21}^{-4}[p_1(r_{21})p_1(r_{31})p_1(r_{23})-\rho_3'(1,2,3)]\,d\mathbf{r}_{21}d\mathbf{r}_{31}.$$


The identities (39) and (70) imply that

$$\int d\mathbf{r}_3[p_1(r_{21})p_1(r_{31})p_1(r_{23})-\rho_3'(1,2,3)]=\rho_0^2\,p_1(r_{12})\int p_2(r_{23})p_2(r_{13})\,d\mathbf{r}_3.$$


We find

$$I_{3d}=(3\pi\rho_0)^{-1}\int r^{-2}p_1(r)I_{3e}(r)\,dr, \tag{44}$$

where

$$I_{3e}(r_{12})=2\pi^2\rho_0^2\int d\mathbf{r}_3\,p_2(r_{13})p_2(r_{23}) \tag{45a}$$

$$=\int_0^\infty(kr_{12})^{-1}\sin(kr_{12})[S_1(k)]^2k^2\,dk. \tag{45b}$$


FIG. 4. $I_{3e}(r)$ measures the error in the Kirkwood approximation,
and would vanish for all $r>2.4$ Å if the approximation were
exact. The rapid decrease of $I_{3e}(r)$ accounts for the high accuracy
of the Kirkwood approximation in this computation.


$I_{3e}(r)$ was computed from (45b) for 19 values of $r$
between 2.4 Å and 5.6 Å. The numerical integrals
converge well, and the results are shown in Fig. 4.
Performing the final numerical integration in (44), we
find $I_{3d}=0.00040$ Å⁻⁴ and finally


$$I_3=0.0049\ \text{Å}^{-4}. \tag{46}$$


The smallness of the correction to $I_3$ shows that the
slight failure of (39) does not cause a significant error
in $I_3$. This fact was not intuitively obvious, however,
and needed verification. It should be emphasized that
we have gone beyond the Kirkwood approximation. We
have written an exact expression $I_{3c}$ for the error due
to the Kirkwood approximation, and we have estimated
$I_{3c}$ quite accurately by an integral $I_{3d}$ which is easily
evaluated. We believe the inaccuracy in the approximation
$I_{3c}\approx I_{3d}$ to be about 25%, and therefore our
lack of knowledge of $\rho_3$ causes a residual uncertainty
of 0.0001 Å⁻⁴ in the value of $I_3$.

By exactly the same method, one can estimate the
error in $I_4$ caused by the Kirkwood approximation. The
answer is 0.0001 Å⁻⁶, which is negligible compared with
the value given by (37).
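The angular averaging used above to pass from $I_{3c}$ to $I_{3d}$ can be verified numerically. The sketch below is an illustration, not part of the original computation: it averages $(\hat{\mathbf{k}}\cdot\mathbf{a})(\hat{\mathbf{k}}\cdot\mathbf{b})$ over all directions of a unit vector $\hat{\mathbf{k}}$ with a simple midpoint rule and compares the result with $\mathbf{a}\cdot\mathbf{b}/3$ for two unit vectors 60° apart, as in an equilateral triangle. The function name and grid sizes are arbitrary choices.

```python
import math


def average_over_k_directions(a, b, n_theta=200, n_phi=200):
    """Average of (khat.a)(khat.b) over the unit sphere, midpoint rule in theta and phi."""
    total = 0.0
    weight = 0.0
    for i in range(n_theta):
        theta = math.pi * (i + 0.5) / n_theta
        st, ct = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            khat = (st * math.cos(phi), st * math.sin(phi), ct)
            ka = sum(x * y for x, y in zip(khat, a))
            kb = sum(x * y for x, y in zip(khat, b))
            total += ka * kb * st      # sin(theta) is the solid-angle weight
            weight += st
    return total / weight


if __name__ == "__main__":
    a = (1.0, 0.0, 0.0)
    b = (0.5, math.sqrt(3.0) / 2.0, 0.0)   # unit vector at 60 degrees to a
    print(average_over_k_directions(a, b))  # compare with (a.b)/3 = 1/6
```

With $|\mathbf{r}_{21}|=|\mathbf{r}_{31}|=r$, the product $g_1(\mathbf{r}_{21})g_1(\mathbf{r}_{31})$ therefore averages to about $\tfrac16r^{-4}$, which is the origin of the factor $(6\rho_0)^{-1}$ in $I_{3d}$.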


(e) Evaluation of Remaining Integrals 


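$I_9$ occurs in (24) as a coefficient of $A$, rather than $A^2$,
and ought therefore to be treated as accurately as is
possible. In $I_9$, $\rho_3$ includes delta functions on $\mathbf{r}_{12}$ and
$\mathbf{r}_{13}$. Using (29), and noting that several terms are zero
because $g_1$ is odd, we obtain


$$I_9=-2i\int\exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{32})p_2(r_{13})p_1(r_{12})p_1(r_{32})\,d\mathbf{r}_{21}d\mathbf{r}_{31}-2i\int\exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{12})p_1(r_{12})\,d\mathbf{r}_{21}. \tag{47}$$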
FIG. 5. $I_{9a}(k)$ is the Fourier transform of $p(r)$ times the velocity
potential for the backflow pattern. Like $I_{2a}(k)$ it is determined
in the most important region by the gross features of $p(r)$.


First we consider the second integral, since we must
know its value for all $k$ in order to do the first integral.
The integral, like $I_2$, can be performed in coordinate or
momentum space; after the angular integrations are
done, the result in coordinate space is


$$\int\exp(i\mathbf{k}_1\cdot\mathbf{r})\,g_1(\mathbf{r})p_1(r)\,d\mathbf{r}=(\mathbf{k}_1\cdot\mathbf{k}/k_1k)\,4\pi\rho_0\,i\,k_1^{-1}I_{9a}(k_1), \tag{48}$$

where¹⁵

$$I_{9a}(k)=(2.4k)^{-1}\sin(2.4k)+k\int_{2.4}^\infty p_2(r)j_1(kr)\,dr. \tag{49}$$


As before, the coordinate space formula proves sufficient
over the entire range of $k$. For small $k$, when the
numerical integral cannot be done accurately, its value
is so small as to be unimportant compared with
$(2.4k)^{-1}\sin(2.4k)$. Figure 5 gives the values of $I_{9a}(k)$.
As in the case of $I_2(k)$, some points were also computed
in momentum space, using data for $S(k)$ rather than
$p(r)$. The results were in good agreement with the
coordinate space computations.

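Since $p_1(r_{12})=\rho_0[1+p_2(r_{12})]$, the first integral in (47)
becomes


$$\rho_0\int\exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{32})p_1(r_{32})[1+p_2(r_{12})]p_2(r_{13})\,d\mathbf{r}_{21}d\mathbf{r}_{31}$$

$$=\rho_0\int\exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{32})p_1(r_{32})p_2(r_{13})\,d\mathbf{r}_{21}d\mathbf{r}_{31}+\rho_0\int\exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{32})p_1(r_{32})p_2(r_{12})p_2(r_{13})\,d\mathbf{r}_{21}d\mathbf{r}_{31}. \tag{50}$$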


¹⁵ Since $\int g_1(\mathbf{r})p_1(r)\,d\mathbf{r}=0$, one might expect the right side of
(48) to approach zero as $k_1$ becomes small. But $I_{9a}(k_1)$ approaches
unity for small $k_1$, and consequently the right side of (48)
approaches $\pm\infty$, depending on the angle between $\mathbf{k}$ and $\mathbf{k}_1$. The
trouble, as before, is resolved by noting that (48) and (49) are
wrong for $k_1<\epsilon$ ($g_1$ should really have a factor $e^{-\epsilon r}$ in it). In the
correct version of (49) the term $(2.4k)^{-1}\sin(2.4k)$ is replaced by
zero when $k\ll\epsilon$; hence $I_{9a}(k_1)$ goes as $k_1^2$ when $k_1\ll\epsilon$, and the
right side of (49) approaches zero. The "error" in (48) and (49)
has no effect on our computations, but is worth mentioning lest
the reader discover it and develop a distrust of the formulas.
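The first integral on the right in (50) can be evaluated
by writing $\exp(i\mathbf{k}\cdot\mathbf{r}_{12})=\exp(i\mathbf{k}\cdot\mathbf{r}_{13})\exp(i\mathbf{k}\cdot\mathbf{r}_{32})$ and
using the new integration variables $\mathbf{r}_{31}$ and $\mathbf{r}_{32}$. The
result is


$$\int\exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{32})p_1(r_{32})p_2(r_{13})\,d\mathbf{r}_{21}d\mathbf{r}_{31}=(4\pi i/k)S_1(k)I_{9a}(k). \tag{51}$$


In the second integral on the right in (50) we use the
integration variables $\mathbf{r}_{21}$ and $\mathbf{r}_{32}$; Fourier-analyzing
$p_2(r_{13})$, we obtain


$$\int\exp(i\mathbf{k}\cdot\mathbf{r}_{12})\,g_1(\mathbf{r}_{32})p_1(r_{32})p_2(r_{12})p_2(r_{13})\,d\mathbf{r}_{21}d\mathbf{r}_{31}=(i\pi\rho_0)^{-1}\int\!\!\int dk_1\,d\theta\,k_1S_1(k_1)S_1(|\mathbf{k}+\mathbf{k}_1|)I_{9a}(k_1)\cos\theta\sin\theta, \tag{52}$$

where $\theta$ is the angle between $\mathbf{k}$ and $\mathbf{k}_1$. If we let

$$u=|\mathbf{k}+\mathbf{k}_1|=(k^2+k_1^2+2kk_1\cos\theta)^{1/2},$$

the right side of (52) becomes

$$(2\pi\rho_0ik^2)^{-1}\int_0^\infty dk_1\,k_1^{-1}I_{9a}(k_1)S_1(k_1)\int_{|k-k_1|}^{k+k_1}S_1(u)(u^2-k^2-k_1^2)\,u\,du. \tag{53}$$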
Since

$$\int_0^\infty dk_1\int_{|k-k_1|}^{k+k_1}du=\int_0^\infty du\int_{|k-u|}^{k+u}dk_1,$$


a numerical integral like (53) can be done in several
different ways. We look for the way which converges
fastest, is least sensitive to information which we do
not have [like the value of $S_1(u)$ for large $u$], and does
not involve small differences of big terms. For example,
we should avoid dealing with the indefinite integral of
$S_1(u)u^3$, since it oscillates badly for large $u$; hence (53)
is not convenient to use as it stands. Probably the best
form of (53) is


$$(2\pi\rho_0ik^2)^{-1}\left[\int_0^\infty uS_1(u)(u^2-k^2)\,du\int_{|u-k|}^{u+k}dk_1\,I_{9a}(k_1)S_1(k_1)/k_1-\int_0^\infty uS_1(u)\,du\int_{|u-k|}^{u+k}dk_1\,k_1I_{9a}(k_1)S_1(k_1)\right]$$

$$=(2\pi\rho_0ik^2)^{-1}[I_{9b}(k)-I_{9c}(k)]. \tag{54}$$

In this form, the inside integral acts as a convergence
factor for the integrand of the outside integral, and the
answer is not sensitive to the values of $S_1(u)$ for large $u$.
The inside integral can be tabulated once and for all
as an indefinite integral; thus, the evaluation of (54)
involves only a single numerical integration for each
value of $k$.

If (41) were used to estimate $I_9(k)$, the result would
be


$$I_9(k)=-2i\int e^{i\mathbf{k}\cdot\mathbf{r}}p_1(r)\,d\mathbf{r}\int g_1(\mathbf{r})p_1(r)\,d\mathbf{r}=0.$$


As in the case of $I_3$, the question arises: how much of
the failure of $I_9(k)$ to vanish is real, and how much is
due to the failure of our approximate $\rho_3$ to satisfy (39)?
The analysis proceeds exactly as with $I_3$, and we find
that the quantity


$$I_{9d}(k)=(2/\pi\rho_0)\int p_1(r)j_1(kr)I_{3e}(r)\,dr \tag{55}$$


should be subtracted from (47). Combining Eqs.
(47)-(55), we obtain


$$I_9(k)=(8\pi\rho_0/k)I_{9a}(k)S(k)+(\pi k^2)^{-1}[I_{9c}(k)-I_{9b}(k)]-I_{9d}(k). \tag{56}$$


Table I gives values of $I_9(k)$, $I_{9b}(k)$, $I_{9c}(k)$, and
$I_{9d}(k)$. In the roton region the correction $I_{9d}$ is about
one-tenth as large as $I_9$ (except near $k=2.4$ Å⁻¹, where
$I_9$ is negligible anyway). Since we believe that $I_{9d}$
estimates, within an accuracy of 25%, the error due to
the Kirkwood approximation, the residual error in $I_9$
due to this source is probably only 2 or 3%.

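Using (29), we can write $I_5$ as


$$I_5=\int p_1(r)[\nabla g_1(\mathbf{r})]^2\,d\mathbf{r}+\left|\int e^{i\mathbf{k}\cdot\mathbf{r}}p_1(r)\nabla g_1(\mathbf{r})\,d\mathbf{r}\right|^2+\int\exp(i\mathbf{k}\cdot\mathbf{r}_{23})\,\nabla g_1(\mathbf{r}_{21})\cdot\nabla g_1(\mathbf{r}_{31})\,p_1(r_{21})p_1(r_{31})p_2(r_{23})\,d\mathbf{r}_{21}d\mathbf{r}_{31}$$

$$=I_{5a}+I_{5b}+I_{5c}. \tag{57}$$


The oscillatory factor $\exp(i\mathbf{k}\cdot\mathbf{r}_{23})$ makes $I_5$ a likely
candidate for (42), which says¹⁶ $I_5\approx I_{5a}$. At the cost of
considerable labor we have computed $I_{5b}+I_{5c}$ when
$k=2$ Å⁻¹ and when $k=1.2$ Å⁻¹, and verified that it
could indeed have been neglected.

$I_{5a}$ has been evaluated in connection with $I_4$. From
(31) and (25) we obtain


$$I_{5b}(k)=[8\pi\rho_0I_{2a}(k)]^2.$$


$I_{5c}(k)$ can be evaluated by the same methods used for $I_9$.
The resulting expression is similar in form to (54), and
will not be exhibited here. Laborious computations give

$$I_{5b}(2\ \text{Å}^{-1})+I_{5c}(2\ \text{Å}^{-1})=-0.0010\ \text{Å}^{-6}$$

and

$$I_{5b}(1.2\ \text{Å}^{-1})+I_{5c}(1.2\ \text{Å}^{-1})=0.0010\ \text{Å}^{-6}.$$


¹⁶ Compare (42) with the definition of $I_5$ in (25).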
TABLE I. Values of numerical integrals involved in $I_9(k)$.


k (Å⁻¹)   I9b(k) (Å⁻⁴)   I9c(k) (Å⁻⁴)   I9d(k) (Å⁻²)   I9(k) (Å⁻²)
          [See (54)]     [See (54)]     [See (55)]     [See (56)]
1.2       −0.019          0.172          0.0076         0.0444
1.6       −0.387          0.089         −0.0002        −0.0054
1.8       −0.406          0.031         −0.0035        −0.0386
2.0       −0.364         −0.017         −0.0047        −0.0518
2.2       −0.086         −0.048         −0.0039        −0.0307
2.4        0.155         −0.054         −0.0020        −0.0044

Since $I_{5a}=0.0119$ Å⁻⁶, the complete omission of $I_{5b}$ and
$I_{5c}$ would not cause a serious error in the roton spectrum.
We shall omit these terms while locating the minimum
of the spectrum, and shall reinstate them in the final
computation of $\Delta$.

$I_6$ is estimated by (41) as zero. As in the case of $I_3$,
it is important to find out whether $I_6$ is really small
enough to be neglected. To obtain a more accurate
estimate, we write


$$I_6=I_{6a}+I_{6b},$$


where


$$I_{6a}=(2i/k)\int e^{i\mathbf{k}\cdot\mathbf{r}}g_1(\mathbf{r})\,\mathbf{k}\cdot\nabla g_1(\mathbf{r})\,p_1(r)\,d\mathbf{r},$$

$$I_{6b}=(2i/k)\int\exp(i\mathbf{k}\cdot\mathbf{r}_{13})\,g_1(\mathbf{r}_{12})\,\mathbf{k}\cdot\nabla g_1(\mathbf{r}_{13})\,p_1(r_{31})p_1(r_{21})p_2(r_{23})\,d\mathbf{r}_{31}d\mathbf{r}_{21}.$$


According to the discussion preceding (41), $I_{6a}$ and $I_{6b}$
will cancel each other almost completely, so $I_6$ is some
fraction (probably about one-fourth) of $I_{6a}$. Performing
the angular integrations in $I_{6a}$, we obtain


$$I_{6a}=8\pi\int dr\,r^{-3}p_1(r)[-2(kr)^{-1}\cos(kr)+8(kr)^{-2}\sin(kr)+18(kr)^{-3}\cos(kr)-18(kr)^{-4}\sin(kr)].$$

A rough numerical integration gives

$$I_{6a}(2\ \text{Å}^{-1})\approx-0.003\ \text{Å}^{-5}.$$


Hence $I_6(2\ \text{Å}^{-1})\approx-0.001$ Å⁻⁵, and $kI_6\approx-0.002$ Å⁻⁶
when $k=2$ Å⁻¹. Since $k^2I_3+I_4+I_{5a}\approx0.040$ Å⁻⁶ near
$k=2$ Å⁻¹, we can neglect $kI_6$ without much error.
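The bracket in the expression for $I_{6a}$ can be cross-checked against a direct numerical angular integration. The sketch below assumes $g_1(\mathbf{r})=\cos\theta/r^2$, so that $\hat{\mathbf{k}}\cdot\nabla g_1=(1-3\cos^2\theta)/r^3$; under that assumption the angular part of $I_{6a}$ at fixed $x=kr$ is $2i\cdot2\pi\int_{-1}^{1}e^{ix\mu}(\mu-3\mu^3)\,d\mu$, which should equal $8\pi$ times the bracket. Function names are hypothetical.

```python
import cmath
import math


def bracket(x):
    """-2 cos(x)/x + 8 sin(x)/x^2 + 18 cos(x)/x^3 - 18 sin(x)/x^4, with x = k r."""
    return (-2.0 * math.cos(x) / x + 8.0 * math.sin(x) / x**2
            + 18.0 * math.cos(x) / x**3 - 18.0 * math.sin(x) / x**4)


def angular_integral(x, n=2000):
    """2i * 2*pi * integral_{-1}^{1} exp(i x mu) (mu - 3 mu^3) d mu, by Simpson's rule (n even)."""
    h = 2.0 / n
    total = 0j
    for m in range(n + 1):
        mu = -1.0 + m * h
        w = 1 if m in (0, n) else (4 if m % 2 else 2)   # Simpson weights 1,4,2,...,4,1
        total += w * cmath.exp(1j * x * mu) * (mu - 3.0 * mu**3)
    return 2j * 2.0 * math.pi * total * h / 3.0


if __name__ == "__main__":
    for x in (0.5, 2.0, 4.8):
        print(x, angular_integral(x).real, 8.0 * math.pi * bracket(x))
```

The imaginary part of the numerical integral vanishes by symmetry, so only the real parts need to be compared.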
Estimation of $I_7$ by (41) gives


$$I_7\approx-2(-4\pi\rho_0/3)\,8\pi\rho_0I_{2a}(k)=\tfrac13(8\pi\rho_0)^2I_{2a}(k). \tag{58}$$


Considerations similar to those used in estimating $I_6$
show that (58) is accurate to better than 0.001 Å⁻⁶.
When $k$ is in the roton region, the major portion of $I_{10}$
comes from the term $\delta(\mathbf{r}_{12})\rho_3(2,3,4)$, which is contained
in $\rho_4(1,2,3,4)$. When $r_{12}\neq0$, the oscillations of $e^{i\mathbf{k}\cdot\mathbf{r}_{12}}$
make the contribution to the integral very small.¹⁷ If
we neglect all of $\rho_4$ except $\delta(\mathbf{r}_{12})\rho_3(2,3,4)$, we are making
essentially the approximation which was used in $I_5$ and
was shown to be very accurate there. We then obtain


$$I_{10}\approx I_3, \tag{59}$$


and our evaluation of the integrals in (25) is completed.
The oscillation argument which leads to (59) fails
when $k$ is very small. For any value of $k$ the requirement
that the normalization integral $\mathcal{I}$ have no roots
when considered as a polynomial in $A$ leads to the
inequality

$$I_{10}>I_9^2/4I_8. \tag{60}$$


The failure of (59) for small $k$ is most easily seen by
noting that $I_3$ becomes much smaller than the right
side of (60) as $k\to0$.
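The step from the normalization integral to (60) can be spelled out as a short worked derivation. This is a sketch that assumes $I_8$, $I_9$, and $I_{10}$ are real with $I_8=S(k)>0$. The value $S(2\ \text{Å}^{-1})\approx1.26$ used in the numerical check is not quoted in the text; it is inferred here from the ratio of the $A^2$ coefficients in the denominators of (61) and (62), and should be read as an assumption.

```latex
\begin{aligned}
\mathcal{I}/N &= I_8 + (kI_9)\,A + (k^2 I_{10})\,A^2 > 0 \quad \text{for all real } A \\
&\Longrightarrow\quad (kI_9)^2 - 4\,I_8\,k^2 I_{10} < 0 \\
&\Longrightarrow\quad I_{10} > I_9^2 / 4I_8 . \\[6pt]
k = 2\ \text{\AA}^{-1}:\quad
\frac{I_9^2}{4I_8} &\approx \frac{(0.0518)^2}{4 \times 1.26}
\approx 5 \times 10^{-4}\ \text{\AA}^{-4}
\;<\; I_3 = 0.0049\ \text{\AA}^{-4} .
\end{aligned}
```

In the roton region the estimate $I_{10}\approx I_3$ thus satisfies the inequality with a margin of about a factor of ten. As $k\to0$ the right side grows because $S(k)\to0$, which is the failure noted in the text.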

For $k>1.2$ Å⁻¹, the coefficient of $A^2$ in (23) is
estimated well by

$$k^2I_3+I_4+I_{5a}+\tfrac13(8\pi\rho_0)^2I_{2a}(k).$$


We have omitted $I_{5b}$, $I_{5c}$, and $kI_6$, and have approximated
$I_7$ by (58). $I_{5c}$ and $kI_6$ have both been shown to
be very small, and are both difficult to compute;
omission of these terms simplifies the computation of the
energy spectrum, and does not significantly change the
location of the minimum. $I_{5b}$ has been omitted for the
sake of consistency, since it is even smaller than $I_{5c}$.
After locating the minimum, we shall reinstate the
omitted terms in our final computation of $\Delta$. We estimate
$I_{10}$ by (59).


6. THE IMPROVED ENERGY SPECTRUM


With the omissions and approximations mentioned
in the preceding paragraph, we obtain


$$\frac{E(k)}{E_1(k)}\le\frac{1+A[I_1+I_2]+A^2[k^2I_3+I_4+I_{5a}+\tfrac13(8\pi\rho_0)^2I_{2a}]}{1+A[kI_9/I_8]+A^2[k^2I_3/I_8]}$$

$$=\frac{1+A[0.186+1.117I_{2a}(k)]+A^2[0.0246+0.0049k^2+0.1087I_{2a}(k)]}{1+A[kI_9(k)/S(k)]+A^2[0.0049k^2/S(k)]}=\frac{E_2(k)}{E_1(k)}. \tag{61}$$


$E(k)$ is the true lowest energy of a state having
momentum $\hbar k$; $E_1(k)$ is the energy computed with the
wave function (5), i.e., $E_1(k)=\hbar^2k^2/2mS(k)$; $E_2(k)$ is the
spectrum we have computed, subject to the omissions
and approximations noted above.

For $k=2$ Å⁻¹, (61) becomes


$$\frac{E_2(k)}{E_1(k)}=\frac{1+0.1494A+0.0406A^2}{1-0.0822A+0.0156A^2}. \tag{62}$$


The first attractive feature of (62) is that the coefficients
of $A$ in the numerator and denominator have
opposite signs, so that the denominator increases while
the numerator decreases. The optimal value of $A$ is
≈ −3.5, which is very close to the classical value $A=-3.6$.
The minimum value of $E_2(2)/E_1(2)$ is 0.659,
corresponding to $E_2(2\ \text{Å}^{-1})/\kappa=12.6$°K.

Computation of the coefficients in (61) and minimization
of the resulting expressions yield the results
given in Table II. We estimate the minimum value of
$E_2(k)/\kappa$ as 12.0°K, corresponding to $k=1.85$ Å⁻¹. If
we estimate $I_{5b}+I_{5c}$ (1.85 Å⁻¹) and $I_6$ (1.85 Å⁻¹) by the
values of the corresponding integrals at $k=2$ Å⁻¹, we
find that the coefficient of $A^2$ in the numerator of (61)
should be diminished by 0.003 Å⁻⁶ when $k=1.85$ Å⁻¹.
This change lowers the energy by 0.5°K and we obtain
the following as the final result of this computation¹⁸:


$$p_0/\hbar=1.85\ \text{Å}^{-1},\qquad\Delta/\kappa=11.5\text{°K}. \tag{63}$$


¹⁷ If the integral

$$J(r_{12})=(\rho_0)^{-2}\int\rho_4(1,2,3,4)\,g_1(\mathbf{r}_{31})g_1(\mathbf{r}_{42})\,d\mathbf{r}_3d\mathbf{r}_4$$

were to become large compared with $J(0)$ as $r_{12}$ grows large, then
the growth of $J$ might offset the oscillations of $\exp(i\mathbf{k}\cdot\mathbf{r}_{12})$ and
(59) would be wrong. It is easy to see, however, that as 1 and 2
go farther apart, $J(r_{12})$ approaches $[\int p_1(r)g_1(\mathbf{r})\,d\mathbf{r}]^2$, which is
zero. Since the factorization of $J$ into a product of two integrals
becomes more nearly exact as $r_{12}$ increases, it is very plausible
that $J$ decreases with increasing $r_{12}$ and is largest when 1 and 2
coincide. In the latter case, $J$ is equal to $I_3$.
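The minimisation of (62) over $A$ can be reproduced with a one-dimensional search. The following is a minimal sketch using a golden-section search; the search method, bracket, and function names are illustrative choices made here and are not taken from the paper. It assumes (62) has the form $(1+0.1494A+0.0406A^2)/(1-0.0822A+0.0156A^2)$.

```python
import math


def ratio_62(A):
    """E2/E1 at k = 2 A^-1 as a function of the backflow strength A (A^3), Eq. (62)."""
    return (1.0 + 0.1494 * A + 0.0406 * A**2) / (1.0 - 0.0822 * A + 0.0156 * A**2)


def golden_section_min(f, lo, hi, tol=1e-6):
    """Locate the minimum of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


if __name__ == "__main__":
    A_opt = golden_section_min(ratio_62, -6.0, 0.0)
    print(f"A_opt = {A_opt:.2f} A^3, E2/E1 = {ratio_62(A_opt):.3f}")
```

With the rounded coefficients of (62) the minimum is shallow: the search lands near $A\approx-3.4$ Å³, and the ratio there and at the quoted $A\approx-3.5$ Å³ both round to 0.659.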


It is evident from Table II that $E_2(k)/E_1(k)$ passes
through a minimum near $k=1.2$ Å⁻¹. In any correct
theory $E_2(k)/E_1(k)$ must approach unity for very small
$k$ because we cannot lower the energy of a phonon. By
studying the behavior of the integrals in (25) for very
small $k$, we have verified that our spectrum does indeed

TABLE II. The energy spectrum E2(k) computed from (61).*


k (Å⁻¹)    A_opt (Å³)    E2(k)/E1(k)    E2(k)/κ (°K)
1.2        −3.6          0.569          14.08
1.6        −3.7          0.576          13.44
1.8        −3.6          0.594          12.00
2.0        −3.5          0.659          12.59
2.2        −3.0          0.730          16.86
2.4        −2.5          0.791          24.04


* E2(k) is essentially the spectrum computed here. Some further small corrections lower the minimum energy to 11.5°K. E1(k) is the spectrum previously computed with a simpler wave function. A_opt is the optimal value of the strength of the return flow in the wave function (21), and is chosen so as to minimize E2(k). The values of A_opt are close to the “classical” value −3.6 Å³ computed from a current conservation argument.
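
As a reading aid for Table II (not part of the original computation), the tabulated columns can be combined to recover the simpler spectrum E1(k)/κ, since E1 = E2 divided by the ratio E2/E1. The short Python sketch below does only that arithmetic on the printed values. The names TABLE_II, implied_e1 and tabulated_minimum are illustrative choices, not notation from the paper.

```python
# Table II as printed: k (1/Angstrom), E2/E1, E2/kappa (Kelvin).
TABLE_II = [
    (1.2, 0.569, 14.08),
    (1.6, 0.576, 13.44),
    (1.8, 0.594, 12.00),
    (2.0, 0.659, 12.59),
    (2.2, 0.730, 16.86),
    (2.4, 0.791, 24.04),
]


def implied_e1(table):
    """Recover E1(k)/kappa in Kelvin by dividing E2/kappa by the ratio E2/E1."""
    return [(k, e2 / ratio) for k, ratio, e2 in table]


def tabulated_minimum(table):
    """Return the (k, E2/kappa) pair with the lowest tabulated E2/kappa."""
    return min(((k, e2) for k, _, e2 in table), key=lambda pair: pair[1])


E1 = implied_e1(TABLE_II)
K_MIN, E2_MIN = tabulated_minimum(TABLE_II)

for k, e1 in E1:
    print(f"k = {k:.1f} 1/A   E1/kappa = {e1:.2f} K")
print(f"lowest tabulated E2/kappa: {E2_MIN:.2f} K at k = {K_MIN:.1f} 1/A")
```

The lowest tabulated entry falls at k = 1.8 Å⁻¹, next to the interpolated minimum at k = 1.85 Å⁻¹ quoted in the text.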


18 A similar result has been obtained from perturbation theory 
by C. G. Kuper, Proc. Roy. Soc. (London) 233, 223 (1955). As 
he points out, the perturbation theory is not reliable because of 
the large size of the energy change. 
have the correct limiting behavior.¹⁹ A more direct way of seeing the result is to look at (21) [or (20)] when k is very small. The correlation term g(r_ij) is significant only when atoms i and j are fairly close. But in this case exp(ik·r_i) and exp(ik·r_j) are almost equal because k is small, and hence the correlation terms cancel almost completely because g is odd. Thus, (21) is almost the same as (5) for small k, and leads to the same energy.

For high k, E2(k)/E1(k) approaches unity because the approximation exp[i Σ g(r_ji)] ≈ 1 + i Σ g(r_ji) fails badly. We noted earlier that if we could compute with the wave function (20), the interference between terms with different i would vanish when k is large. If E3(k) is the energy arising from (20), we should find that for large k,

\[ E_3(k)/E_1(k) = 0.65, \]

as in the foreign atom problem. It is amusing to conjecture on how much E3(k) might lie below E2(k) when k = 1.8 Å⁻¹. The accuracy of the approximation exp(i Σ g) ≈ 1 + i Σ g in the foreign atom problem (see reference 11) suggests that E3 may be 0.5° less.

The energy spectrum E2(k) is shown in Fig. 6 as curve A. We have also plotted B: E1(k) = ħ²k²/2mS(k); C: de Klerk, Hudson, and Pellam’s spectrum [Eq. (4)]; D: spectrum of the form (2), with Δ/κ = 9.6°, p0/ħ = 1.85 Å⁻¹ and μ chosen so that μ^(1/2) p0² has the same value as in C. (The specific heat depends on μ and p0 only through the product μ^(1/2) p0².) From the curvatures of A, C, and D it is clear that our spectrum E2(k) predicts too small a value of μ. In a computation of this sort, however, it is doubtful that the curvature has much significance.

Curve A brings out the fact that the “hump” between 
the phonon and roton regions is not nearly so high as 
one might expect from (1). Consequently, when com- 
puting the specific heat or normal fluid density at 
temperatures high enough to excite rotons, it is probably 
also necessary to take into account the deviations of the 
phonon spectrum from linearity (and also the devia- 
tions of the roton spectrum from pure parabolic be- 
havior). Qualitatively, it appears that such corrections 
might improve the agreement between the theoretical 
spectrum and the specific heat and second sound data. 


7. DISCUSSION OF ACCURACY 


Initially, the major potential sources of error in this computation were (a) the absence of information about the true form of ρ3(1,2,3); (b) absence of information about ρ4(1,2,3,4); (c) uncertainties in the data for S(k) at large k (see Appendix A).

The uncertainty caused by (a) has, we think, been 
minimized by the introduction of a correction to the 
Kirkwood approximation. The errors remaining in J; 


19 If g(r) falls off sharply at large r, the analysis is simple. In our case the analysis is complicated by the slowness with which k·r/r³ falls off, but the ultimate result is the desired one.
FIG. 6. The energy spectrum of excitations. Curve A is the spectrum E2(k) computed from Eq. (61). Curve B is the spectrum E1(k) computed with the simpler wave function (5). Curve C is the Landau-type spectrum used by de Klerk et al.⁴ to fit the second sound and specific heat data. Curve D is a Landau-type spectrum with p0 taken the same as in A, and μ and Δ chosen to fit the specific heat data. For small k, all curves are asymptotic to the line
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and J» after the correction are probably less than three 
percent; the resultant error in Δ/κ is less than 0.3°.

The approximation (59), which gives rise to the 
error (b), ought to be about as accurate as the approxi- 
mation Js~Jsa, since both approximations are based 
on the same oscillation argument. The latter approxi- 
mation was found accurate to better than 10% in the 
roton region. A ten percent change in Jy) would alter 
the value of Δ/κ by 0.2°; we regard this number as a fair estimate of the error caused by (b).

Considerable pains were taken to arrange the numerical work in such a way that the answers are insensitive to the behavior of S(k) for large k. The residual error due to (c) is found mainly in the coefficient of A² in the numerator of (61). This coefficient may be in error by 5%, and the resulting error in Δ/κ might be as much as 0.4°.

We consider the value Δ/κ = 11.5° to be accurate within 0.6°, i.e., the lowest energy computable with the wave function (21) is between 10.9° and 12.1°.
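
The text does not say how the three error estimates (0.3°, 0.2°, and 0.4°) were combined into the overall figure of 0.6°. As a hedged illustration only, the Python sketch below compares two common conventions: adding the errors in quadrature, as is usual for independent errors, and adding them linearly. The assumption that the errors are independent is ours, not the authors'.

```python
import math

# Error estimates in Delta/kappa (Kelvin) quoted in the text for sources (a), (b), (c).
ERRORS_K = {"a": 0.3, "b": 0.2, "c": 0.4}


def quadrature_sum(errors):
    """Combine errors assumed independent: square root of the sum of squares."""
    return math.sqrt(sum(e * e for e in errors))


def linear_sum(errors):
    """Worst-case combination: plain sum of the magnitudes."""
    return sum(abs(e) for e in errors)


QUAD = quadrature_sum(ERRORS_K.values())
LIN = linear_sum(ERRORS_K.values())

print(f"quadrature: {QUAD:.2f} K, linear: {LIN:.2f} K")
```

The quadrature value lies close to the quoted 0.6°, while the linear sum is noticeably larger; this is suggestive but does not establish which convention was actually used.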

A wave function which gives a good value of the 
energy may, of course, be inaccurate for calculation of 
other properties of the system. On the other hand this 
function was chosen by a physical argument, and 
achieved a very considerable increase in the accuracy 
of the energy, without in fact using any variable 
parameters. It might be argued that some of this 
increase should be associated simply with the fact that 
we have one extra parameter A to vary. But had we 
used the A determined by the physical argument (—3.6) 
we would have obtained practically the same energy as 
if we let it vary. 

For this reason we believe that the wave function
(20) [or for practical calculations (21)] not only gives
the energy well but is a reasonably accurate physical 
description of the excitations. On the basis of this 
optimistic hope, (21) is currently being employed in the 
calculation of other properties of helium. 
APPENDIX A. DATA USED FOR S(k) AND p(r)


The curve for S(k) given in Fig. 1 is essentially that obtained from x-ray scattering by Reekie and Hutchison.’.° The proper normalization of the data can, in principle, be determined from the fact that S(k) → 1 as k → ∞. According to Goldstein and Reekie,® “limitations inherent in the very low scattering cross section of liquid helium and the experimental technique have prevented effective exploration (of the range k > 6 Å⁻¹).” Since S(k) is still oscillating strongly at k = 6 Å⁻¹, the normalization of S(k) is uncertain by a few percent. For k > 2.5 Å⁻¹, the percent error in S(k) − 1 is large, and our computations would be totally unreliable if the integrals had not been set up in such a way as to be insensitive to the behavior of S(k) − 1 for large k. We feel that S(k) ought to oscillate about its asymptotic value, and have therefore taken S(k) = 1 at a point whose ordinate is the average of the values of S(k) at the minimum near 3.4 Å⁻¹ and the maximum near 4.6 Å⁻¹. With Reekie’s normalization, S(k) is unity at an ordinate much nearer to the minima of the oscillations. Our normalization maximizes the cancellation at large k when we are performing integrals whose integrand contains S(k) − 1 as a factor. Since S(k) is the Fourier transform of p(r), we find [see (34)]


\[ -2\pi^2\rho_0 = \int_0^{\infty} k^2\,[S(k)-1]\,dk. \tag{64} \]


The relation (64) might serve as a test of the normalization of S(k), were it not for the fact that the numerical integral gives no sign of converging if we cut it off at k = 6 Å⁻¹. The left side of (64) is equal to −0.43 Å⁻³. With our normalization, integration of the right side out to k = 6 Å⁻¹ gives +0.44 Å⁻³, but the integrand is still oscillating wildly and there is a chance of ultimately converging to a correct answer. With Reekie’s normalization, integration of the right side out to k = 6 Å⁻¹ gives a positive value much larger than +0.44 Å⁻³, and the contribution from k > 6 Å⁻¹ will also be positive unless the successive minima of S(k) cease to be closer and closer to the asymptotic value of unity. At any rate, the consistency of the results which we have obtained by performing the same integral in coordinate and momentum space convinces us that our S(k), which is 0.97 times Reekie’s, is sufficiently accurate for the present computations.
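
The test (64) is simple to state numerically. The following Python sketch evaluates the left side for an assumed number density and applies a trapezoidal rule to the right side over whatever range of k is tabulated. The density value RHO0 = 0.0218 Å⁻³ is our assumption, chosen because it reproduces the quoted −0.43 Å⁻³; the function names and the synthetic S(k) used to exercise the integrator are illustrative and are not the experimental data.

```python
import math

RHO0 = 0.0218  # assumed number density of liquid helium, 1/Angstrom^3 (not stated in this section)


def lhs_64(rho0):
    """Left side of (64): -2 pi^2 rho0, in 1/Angstrom^3."""
    return -2.0 * math.pi ** 2 * rho0


def truncated_rhs_64(k_values, s_values):
    """Trapezoidal estimate of the integral of k^2 [S(k) - 1] dk over the tabulated range."""
    total = 0.0
    for i in range(1, len(k_values)):
        f_prev = k_values[i - 1] ** 2 * (s_values[i - 1] - 1.0)
        f_curr = k_values[i] ** 2 * (s_values[i] - 1.0)
        total += 0.5 * (f_prev + f_curr) * (k_values[i] - k_values[i - 1])
    return total


LHS = lhs_64(RHO0)

# Synthetic check of the integrator only: S(k) - 1 = -exp(-k) integrates to -2 over 0..infinity.
K_GRID = [0.01 * i for i in range(4001)]          # 0 to 40 in steps of 0.01
S_TOY = [1.0 - math.exp(-k) for k in K_GRID]
RHS_TOY = truncated_rhs_64(K_GRID, S_TOY)

print(f"left side of (64): {LHS:.3f} 1/A^3")
print(f"toy integral: {RHS_TOY:.4f} (exact value -2)")
```

With real diffraction data the cutoff at k = 6 Å⁻¹ leaves the integrand still oscillating, which is exactly the difficulty described above; the sketch does not remove it.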

Most of the curve in Fig. 1 represents data taken at 2.06°K. According to reference 9, there is very little change in the values of S(k) for k > 0.9 Å⁻¹ as the temperature decreases from 2.5°K to 1.25°K. Therefore, in the range k > 0.9 Å⁻¹, it is probably safe to represent the zero-temperature structure factor S(k) by the data taken at 2.06°K. For k < 0.9 Å⁻¹, the temperature dependence of S(k) is more important, and it is necessary to extrapolate S(k) linearly to zero by using (65). We have done this, using a slope about 20% higher than the theoretical value in order to join the experimental data smoothly. The error thus introduced is small.

Reekie and Hutchison’ have computed p(r) for r < 6 Å by inverting their data for S(k). The curve for p(r) which we have given in Fig. 2 is obtained from one of their graphs²⁰ and, as has been previously mentioned, seems consistent with our curve for S(k). The numerical inversion of diffraction data is not unambiguous, since the integrand of the relevant numerical integral is not small at the cut-off value k = 6 Å⁻¹. Furthermore, an arbitrary cutoff procedure must be used to make p(r) vanish for r < 2.4 Å. More recently, Goldstein and Reekie® have employed an IBM 701 calculator to compute p(r) out to 20 Å, using the data of Reekie and Hutchison. Their article was not published until after the completion of the present calculation; the authors state that the results out to 6 Å “fully confirm” the results of reference 9. Goldstein and Reekie apply the integral test (69) to their curves for p(r) and find satisfactory results. Since the integrands do not become small until r ≳ 13 Å, we found it impossible to apply the test to the curve in Fig. 2.


APPENDIX B. IDENTITIES SATISFIED 
BY p(r) AND S(k) 


To understand the behavior of S(k) for small k, we note that as long as we are concerned with disturbances of long wavelength (small k) the liquid can be treated as a continuous compressible medium. If ρ(r,t) is the number density in such a medium and we define the normal coordinates

\[ q_{\mathbf{k}} = \int \rho(\mathbf{r},t)\,\exp(i\mathbf{k}\cdot\mathbf{r})\,d\mathbf{r}, \]

then the energy is

\[ E = \tfrac{1}{2}\sum_{\mathbf{k}} m_k\left(\dot q_{\mathbf{k}}\dot q_{\mathbf{k}}^{*} + \omega_k^{2}\,q_{\mathbf{k}}q_{\mathbf{k}}^{*}\right), \]

where ω_k = ck and m_k = m/Nk². Quantum mechanically, ρ(r) is replaced by the operator Σ_i δ(r − r_i) and q_k then goes over into the quantum-mechanical normal coordinate q_k = Σ_i exp(ik·r_i). S(k) is just 1/N times the expectation value of |q_k|². Since the average values of the potential and kinetic energies are equal for a harmonic oscillator, it follows that S(k) = ⟨E_k⟩/mc², where ⟨E_k⟩ is the average energy of the oscillator representing sound of wave number k. When T = 0, all the oscillators are in their ground states, and hence ⟨E_k⟩ = ½ħω_k = ½ħck and

\[ S(k) = \hbar k/2mc \quad (\text{small } k). \tag{65} \]


When T ≠ 0, the oscillator representing phonons of
wave number k is no longer necessarily in its ground


20 Figure 2 was obtained by dividing the data of reference 8, Fig. 1, by r². There are slight discrepancies between the resultant curve for p(r) and the curve given in reference 9, Fig. 3. Errors of this magnitude in p(r) would have a negligible effect on our results.
state, but may be in its nth excited state with probability proportional to exp(−E_n/κT). It follows that (β = 1/κT)

\[ S(k) = (\hbar k/2mc)\coth\left(\tfrac{1}{2}\beta\hbar ck\right) \tag{66} \]
\[ \phantom{S(k)} = (\beta mc^{2})^{-1} + (\beta\hbar^{2}/12m)\,k^{2} - \cdots. \tag{66a} \]

From (66) there follows immediately the famous formula


\[ \lim_{k\to 0} S(k) = \rho_0\,\kappa T\,\chi_T, \tag{67} \]

where ρ0 is the number density and χ_T the isothermal compressibility of the liquid. When ½ħck becomes greater than κT, (66) becomes essentially linear in k. Strictly speaking, however, S(k) starts quadratically from a nonzero value except when T = 0. The possibility of a linear behavior of S(k) for small k, as predicted by (65) when T = 0, has been sometimes questioned on the basis of (10). From (10) it follows that


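\[ S(k) - 1 = 4\pi\int_{0}^{\infty}[p(r)-\rho_0]\,(kr)^{-1}\sin(kr)\,r^{2}\,dr. \tag{68} \]

Since p(r) − ρ0 approaches zero for large r, it is argued that it is legitimate to expand (kr)⁻¹ sin kr as 1 − (kr)²/6 + ···. Integrating term by term, one finds

\[ S(k) - 1 = C_1 + C_2 k^{2} + \cdots, \tag{68a} \]

where

\[ C_1 = 4\pi\int_{0+}^{\infty}[p(r)-\rho_0]\,r^{2}\,dr, \qquad C_2 = -\frac{2\pi}{3}\int_{0+}^{\infty}[p(r)-\rho_0]\,r^{4}\,dr. \]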


Hence it appears that S(k) always starts quadratically in k. The fallacy in the argument lies in the fact that p(r) may not approach its asymptotic value fast enough, and the expansion may be meaningless. For
example, if p(r)—po decreases as r~* for large r, (68) 
converges perfectly well but C, and C, are infinite. 
When T = 0, p(r) − ρ0 falls off slowly enough to invalidate the expansion, and (65) is correct; at any finite temperature p(r) − ρ0 ultimately falls off exponentially and the expansion (68a) is legitimate. One might think that all the coefficients of (68a) can be determined by comparison with (66a); this is incorrect because (66) is wrong for large k. Using (67) and (68), however, we do obtain the important result


\[ 1 + 4\pi\int_{0+}^{\infty}[p(r)-\rho_0]\,r^{2}\,dr = \rho_0\,\kappa T\,\chi_T, \tag{69} \]

and when T = 0

\[ 1 + 4\pi\int_{0+}^{\infty}[p(r)-\rho_0]\,r^{2}\,dr = 0. \tag{70} \]
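
To make the limits of (65), (66) and (67) concrete, the Python sketch below evaluates the phonon-region structure factor and checks its two limits numerically. It is a minimal illustration under stated assumptions: the sound speed of 240 m/sec and the physical constants are standard values supplied by us, not quantities taken from this appendix, and the function names are hypothetical. The formula applies only in the long-wavelength region where the continuum argument holds.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J/K (the constant written as kappa in the text)
M_HE4 = 6.6464731e-27    # kg, mass of a helium-4 atom
C_SOUND = 240.0          # m/s, assumed speed of sound
PER_ANGSTROM = 1.0e10    # converts 1/Angstrom to 1/m


def s_zero_temperature(k):
    """Eq. (65): S(k) = hbar k / 2 m c, with k in 1/m."""
    return HBAR * k / (2.0 * M_HE4 * C_SOUND)


def s_finite_temperature(k, temperature):
    """Eq. (66): S(k) = (hbar k / 2 m c) coth(hbar c k / 2 kappa T), with k in 1/m."""
    x = 0.5 * HBAR * C_SOUND * k / (K_B * temperature)
    return s_zero_temperature(k) / math.tanh(x)


def s_long_wavelength_limit(temperature):
    """Leading term of (66a): kappa T / m c^2, the k -> 0 value appearing in (67)."""
    return K_B * temperature / (M_HE4 * C_SOUND ** 2)


SLOPE = s_zero_temperature(1.0 * PER_ANGSTROM)               # S at k = 1 per Angstrom, T = 0
SMALL_K = s_finite_temperature(1.0e-6 * PER_ANGSTROM, 2.0)   # nearly k = 0 at 2 K
COLD = s_finite_temperature(0.5 * PER_ANGSTROM, 0.1)         # low temperature, moderate k

print(f"zero-temperature slope: S = {SLOPE:.4f} at k = 1 per Angstrom")
print(f"k -> 0 value at 2 K: {SMALL_K:.5f}, kappa T / m c^2 = {s_long_wavelength_limit(2.0):.5f}")
```

At low temperature and moderate k the hyperbolic cotangent is effectively unity and (66) reduces to the linear law (65); at fixed finite temperature the k → 0 value is κT/mc², which is nonzero, in line with the remark that S(k) starts quadratically from a nonzero value except when T = 0.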
The result (69) can also be obtained by rather simple classical arguments. It follows directly from the definition of p(r) that the left side of (69) is ⟨(N − N̄)²⟩_Av/N̄, where N is the number of atoms in a large subvolume of the liquid, and the bar denotes “average,” but statistical mechanics shows that ⟨(N − N̄)²⟩_Av/N̄ = ρ0κTχ_T, whence (69) follows.

One might think that (70) is a simple consequence of the definition of p(r). For if an atom is known to be at r1, the probability that there is an atom at r2 is ρ2(r1,r2)/ρ1(r1). If r1 is not near the surface of the liquid, then ρ1(r1) = ρ0; if r1 and r2 are both far from the surface, then ρ2(r1,r2) = ρ0 p(r12). If we integrate ρ2(r1,r2)/ρ1(r1) over all locations r2, excluding the point r1, the answer must be exactly N − 1. But if we integrate ρ1(r) over all positions of r, the answer is exactly N. Consequently

\[ \int_{\mathbf{r}_2\neq\mathbf{r}_1}\left[\rho_2(\mathbf{r}_1,\mathbf{r}_2)/\rho_1(\mathbf{r}_1) - \rho_1(\mathbf{r}_2)\right]d\mathbf{r}_2 = -1. \tag{71} \]


If we take r1 far from the surface, ρ1(r1) can be replaced by ρ0. Furthermore, the integrand is appreciable only when r2 is near r1, in which case ρ2(r1,r2)/ρ1(r1) − ρ1(r2) = p(r12) − ρ0 (there are no complications at the surface of the liquid since the surface corrections to both terms of the integrand are identical). Then (71) reduces exactly to (70).

Something must be wrong with the preceding argument at finite temperatures, since (70) is false if T ≠ 0. The difficulty lies in the fact that, at finite T, the limiting value of ρ2(r1,r2)/ρ0 for large r12 is not ρ0, but is slightly lower by an amount of order 1/N. Since r2 runs over a volume proportional to N, there is a finite negative contribution to (71) from the region of very large r12 (i.e., the region where ρ2(r1,r2)/ρ0 has reached its asymptotic value, which is not exactly equal to ρ0). Since (71) is rigorously true, the integral

\[ 4\pi\int[p(r)-\rho_0]\,r^{2}\,dr, \]

which represents the contribution to the left side of (71) from the region where r12 is not very large, must be greater than −1; thus we arrive at (69) instead of (70).

The slight lowering of the density at infinity when an atom is known to be at the origin is not hard to understand, since the localization of one atom decreases by one the number of atoms eligible to occupy the site at infinity. In the classical perfect gas ρ0κTχ_T = 1; since the atoms are independent, the localization of one atom simply lowers the mean density by 1/V throughout the rest of the volume. In a real liquid, however, ρ0κTχ_T → 0 as T → 0. Finally, when T = 0, (70) implies that no influence propagates to infinity, even in order 1/N, when an atom is localized at the origin. In this case, a
density excess at the origin is surrounded by a rarefaction slightly further away, so that no change occurs in the density at infinity.

Thus, the simple counting argument used to prove (70) is actually correct when T = 0, because there is no change in the density far away when we localize an atom at the origin. For the same reason, we believe that any identity based on a counting argument becomes correct when T = 0. We therefore believe in the truth of the identity


\[ \int_{\mathbf{r}_3\neq\mathbf{r}_2,\mathbf{r}_1} d\mathbf{r}_3\left[\frac{\rho_3(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3)}{\rho_0\,p(r_{12})} - \rho_0\right] = -2, \tag{72} \]

although we cannot give a rigorous proof of it. Equations (70) and (72) are easily combined to give Eq. (39), which we have used in our work (one must remember that, in (39), ρ3 is defined to include a delta function on r32).

Equation (39) is easily understood for small or large
values of r12. If r12 < 2.4 Å, both sides of (39) are identically zero for all r3. If 1 and 2 are far apart, then ρ3 can be written as

\[ \rho_0\,p(r_{31})\,p(r_{32}) + \rho_0^{2}\,\delta(\mathbf{r}_{32}) \]

and the right side becomes ρ0² p(r31). If 3 is far from 2, then both sides are equal. Hence the only contribution to the integral comes when 3 is near 2; but then we can set p(r31) = ρ0 and we are left with

\[ \int d\mathbf{r}_{32}\,\rho_0^{2}\left[p(r_{32}) + \delta(\mathbf{r}_{32}) - \rho_0\right], \]

which vanishes as a result of (70).

Even if (39) is not rigorously true for intermediate values of r12, it cannot fail badly; for when r12 is greater than 2.4 Å, but not very large, then for any fixed radius r32 the solid angle in which 3 interferes with 1 is small (less than one-quarter of the total solid angle available to r32).
A measurement of the energy losses of monoenergetic neutrons scattered from liquid He II would permit a determination of the energy-versus-momentum relation for the elementary excitations (phonons and rotons) in the liquid. A major part of the scattering at a fixed angle arises from production or annihilation of a single excitation and appears as sharp lines in the energy spectrum. From the position of these lines the energy-versus-momentum relation of the excitations can be inferred. Other processes, such as production or annihilation of multiple excitations, contribute a continuous background, and occur at a negligible rate if the incident neutrons are slow (λ ≳ 4 Å) and the helium cold (T < 2°K). The total cross-section data can be accounted for by production of single excitations; the theoretical cross section, computed from a wave function previously proposed to represent excitations, agrees with experiment over the entire energy range, within 30%. Line widths in the discrete spectrum are negligible at 1°K because of the long lifetime of phonons and rotons.


I. INTRODUCTION 


THE possibility of a direct experimental determination of the energy-versus-momentum relation for phonons in a solid was pointed out by Placzek and Van Hove.¹ They proposed to study the energy distribution of very slow neutrons scattered inelastically and coherently from the solid; if the incident neutron beam is monochromatic and if the scattering process involves only the production or annihilation of a single phonon, energy and momentum conservation imply that the neutrons emerging at a given angle can have only certain discrete energies. The energy-momentum relation for the phonons can be inferred from the angular variation of this discrete spectrum. Other processes, such as multiple phonon production or annihilation, contribute a continuous background above which the discrete spectrum is still observable.

The purpose of the present paper is to suggest that 
the same technique be used to determine directly the 
energy-versus-momentum curve for the excitations in 
liquid helium, and to predict some details of the 
experiment. A direct measurement of this curve would 
be of considerable interest, since the shape of the curve 
has already been predicted in some detail by indirect 

* Richard C. Tolman Fellow.

1 G. Placzek and L. Van Hove, Phys. Rev. 93, 1207 (1954). We have recently learned that some of the ideas in the present paper have been discussed by V. V. Tolmachev, Repts. Acad. Sci. U.S.S.R. 101, No. 6 (1955).


methods. Landau² argued on theoretical grounds that the energy E(k) of an excitation of momentum ħk should rise linearly with slope ħc for small k (c = speed of sound = 240 m/sec), pass through a maximum, drop to a local minimum at some value k0, and rise again when k > k0. For small k, the excitations are called phonons and may be thought of as quantized sound waves; the excitations with k ≈ k0 are called rotons, and seem to be the quantum-mechanical analog of smoke rings.³,⁴ At low temperatures, only the linear portion of the curve and the portion near the minimum are excited; if the curve is represented near the minimum by E(k) = Δ + ħ²(k − k0)²/2μ, the specific heat and second sound data can be fitted best with the values⁵

Δ/κ = 9.6°K, k0 = 2.30 Å⁻¹, μ = 0.40 m_He,

and almost as well with the values³

Δ/κ = 9.6°K, k0 = 1.95 Å⁻¹, μ = 0.77 m_He.


A Landau-type curve has recently been obtained from first principles by the substitution of a trial function into a variational principle for the energy.³,⁴ The resulting curve is an upper limit to the true spectrum, and gives Δ/κ = 11.5°K, k0 = 1.85 Å⁻¹, μ ≈ 0.20


2 L. Landau, J. Phys. (USSR) 5, 71 (1941); 11, 91 (1947).
3 R. P. Feynman, Phys. Rev. 94, 262 (1954).

4 R. P. Feynman and M. Cohen, Phys. Rev. 102, 1189 (1956).
5 de Klerk, Hudson, and Pellam, Phys. Rev. 93, 28 (1954).
FIG. 1. A Landau-type energy-versus-momentum curve, with Δ/κ = 9.6°K, μ = 1.06 m_He, k0 = 1.85 Å⁻¹. (Horizontal axis: wave number k in Å⁻¹.)


m_He. In the rough computations of this paper we shall use the curve of Fig. 1, which has Δ/κ = 9.6°K, k0 = 1.85 Å⁻¹, μ = 1.06 m_He. These values represent a compromise between theory and experiment, and also fit the specific heat data. Most of our numbers have only a qualitative significance, since the shape of the energy curve between the phonon and roton regions is highly uncertain.


Il. GENERAL THEORY 


Suppose the liquid is initially in state i, and we bombard it with neutrons of mass m and momentum ħk_i; the cross section for a process in which the liquid is left in one of a group of final states F, and the neutron emerges with a momentum in some region G of k space, is given by the Born approximation⁶,⁷ as

\[ \sigma = \frac{2a^{2}}{k_i}\sum_{f\in F}\int_{\mathbf{k}\in G}\left|V_{fi}(\mathbf{k}-\mathbf{k}_i)\right|^{2}\,\delta\!\left(k^{2}-k_i^{2}+\frac{2m}{\hbar^{2}}(E_f-E_i)\right)d\mathbf{k}, \tag{1} \]

where the sum and the integral extend over the regions F and G, respectively. The matrix elements V are given by

\[ V_{fi}(\mathbf{q}) = \int \psi_f^{*}(\mathbf{r}_1,\cdots\mathbf{r}_N)\sum_{n=1}^{N}\exp(i\mathbf{q}\cdot\mathbf{r}_n)\,\psi_i(\mathbf{r}_1,\cdots\mathbf{r}_N)\,d\mathbf{r}_1\cdots d\mathbf{r}_N, \]

where ψ_i and ψ_f are the wave functions for the liquid in states i and f. The scattering length a is independent

FIG. 2. Kinematics of production of single excitations. The curves give the wave number k_f of the exit neutron, as a function of k_i and θ; k_f is the distance from the origin. On each curve k_i is constant and has the value given on the θ = 0 axis. Note that k_f = k_i when θ = 0 (and k_i > 0.68 Å⁻¹).


6 G. Placzek, Phys. Rev. 86, 377 (1952).

7 For the justification of the use of the Born approximation in this problem see E. Fermi, Ricerca sci. 7, 13 (1936); G. Breit, Phys. Rev. 71, 215 (1947).
of energy at low energies, and is related to the total cross section σ′ of a bound He nucleus by σ′ = 4πa². McReynolds⁸ found σ′ = 1.1 ± 0.15 barns.

In elastic scattering, the final state of the liquid is the same as the initial state.⁹ In the matrix element for elastic scattering, the integral of ψ² over all coordinates but one is equal to 1/V (V = volume of the liquid) except at points very close to the surface. Only the region near the surface contributes to the volume integral of exp(iq·r), and hence the only elastic scattering from the liquid is diffraction from the surface.¹⁰ In a crystal, the integral of ψ² over all coordinates but one gives a function which is strongly peaked at the lattice points; hence for certain directions of q the matrix element V_ii(q) becomes proportional to N, and elastic scattering occurs. Therefore, although neutrons are scattered elastically from solids, virtually no elastic scattering should occur from the liquid.


III. SCATTERING AT ZERO TEMPERATURE 


If the liquid is at zero temperature, then the initial state is the ground state ψ0, and the neutron must lose energy in the scattering. The simplest process which can occur is the creation of a single excitation of momentum ħ(k_i − k_f) in the liquid (if we impose periodic boundary conditions on the liquid, the stationary states may be taken as momentum eigenstates). Energy conservation requires

\[ \hbar^{2}k_i^{2}/2m = \hbar^{2}k_f^{2}/2m + E(|\mathbf{k}_i-\mathbf{k}_f|). \tag{2} \]

If we fix k_i and the angle θ between k_i and k_f, then if k_i > 0.68 Å⁻¹ (λ_i < 9.25 Å) there is a unique k_f for each θ. When k_i is just less than 0.68 Å⁻¹, (2) becomes insoluble for θ > 90°. As k_i decreases further, the region of solubility of (2) is a cone of decreasing aperture about the forward direction. For each direction θ in the cone there are two solutions for k_f. Finally, when k_i < 0.38 Å⁻¹, (2) becomes insoluble at any angle. The qualitative behavior of the solutions of (2) is shown in Fig. 2, but not too much significance should be attached to the numbers, which are based on the uncertain curve of Fig. 1.
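
The lower threshold quoted above can be checked against a simple estimate. For a purely linear phonon branch E = ħcq, Eq. (2) in the forward direction gives k_i + k_f = 2mc/ħ, which has a solution with k_f ≤ k_i only if k_i ≥ mc/ħ, i.e. only if the neutron moves faster than sound. The Python sketch below evaluates mc/ħ and converts the quoted wave numbers to wavelengths. The linear-branch argument and the physical constants are our additions for illustration; the paper's numbers come from the curve of Fig. 1.

```python
import math

HBAR = 1.054571817e-34      # J s
M_NEUTRON = 1.67492750e-27  # kg
C_SOUND = 240.0             # m/s, speed of sound quoted in the Introduction


def wavelength_angstrom(k_per_angstrom):
    """Wavelength in Angstrom for a wave number given in 1/Angstrom."""
    return 2.0 * math.pi / k_per_angstrom


# Wave number at which the neutron speed hbar k / m equals the speed of sound, in 1/Angstrom.
K_SOUND = M_NEUTRON * C_SOUND / HBAR * 1.0e-10

print(f"m c / hbar = {K_SOUND:.3f} per Angstrom")
print(f"k_i = 0.68 per Angstrom -> wavelength {wavelength_angstrom(0.68):.2f} Angstrom")
print(f"k_i = 0.38 per Angstrom -> wavelength {wavelength_angstrom(0.38):.2f} Angstrom")
```

The value of mc/ħ comes out close to the quoted 0.38 Å⁻¹, which suggests that this threshold is set by the phonon slope; the 0.68 Å⁻¹ figure depends on the full shape of E(k) and is not reproduced by this estimate.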

The solutions of (2) should appear as lines in the 


8 A. W. McReynolds, Phys. Rev. 84, 969 (1951).

9 This is true if the liquid is confined in a fixed box. If the box is free to recoil, then for elastic scattering the final state is the same as the initial state except for a translational motion of the whole with momentum ħ(k_i − k_f) and infinitesimal energy ħ²(k_i − k_f)²/2N m_He. The wave function for the final state is then ψ_f = ψ_i exp[iN⁻¹(k_i − k_f)·Σ r_i]. The remaining arguments are still valid, with each r_i being measured from the center of mass rather than from an origin determined by the location of the box.

10 By “elastic scattering” we mean scattering processes in which the incident and exit neutrons have the same energy, and furthermore the state of the liquid is unchanged. If we relax the latter requirement, then at finite temperature some neutrons can scatter without energy loss by colliding with excitations in the liquid. However, neutrons which are scattered by collisions with excitations emerge with a continuous distribution of energies (i.e., the number in any energy range dE is proportional to dE), and there will not be any “group” of elastically scattered neutrons.
energy spectrum of the neutrons emerging at a given angle. The energy-versus-momentum curve for the excitations can be obtained from Eq. (2) by measuring k_f as a function of angle for fixed k_i, or by looking at a fixed exit angle and varying k_i. Processes involving multiple excitations contribute a continuous background. When k_i < 0.38 Å⁻¹, the right side of (2) is bigger than the left for any k_f; furthermore, since E(k) is the energy of the lowest state of the liquid having momentum ħk, production of multiple excitations will also be impossible when k_i < 0.38 Å⁻¹. Hence, when the liquid is at zero temperature, neutrons with k_i < 0.38 Å⁻¹ (λ > 16.5 Å) should pass through with no scattering. This conclusion seems consistent with the data of Sommers, Dash, and Goldstein¹¹ on the transmission of neutrons by He.

The strengths of the lines are given by (1). We take the region G of k space as k²ΔkdΩ, where Δk includes the line under study. Momentum conservation manifests itself in the vanishing of V_f0(k − k_i) unless the state f has the momentum ħ(k_i − k). Integration of (1) over G gives

\[ d\sigma_1 = a^{2}\,\frac{k_f}{k_i}\,\frac{|V_{f0}(\mathbf{k}_f-\mathbf{k}_i)|^{2}\,d\Omega}{\left|1+\dfrac{m}{\hbar^{2}}\,E'(|\mathbf{k}_i-\mathbf{k}_f|)\,\dfrac{k_f-k_i\cos\theta}{k_f\,|\mathbf{k}_i-\mathbf{k}_f|}\right|} \tag{3} \]

as the cross section for scattering into dΩ with the production of one excitation of momentum ħ(k_i − k_f) (and final neutron momentum ħk_f). In reference 3 the function

\[ \psi_{\mathbf{k}} = \mathcal{N}^{-1/2}\,\psi_0\sum_i \exp(i\mathbf{k}\cdot\mathbf{r}_i) \tag{4} \]

is proposed to represent a single excitation of momentum ħk. This function is exact for very small k, and gives an energy spectrum qualitatively similar to Landau’s, but with Δ twice too large. Normalization requires 𝒩 = NS(k), where S(k) is the Fourier transform of the zero-temperature radial distribution function p(r),

\[ S(k) = \int \exp(i\mathbf{k}\cdot\mathbf{r})\,p(r)\,d\mathbf{r}. \]

The resulting matrix element is

\[ |V_{f0}(\mathbf{q})|^{2} = NS(q). \tag{5} \]

Actually, (5) is an over-estimate, as one can see from the exact sum rule

\[ \sum_f |V_{f0}(\mathbf{q})|^{2} = [V(\mathbf{q})V^{*}(\mathbf{q})]_{00} = NS(q). \tag{6} \]


If (5) were exact, then (6) would imply that production 
of multiple excitations is impossible. A more accurate 
wave function for an excitation is given in reference 4, 
and leads to the matrix elements given in Fig. 3. In 
the roton region, these matrix elements are only ten 
to fifteen percent smaller than those given by (5), and 


11 Sommers, Dash, and Goldstein, Phys. Rev. 97, 855 (1955).


FIG. 3. Matrix elements for production of single excitations, computed from the wave functions of reference 4. (Horizontal axis: k in Å⁻¹.)

we infer that the most likely way for a neutron to lose 
a given amount of momentum is through the production 
of a single excitation. 

Substituting the matrix elements of Fig. 3 into Eq. (3), we obtain the curves of Fig. 4, giving the angular variation of line strength for different values of k_i. The curves are given in units of a², which is the differential cross section per unit solid angle for scattering from a bound helium nucleus. When k_i > 0.68 Å⁻¹, dσ1/dΩ vanishes at θ = 0 because the matrix element V_f0 approaches zero when the momentum transfer is small. When k_i < 0.68 Å⁻¹, there are two curves for each k_i, corresponding to the two lines which are observed at each angle within the cone of solubility of (2). At the edges of this cone, dσ1/dΩ becomes infinite because the denominator of (3) vanishes. The total cross section, however, is finite.

If we neglect the possibility of producing multiple excitations, the total cross section at zero temperature is obtained by integrating (3) over angles. The resulting cross sections are compared in Fig. 5 with the total cross sections measured¹¹ at 1.25°K (the temperature effect, which is negligible, is discussed in the next section). The agreement of theory and experiment within 30% is quite satisfactory in view of our incomplete knowledge of the wave function ψ_f and the curve E(k). When k_i is large, Eq. (1) can be shown to
lead to the total cross section (16/25)4πa²N, which is
just the cross section for free helium nuclei scattering 
incoherently. The theoretical curve has been extra- 


AL FIO.SA 


30° 60° 390° 


120° 150° 180° 


Fic. 4. Angular distribution of neutrons which have been 
scattered by the process of producing a single excitation, at zero 
temperature, 
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Fic. 5. The broken 
line is the total cross 
section computed here. 
Circles are computed 
points. The solid line 
represents the measure- 
ments of Sommers ef al. 
at 1.25°K. 


diCAD 


polated to this value. As \ decreases, the theoretical 
curve rises faster than the experimental one; the reason 
is probably that the matrix elements of Fig. 3 are too 
large when the momentum transfer is in the roton region. 
To see this more clearly, we note that there is an exact 
sum rule, 


$$\sum_f |V_{f0}(\mathbf{k})|^2 (E_f - E_0) = [V^*(\mathbf{k})(H - E_0)V(\mathbf{k})]_{00} = N\hbar^2k^2/2m, \tag{7}$$


which, in conjunction with Eq. (6), says that the “average” energy
loss associated with momentum transfer $k$ is $\hbar^2k^2/2mS(k)$.
For small $k$, this “average” energy loss is the same as $E(k)$;
hence the idea that multiple excitations are produced with negligible
probability is correct. However, when $k=1.85$ Å⁻¹, one finds that
the “average” energy loss is 19.5°K, which is twice the size of
$E(k)$. If the matrix elements of Fig. 3 are correct, then the sum
rule (6) implies that there is only a 13% probability of producing
multiple excitations when the momentum transfer is 1.85 Å⁻¹. Such a
small probability of multiple excitation seems hardly consistent with
a mean energy loss corresponding to the production of two rotons. We
conclude that the correct matrix elements for single roton production
are almost certainly smaller than those given in Fig. 3.
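
The size of the “average” energy loss can be checked with a few lines
of arithmetic. The Python sketch below evaluates the free-atom recoil
energy $\hbar^2k^2/2m$ in kelvin and the value of $S(k)$ that the quoted
19.5°K would imply at $k=1.85$ Å⁻¹. The physical constants, the
helium-4 atomic mass, and the function names are assumptions of this
sketch and do not appear in the text.

```python
# Recoil energy hbar^2 k^2 / 2m of a helium-4 atom, expressed in kelvin.
HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
M_HE4 = 6.6464731e-27    # kg, assumed mass of one helium-4 atom


def recoil_energy_kelvin(k_inv_angstrom):
    """Return hbar^2 k^2 / (2 m k_B) for k given in reciprocal angstroms."""
    k = k_inv_angstrom * 1.0e10  # convert to 1/m
    return HBAR**2 * k**2 / (2.0 * M_HE4 * K_B)


def implied_structure_factor(k_inv_angstrom, mean_loss_kelvin):
    """S(k) implied by: mean energy loss = hbar^2 k^2 / (2 m S(k))."""
    return recoil_energy_kelvin(k_inv_angstrom) / mean_loss_kelvin


recoil = recoil_energy_kelvin(1.85)
s_k = implied_structure_factor(1.85, 19.5)
print(f"recoil energy at 1.85 A^-1: {recoil:.2f} K")
print(f"implied S(1.85 A^-1):       {s_k:.3f}")
```

With these constants the recoil energy comes out near 20.7°K, so a
mean loss of 19.5°K corresponds to $S(k)$ slightly above unity at this
momentum transfer.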


It is not entirely obvious that the total cross section is lowered by
lowering the probability of producing single excitations, since the
sum rule (6) implies that the probability of producing multiple
excitations must be correspondingly increased. However, the
density-of-states factor resulting from the delta function in Eq. (1)
produces a decrease in the total cross section when probability is
transferred from single to multiple excitations.


IV. EFFECTS OF FINITE TEMPERATURE


If the liquid is at a finite temperature, some phonons and rotons are
always present, and the neutrons can gain energy by annihilating
excitations. The energy spectrum of the neutrons emerging at a given
angle will contain a line representing annihilation of single
excitations as well as a line or lines arising from their production.
At temperatures below the $\lambda$ point, the production process is
far more important than annihilation if the wavelength of the incident
neutrons is less than 10 Å. Since phonons and rotons obey Bose
statistics, the rate of annihilation of excitations of momentum
$\hbar k$ is proportional to the number $n(k)$ of such excitations
already present, while the rate of production is proportional to
$n(k)+1$. At temperature $T$, we have
$n(k)=\{\exp[E(k)/\kappa T]-1\}^{-1}$; for rotons at 2°,
$E(k)/\kappa T \gtrsim 5$, and we see that the “spontaneous
production” factor 1 is much greater than $n(k)$. Figure 2 shows that
if the incident neutron wavelength is less than 10 Å
($k_i \gtrsim 0.6$ Å⁻¹), most of the excitations produced have wave
numbers greater than 0.4 Å⁻¹ and consequently energies large compared
to 2°K. Furthermore, Fig. 3 shows that the matrix elements for the
production of low-energy phonons are small. Thus we see that in the
range $T \lesssim 2$°K, $\lambda \lesssim 10$ Å, spontaneous
production is much more important than “induced production” and
annihilation, and the lack of temperature dependence of the total
cross section is understood.
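
The smallness of $n(k)$ next to the spontaneous-production factor 1
is easy to see numerically. The sketch below evaluates the Bose
occupation number for a given ratio $E/\kappa T$; the function name is
an assumption of this illustration.

```python
import math


def bose_occupation(energy_over_kt):
    """Bose occupation number n = 1 / (exp(E / kT) - 1)."""
    # expm1 keeps precision when E / kT is small.
    return 1.0 / math.expm1(energy_over_kt)


# Rotons at about 2 K: E / kT of roughly 5, as quoted in the text.
n_roton = bose_occupation(5.0)
print(f"n(k) for E/kT = 5: {n_roton:.4f}")
print(f"induced / spontaneous production: {n_roton:.4f} / 1")
```

For $E/\kappa T = 5$ the occupation number is below one percent, so
production proceeds almost entirely through the spontaneous term.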

For incident neutrons of wavelength greater than ~10 Å, it is
kinematically impossible to produce any except very low-energy
phonons (and no excitations at all can be produced when
$\lambda > 16.5$ Å). Hence the annihilation process is the most
important one at long neutron wavelengths, and the total cross section
in this region is strongly temperature-dependent. Ultimately, at very
long incident neutron wavelengths, the kinematics becomes that of
zero-energy incident neutrons. Figure 2 shows that a zero-energy
incident neutron can annihilate only phonons with wave number
$k=0.68$ Å⁻¹ and energy 11°K. The total cross section ultimately
depends on the temperature as $\exp(-11/T)$, and on the incident
velocity as $1/v$ [arising from the factor $1/k_i$ in Eq. (1)].
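
Both kinematic numbers in this paragraph can be reproduced from the
neutron mass alone. The sketch below computes the neutron energy
$\hbar^2k^2/2m_n$ in kelvin and the wavelength at which the neutron
velocity equals the velocity of sound, beyond which no phonon can be
produced. The sound velocity of 240 m/s is an assumed round value not
given in the text, as are the constants and function names.

```python
import math

HBAR = 1.054571817e-34     # J s
K_B = 1.380649e-23         # J / K
M_NEUTRON = 1.67492750e-27 # kg
SOUND_SPEED = 240.0        # m/s, assumed velocity of first sound in He II


def neutron_energy_kelvin(k_inv_angstrom):
    """Neutron kinetic energy hbar^2 k^2 / (2 m_n k_B), k in 1/angstrom."""
    k = k_inv_angstrom * 1.0e10
    return HBAR**2 * k**2 / (2.0 * M_NEUTRON * K_B)


def threshold_wavelength_angstrom(sound_speed):
    """Wavelength at which the neutron velocity hbar k / m_n equals c."""
    k_min = M_NEUTRON * sound_speed / HBAR  # 1/m
    return 2.0 * math.pi / k_min * 1.0e10


print(f"E(k = 0.68 A^-1) = {neutron_energy_kelvin(0.68):.1f} K")
print(f"threshold wavelength = {threshold_wavelength_angstrom(SOUND_SPEED):.1f} A")
```

A neutron brought to $k=0.68$ Å⁻¹ carries about 11°K, matching the
energy of the phonon it annihilated, and the threshold wavelength
falls near the 16.5 Å quoted above.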


This is not entirely correct, since annihilation of multiple
excitations is possible, though unlikely at low temperatures. The
total momentum of the excitations annihilated must be at least
0.68 Å⁻¹ and the total energy at least 11°K. At low temperatures and
velocities, the cross section still varies as $v^{-1}\exp(-11/T)$.


V. RESOLUTION AND LINE WIDTH


In order to obtain even a moderately accurate measurement of the
roton energy $\Delta$, the velocities of the incident and exit
neutrons must be known very accurately. Figure 2 shows that the
slowest neutron which can produce a minimum energy roton has a wave
number $k \approx 1.04$ Å⁻¹ (energy = 25°); the exit neutron in this
case has $k \approx 0.81$ Å⁻¹ (energy = 15°). To measure $\Delta$ with
an accuracy of one degree, the neutron energies must be accurate to
0.7°; for the incident neutrons, we need
$\delta\lambda/\lambda = \delta E/2E = 0.7/50 = 0.014$. Thus, to
measure $\Delta$ with ten percent accuracy, the velocity spread of
the incident neutron beam must be limited to about one percent.
Nothing is gained by studying neutrons which have annihilated a
roton; the slowest incident neutron which can annihilate a roton has
$k \approx 0.81$ Å⁻¹, and we need
$\delta\lambda/\lambda = 0.7/30 = 0.023$. The slight improvement in
the resolution situation is far more than offset by the low rate of
annihilation, as compared with production (see Sec. IV). The
resolution situation is best when we observe neutrons scattered
through 180°. If we study the 90° scattering, the slowest allowed
incident neutron has $k \approx 1.5$ Å⁻¹ and we need
$\delta\lambda/\lambda \approx 0.007$.
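
The resolution figures follow from
$\delta\lambda/\lambda = \delta E/2E$. A minimal sketch of that
arithmetic is given below, using the neutron energies quoted in the
text; the function name is an assumption of this illustration.

```python
def wavelength_spread(energy_tolerance_kelvin, neutron_energy_kelvin):
    """Fractional wavelength spread: d(lambda)/lambda = dE / (2 E)."""
    return energy_tolerance_kelvin / (2.0 * neutron_energy_kelvin)


# Roton production: slowest incident neutron, energy 25 K.
production = wavelength_spread(0.7, 25.0)
# Roton annihilation: slowest incident neutron, energy 15 K.
annihilation = wavelength_spread(0.7, 15.0)

print(f"production:   {production:.3f}")
print(f"annihilation: {annihilation:.3f}")
```

The two cases give 0.014 and 0.023, the values stated above.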

If one looks at the energy distribution of the neutrons 
emerging at a particular angle, how broad is the line 
corresponding to those neutrons which have created a 
roton? We have studied this question in some detail; 
in the Appendix we compute the detailed line shape 
which would result if we make certain assumptions 
about the interaction between phonons and rotons. 
The assumptions prove to be unrealistic, but the method 
of computation is of some interest. The result which we 
obtain for the line width is what one would expect 
from the uncertainty principle; the width is $\hbar/\tau$,
where $\tau$ is the lifetime of the roton until it collides with
something. In our model, roton-roton interactions are 
neglected; hence the lifetime we compute is that for 
roton-phonon collisions. This lifetime is very long, and 
the resulting width is less than 10-®°K (to be compared 
with a roton energy $\Delta=9.6$°K) when the helium is at a
temperature of 1°K. Landau and Khalatnikov¹⁴ have
computed the lifetimes for phonon-phonon, phonon- 
roton, and roton-roton collisions. They find that at 
temperatures of 1° and higher, the roton-roton lifetime 
is much shorter than the roton-phonon lifetime. 
When T=1°, the width of a roton line is 0.006°K, 
which is still very small compared with the roton 
energy, but large compared with 10-®°K. The width 
is proportional to $T^{1/2}\exp(-\Delta/\kappa T)$, which represents
the temperature dependence of the number of rotons
present; when $T=2$°K, the width is about 1°. Similarly,
one can compute the width of a line arising from the 
production of phonons by neutrons; when T=1°K, 
the lifetimes of a phonon for scattering by a roton or by 
another phonon are comparable, both giving rise to 
widths of about 10-°°K. We conclude that for all 
practical purposes the lines in the neutron spectrum 
will be true delta functions if the helium temperature is 
near 1°K. 
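
The jump in roton line width between 1°K and 2°K can be checked from
the stated temperature dependence. The sketch below scales the
0.006°K width at 1°K by $T^{1/2}\exp(-\Delta/\kappa T)$ with
$\Delta = 9.6$°K; the function and parameter names are assumptions of
this illustration.

```python
import math

ROTON_GAP_KELVIN = 9.6  # roton energy Delta quoted in the text


def roton_line_width(temperature, ref_width=0.006, ref_temperature=1.0):
    """Scale a reference width by T^(1/2) * exp(-Delta / T)."""
    ratio = math.sqrt(temperature / ref_temperature) * math.exp(
        ROTON_GAP_KELVIN / ref_temperature - ROTON_GAP_KELVIN / temperature
    )
    return ref_width * ratio


print(f"width at 2 K: {roton_line_width(2.0):.2f} K")
```

The scaled width at 2°K is close to 1°, in agreement with the figure
given in the text.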

To calculate the cross section for roton-roton colli- 
sions, Landau and Khalatnikov assume a delta-function 
interaction between rotons, the strength of the interac- 
tion being chosen to fit viscosity data. In the appendix 
we show that such a delta-function potential, with a 
strength close to that of Landau and Khalatnikov, 
arises from the possibility that roton I can emit a
phonon which is subsequently absorbed by roton II.
The result is only suggestive rather than conclusive, 
however, since we are also led to a velocity-dependent 
interaction between rotons, comparable in strength with 
the delta function. 
APPENDIX


The problem of the breadth and shape of the lines 
in the neutron spectrum caused us some confusion, the 
details of which are not worth recounting. Finally, we 
constructed a “model” Hamiltonian for helium, 
including a phonon-roton interaction, for which the 
line shape can be computed very accurately. Analysis 
of this Hamiltonian not only resolved our private 
confusion, but also showed what the line shape is in 
the case of real helium. The important features of the 
answer can be obtained from perturbation theory, 
provided certain linear terms are interpreted as the 
beginning of exponentials. We regard the more accurate 
computation as sufficiently interesting to be presented 
here. It is analogous to the Weisskopf-Wigner method 
in the theory of optical spectra. 

Suppose, for simplicity, that excitations with $k<k_c$ are phonons,
with energy $E(k)=\hbar ck$, and excitations with $k>k_c$ are rotons
with energy $E(k)=\Delta+\hbar^2(k-k_0)^2/2\mu$. If there were no
interaction between phonons and rotons, the Hamiltonian for the
liquid would be $H=\sum_k a_k^* a_k E(k)$, where $a_k^*$ and $a_k$ are
the usual creation and destruction operators for excitations of
momentum $\hbar\mathbf{k}$; the operator $a_k^* a_k$ has integral
eigenvalues $n_k$, which represent the number of excitations present
with momentum $\hbar\mathbf{k}$. The matter density at $\mathbf{r}$ is
given by

$$\rho(\mathbf{r})-\rho_0 = (\rho_0\hbar/2Vc)^{1/2} \sum_{k<k_c} k^{1/2}\,[a_k \exp(i\mathbf{k}\cdot\mathbf{r}) + a_k^* \exp(-i\mathbf{k}\cdot\mathbf{r})].$$


We are interested only in the average behavior of the matter density
over a region of finite size (the size of the roton) and have
therefore omitted wavelengths smaller than $2\pi/k_c$ in the
representation of $\rho(\mathbf{r})$. We assume that if a roton of
momentum $\hbar\mathbf{k}$ is at $\mathbf{r}$, its energy is
$E(k)+(\partial\Delta/\partial\rho)[\rho(\mathbf{r})-\rho_0]$. Atkins
and Edwards¹⁵ have measured $\partial\Delta/\partial\rho$ and find
$\partial\Delta/\partial\rho = -0.57\Delta/\rho$. The actual
interaction between phonons and rotons involves further coupling
terms, which will be discussed later.

It is convenient to use a mixed representation in 
which phonons are treated with creation and destruction 
operators, and rotons are represented as particles with 
coordinates and momenta. If the number of rotons 
present is m, the Hamiltonian is 


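$$H = \sum_{k<k_c} a_k^* a_k \hbar ck + \sum_{i=1}^{m} E(k_i) + \frac{\partial\Delta}{\partial\rho}\,(\rho_0\hbar/2Vc)^{1/2} \sum_{i=1}^{m} \sum_{k<k_c} k^{1/2}\,[a_k \exp(i\mathbf{k}\cdot\mathbf{r}_i) + a_k^* \exp(-i\mathbf{k}\cdot\mathbf{r}_i)]. \tag{1'}$$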


The interaction term in (1') creates and destroys single phonons,
since the operators $a_k$ and $a_k^*$ appear linearly. Rotons only
change their momenta, however, since the interaction merely
multiplies the roton wave function by a plane wave. If we represent
rotons by solid lines and phonons by dotted lines, the two terms in
the phonon-roton interaction can be represented by the diagrams of
Fig. 6.

15 K. R. Atkins and M. H. Edwards, Phys. Rev. 97, 1429 (1955).

Fig. 6. Diagrams representing the phonon-roton interaction in the
Hamiltonian (1'). The solid line is a roton, the broken line a
phonon. Time increases from left to right. Diagram (a) represents
emission of a phonon, and (b) represents absorption.

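If the number of neutrons per second emerging in solid angle
$d\Omega$ with an energy loss in the range $(E, E+dE)$ is
$n(E)\,dE\,d\Omega$, the Fourier transform of $n(E)$ is given by
Eq. (1) as

$$f(\eta) = \int \exp(-i\eta E)\,n(E)\,dE = \frac{2a^2\hbar}{m} \sum_f \int k^2\,dk\, \exp[-i\eta(E_f-E_i)]\; |V_{fi}(\mathbf{k}-\mathbf{k}_i)|^2\; \delta\!\left(k^2 - k_i^2 + \frac{2m}{\hbar^2}(E_f-E_i)\right).$$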


Since the energy needed to produce a roton is small compared with the
energy of the incident neutrons, the term $(E_f-E_i)$ in the argument
of the delta function can be ignored with negligible error. This
approximation gets better as the incident neutrons get faster.¹⁶
Similarly, since the direction of $\mathbf{k}$ is fixed and its length
is very close to $k_i$, we can replace $\mathbf{k}-\mathbf{k}_i$ by a
constant vector $\mathbf{q}$, where $|\mathbf{q}|=2k_i\sin(\theta/2)$.
Furthermore, the liquid may be in different initial states $i$ with
probability $\exp(-\beta E_i)/\sum_j \exp(-\beta E_j)$, where
$\beta=1/\kappa T$. Thus we obtain

$$f(\eta) = (a^2\hbar k_i/m)\Big[\sum_i \exp(-\beta E_i)\Big]^{-1} \sum_{i,f} \exp\{-[\beta E_i + i\eta(E_f-E_i)]\}\; |V_{fi}(\mathbf{q})|^2$$
$$= (a^2\hbar k_i/m)\,[\mathrm{Tr}\exp(-\beta H)]^{-1}\; \mathrm{Tr}[V \exp(-i\eta H)\, V^* \exp(-\beta H + i\eta H)]. \tag{2'}$$


In Eq. (4) we have suggested that the wave function for a single
roton of momentum $\mathbf{q}$ is the ground-state wave function
multiplied by $\sum_i \exp(i\mathbf{q}\cdot\mathbf{r}_i)$. Since the
wave function for an oscillator in its first excited state is just
the ground-state wave function multiplied by the normal coordinate of
the oscillator, it would follow that
$\sum_i \exp(i\mathbf{q}\cdot\mathbf{r}_i)$ is the normal coordinate
for rotons of momentum $\mathbf{q}$. In particular, if $|i\rangle$ is
a state containing no rotons of momentum $\mathbf{q}$, then

$$V|i\rangle = \sum_i \exp(i\mathbf{q}\cdot\mathbf{r}_i)\,|i\rangle = [NS(q)/V]^{1/2} \exp(i\mathbf{q}\cdot\mathbf{r})\,|i\rangle. \tag{3'}$$

16 We are only computing the shape of $n(E)$ for energies near the
roton [illegible]. Fast neutrons tend also to produce multiple
excitations, with large energy losses, but we are not studying that
part of the spectrum.


The wave function $\exp(i\mathbf{q}\cdot\mathbf{r})|i\rangle$
represents whatever was present in the initial state $i$, plus a
roton of momentum $\mathbf{q}$; $\mathbf{r}$ is the position
coordinate of the roton. If $|i\rangle$ already contains rotons of
momentum $\mathbf{q}$, then $\sum_i \exp(i\mathbf{q}\cdot\mathbf{r}_i)$
both creates and destroys rotons. Actually,
$\sum_i \exp(i\mathbf{q}\cdot\mathbf{r}_i)$ is not the exact normal
coordinate for a roton, and consequently this factor can also create
and destroy multiple excitations. In keeping with the spirit of this
computation, we deal with a fictitious model in which direct
production of multiple excitations by neutrons does not occur. This
does not mean that no multiple excitations are produced; a neutron
can produce a single virtual roton, which then breaks up into a real
roton and a real phonon through the interaction term in $H$.

As a further simplification, we deal with a case slightly different
from thermodynamic equilibrium. In the initial states $i$ we allow an
arbitrary number of phonons to be present, with the usual
thermodynamic distribution; however, we consider only initial states
in which no rotons are present. This picture is accurate at low
temperatures. Consequently, for the initial states $i$ the
Hamiltonian is

$$H_0 = \sum_{k<k_c} a_k^* a_k \hbar ck, \tag{4'}$$

and for the final states $f$

$$H = H_0 + E(k) + \frac{\partial\Delta}{\partial\rho}\,(\rho_0\hbar/2Vc)^{1/2} \sum_{k<k_c} k^{1/2}\,[a_k \exp(i\mathbf{k}\cdot\mathbf{r}) + a_k^* \exp(-i\mathbf{k}\cdot\mathbf{r})] = H_0 + H_1. \tag{5'}$$


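In determining the traces of (2') we can use any states as a basis.
For the initial states we use eigenstates of $H_0$, denoted by
$|n_1 n_2 \cdots\rangle$; as base vectors for the states with a roton
present we use products of an eigenstate of $H_0$ and a position
eigenfunction (delta function) for the roton, denoted by
$|\mathbf{x};\, n_1 n_2 \cdots\rangle$. Since we deal only with
initial states with no rotons,

$$[\mathrm{Tr}\exp(-\beta H)]^{-1} = \prod_k\,[1-\exp(-\beta\hbar ck)].$$

For the other trace we obtain

$$\mathrm{Tr}[V \exp(-i\eta H)\, V^* \exp(-\beta H + i\eta H)] = \sum_{n_1, n_2, \cdots} \int d\mathbf{x}\,d\mathbf{y}\; \langle n_1 n_2 \cdots|V|\mathbf{y};\, n_1 n_2 \cdots\rangle$$
$$\times\, \langle \mathbf{y};\, n_1 n_2 \cdots|\exp[-i\eta(H_0+H_1)]|\mathbf{x};\, n_1 n_2 \cdots\rangle\; \langle \mathbf{x};\, n_1 n_2 \cdots|V^*|n_1 n_2 \cdots\rangle$$
$$\times\, \exp\Big[(-\beta+i\eta)\sum_k n_k\hbar ck\Big]. \tag{6'}$$

From (3') it follows that

$$\langle n_1 n_2 \cdots|V|\mathbf{y};\, n_1 n_2 \cdots\rangle = [NS(q)/V]^{1/2}\exp(i\mathbf{q}\cdot\mathbf{y}),$$
$$\langle \mathbf{x};\, n_1 n_2 \cdots|V^*|n_1 n_2 \cdots\rangle = [NS(q)/V]^{1/2}\exp(-i\mathbf{q}\cdot\mathbf{x}).$$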


A great simplification is effected if we neglect the dependence of
$E(k)$ on $k$, i.e., let $E(k)=\Delta$. We now make this approximation
and shall later consider the effects of restoring the dependence.
Since $H$ does not involve the momentum of the roton, if the wave
function is initially a position eigenfunction of the roton it will
remain a position eigenfunction. Consequently,

$$\langle \mathbf{y};\, n_1 n_2 \cdots|\exp[-i\eta(H_0+H_1)]|\mathbf{x};\, n_1 n_2 \cdots\rangle = \delta(\mathbf{x}-\mathbf{y})\, \langle \mathbf{x};\, n_1 n_2 \cdots|\exp[-i\eta(H_0+H_1)]|\mathbf{x};\, n_1 n_2 \cdots\rangle. \tag{7'}$$


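The second factor on the right is simply a diagonal element of the
Green's function (represented in occupation-number space) for a
collection of oscillators forced by the function $H_1$. The matrix
elements $G_{mn}$ for a forced oscillator are easily worked out, by
operator calculus¹⁷ or other methods. If the Hamiltonian for a forced
oscillator is

$$H = a^* a\,\epsilon + g(t)\,a + g^*(t)\,a^*, \tag{8'}$$

and $|m\rangle$ and $|n\rangle$ are eigenstates of the unforced
oscillator with energies $m\epsilon$ and $n\epsilon$, respectively,
then

$$G_{mn} = \Big\langle m\Big|\exp\Big[-i\int_{t'}^{t''} H(t)\,dt\Big]\Big|n\Big\rangle = \exp(i\epsilon n t' - i\epsilon m t'')\,(m!\,n!)^{-1/2} \sum_r \binom{m}{r}\binom{n}{r}\, r!\; (-iB^*)^{m-r}(-iB)^{n-r}\, G_{00}, \tag{9'}$$

where

$$B = \int_{t'}^{t''} g(t)\exp(-i\epsilon t)\,dt,$$

and

$$G_{00} = \exp\Big(-\iint_{t''>t>s>t'} dt\,ds\; g(t)\,g^*(s)\,e^{-i\epsilon(t-s)}\Big).$$

The trace is now easily obtained. We find

$$\sum_n \exp[(-\beta + it'' - it')\epsilon n]\, G_{nn} = G_{00} \sum_n e^{-\beta\epsilon n} \sum_r \frac{n!\,(-BB^*)^{n-r}}{r!\,[(n-r)!]^2} = G_{00}\,(1-e^{-\beta\epsilon})^{-1} \exp[-BB^*/(e^{\beta\epsilon}-1)]. \tag{10'}$$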
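
The closed form on the right of (10') can be checked numerically. For
$m=n$ the sum over $r$ in the forced-oscillator matrix element reduces
to a Laguerre polynomial, $G_{nn}/G_{00} = L_n(BB^*)$, so the thermal
sum is the Laguerre generating function. The sketch below is an
illustration under that reading; the function names and the sample
values of $\beta\epsilon$ and $BB^*$ are assumptions.

```python
import math


def laguerre(n, y):
    """Laguerre polynomial L_n(y) from the three-term recurrence."""
    previous, current = 1.0, 1.0 - y  # L_0 and L_1
    if n == 0:
        return previous
    for j in range(1, n):
        # (j + 1) L_{j+1} = (2j + 1 - y) L_j - j L_{j-1}
        previous, current = current, ((2 * j + 1 - y) * current - j * previous) / (j + 1)
    return current


def thermal_sum(beta_eps, bb, n_max=80):
    """sum_n exp(-beta*eps*n) * L_n(B B*), truncated at n_max terms."""
    x = math.exp(-beta_eps)
    return sum(x**n * laguerre(n, bb) for n in range(n_max))


def closed_form(beta_eps, bb):
    """(1 - exp(-beta*eps))^-1 * exp(-B B* / (exp(beta*eps) - 1))."""
    return math.exp(-bb / math.expm1(beta_eps)) / (1.0 - math.exp(-beta_eps))


print(thermal_sum(1.0, 0.7), closed_form(1.0, 0.7))
```

The factor $(1-e^{-\beta\epsilon})^{-1}$ is what cancels against the
corresponding factor of the partition function for each oscillator.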


From (5') we have
$g(t)=(\partial\Delta/\partial\rho)(\rho_0\hbar/2Vc)^{1/2}\,k^{1/2}\,e^{i\mathbf{k}\cdot\mathbf{x}}$.
Replacing the sum over oscillators by $V(2\pi)^{-3}\int d\mathbf{k}$,
we find [letting
$\alpha=(\partial\Delta/\partial\rho)(\rho_0\hbar/c)^{1/2}$]

$$f(\eta) = (a^2\hbar k_i/m)\,NS(q)\,\exp(-i\Delta\eta)\, \exp\Big\{\frac{\alpha^2}{4\pi^2\hbar^4c^4} \int_0^{\hbar ck_c} d\omega\;\omega\,\Big[i\eta\omega - (1-e^{-i\omega\eta}) - \frac{4\sin^2(\omega\eta/2)}{e^{\beta\omega}-1}\Big]\Big\}. \tag{11'}$$

17 See, for instance, R. P. Feynman, Phys. Rev. 84, 108 (1951),
Eq. (38). The present case is a trivial generalization of the result
given there.

Fig. 7. Interaction of neutron (double line) with helium. In (a) the
neutron produces a real roton. In (b) the neutron produces a virtual
roton which decays into a real roton plus a phonon of frequency
$\omega$. In (c) the virtual roton absorbs a phonon and becomes a real
roton.


The various terms in the exponent of (11') are easily understood by
applying perturbation theory to (5'), treating $\alpha$ as small. The
coefficient of $-i\eta$ is simply the energy $\Delta$ of a roton, plus
a correction arising from the fact that the roton can emit and
reabsorb, or absorb and re-emit, phonons. The rate of emission and
reabsorption of phonons of momentum $\hbar\mathbf{k}$ is proportional
to $n+1$, while the rate of absorption and re-emission is
proportional to $n$, with an energy denominator of equal magnitude
but opposite sign. Hence, the energy correction is independent of the
number of phonons present, and does not depend on the temperature.
The numerical value of the self-energy is
$\delta E/\Delta=-0.04\,k_c^3$, with $k_c$ measured in reciprocal
angstroms. The cutoff $k_c$ should correspond to a wavelength equal to
the roton size, i.e., several interatomic spacings; we estimate
$k_c \approx 0.5$ Å⁻¹.
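
The coefficient 0.04 can be recovered from second-order perturbation
theory. Summing the squared coupling of (5') over phonon modes, with
energy denominator $\hbar ck$, gives
$\delta E = -(\partial\Delta/\partial\rho)^2\rho_0k_c^3/12\pi^2c^2$;
this expression is a reconstruction rather than a formula quoted in
the text. The sketch below evaluates it with
$\partial\Delta/\partial\rho=-0.57\Delta/\rho$ and $\Delta=9.6$°K from
the text, and with an assumed liquid density of 145 kg/m³ and sound
velocity of 240 m/s. All names are hypothetical.

```python
import math

K_B = 1.380649e-23       # J / K
ROTON_GAP_KELVIN = 9.6   # Delta, from the text
LOG_DERIVATIVE = -0.57   # (rho / Delta) dDelta/drho, from the text
DENSITY = 145.0          # kg / m^3, assumed density of liquid helium
SOUND_SPEED = 240.0      # m / s, assumed velocity of sound


def self_energy_coefficient():
    """Coefficient C in dE/Delta = C * k_c^3, with k_c in 1/angstrom.

    Uses dE = -(dDelta/drho)^2 rho_0 k_c^3 / (12 pi^2 c^2), so that
    dE/Delta = -(0.57)^2 * Delta / (12 pi^2 rho_0 c^2) * k_c^3.
    """
    gap_joule = ROTON_GAP_KELVIN * K_B
    volume_m3 = gap_joule / (DENSITY * SOUND_SPEED**2)
    volume_a3 = volume_m3 * 1.0e30  # cubic angstroms
    return -(LOG_DERIVATIVE**2) * volume_a3 / (12.0 * math.pi**2)


coefficient = self_energy_coefficient()
print(f"dE/Delta = {coefficient:.3f} * k_c^3")
print(f"at k_c = 0.5 A^-1: dE/Delta = {coefficient * 0.5**3:.4f}")
```

With these inputs the coefficient is close to $-0.04$, and the shift
at $k_c = 0.5$ Å⁻¹ is about half a percent of $\Delta$.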

The remaining terms in the exponent represent the possibility of
production of multiple excitations [Fig. 7(b)], or production of a
roton which then absorbs a phonon [Fig. 7(c)]. In all diagrams the
neutron (double line) interacts with the liquid only once, and
produces a roton. The roton may then emit or absorb an arbitrary
number of phonons, each roton-phonon interaction contributing a
factor $\alpha$ to the amplitude and $\alpha^2$ to the probability.
The only processes with rates proportional to $\alpha^2$ are those
shown in Figs. 7(b) and 7(c). In 7(b) the neutron suffers an energy
loss $\Delta+\omega$, while in 7(c) the energy loss is
$\Delta-\omega$. The rate of 7(b) is proportional to $n(\omega)+1$,
while 7(c) is proportional to $n(\omega)$. Hence at low temperatures
the line shape is strongly asymmetric. If the exponent of (11') is
expanded into complex exponentials, the coefficient of
$\exp(-i\omega\eta)$ is the rate of 7(b) as computed in perturbation
theory, and the coefficient of $\exp(i\omega\eta)$ is the rate of
7(c). The remaining term, which is independent of $\eta$, represents
a change in the rate of single roton production 7(a) arising from the
distortion of the single-roton wave function by virtual phonons.

Fig. 8. Shape of a line in the neutron spectrum resulting from roton
production, computed from (11'). In the actual case, the delta
functions (represented by arrows) are slightly smeared out, but the
rest of the shape is as shown. (Two panels of $n(E)$: $T=0$ and
$T>0$.)

Fig. 9. Diagrams representing phonon-roton scattering.

Since $\alpha$ is small, the exponential in (11') can be accurately
replaced by the first two terms of a power series. Then $n(E)$ is
just the coefficient of $\exp(-iE\eta)$, and we can plot the line
form (Fig. 8). If only the first two terms of the power series were
retained, $n(E)$ would cut off sharply at energies more than
$\hbar ck_c$ from the line center; the higher terms in the power
series smear out the cutoff. The one-sidedness of the curve for $T=0$
arises from the fact that the roton is the lowest excitation of
momentum $\mathbf{q}$. Since no annihilation is possible at zero
temperature, the neutron cannot lose any less energy than that needed
to produce a roton.

At high temperatures ($\beta\hbar ck_c \ll 1$) the line becomes
Gaussian with width

$$\langle (E-\bar{E})^2 \rangle = \frac{1}{6\pi^2} \Big(\frac{\partial\Delta}{\partial\rho}\Big)^2 \frac{\rho_0 k_c^3}{\beta c^2}. \tag{12'}$$

It is readily shown that in thermal equilibrium the matter density
fluctuates according to the Gaussian distribution. The width of the
Gaussian at high temperatures is $\rho_0k_c^3/6\pi^2\beta c^2$.
Reasoning classically, one might say that although we do not know the
value of the density at the place where the roton is created,
nevertheless the density has an instantaneous value, which determines
the amount of energy needed to create the roton. Accordingly, the
line shape would be the same as the shape of the statistical
distribution of the density fluctuations, as is indeed the case at
high temperatures.

The classical argument fails at low temperatures, especially near the
line center. For large values of $\eta$, $\sin^2(\omega\eta/2)$ may be
replaced by its average value. The integral
$\int d\omega\,\omega\exp(-i\omega\eta)$ approaches zero because of
the oscillation of the exponential, and we find that
$f(\eta) \sim \mathrm{const} \times \exp[-i(\Delta+\delta E)\eta]$.
Therefore $n(E)$ contains a delta function at the center of the line
(the strength of the delta function approaches zero for large $T$).
To understand this classically, we would have to say that there is a
finite chance that the density is exactly equal to $\rho_0$ at the
place where the roton is created; this, of course, is wrong. Since
the density operator does not commute with the Hamiltonian, a density
measurement would change the state of the system. Consequently, as in
the double-slit experiment, the different possible outcomes can
interfere with each other. Amplitudes, rather than probabilities,
must be added. Part of our uncertainty about the density comes from
the fact that we do not know what state the liquid is in, because the
temperature is finite; and part of the uncertainty is the
quantum-mechanical uncertainty which still exists when the system is
in a pure state. The former is correctly analyzable by classical
reasoning, and the latter is not. Accordingly, one might say that a
finite fraction of the density fluctuations is congenitally
unobservable; this fraction gives rise to the delta function.

The uncertainty principle says that the width of a line arising from
roton production is inversely proportional to the lifetime of the
roton. Since there is nothing in the Hamiltonian (5') which would
allow the roton to disintegrate,¹⁸ the “lifetime” is the time till
the roton is scattered by a phonon. The two diagrams of Fig. 9
contribute to the scattering. If all rotons have the same energy,
then energy conservation requires $k_1=k_2$. Then the energy
denominators for the two diagrams are equal in magnitude and opposite
in sign, and the scattering rate is zero. This result also holds true
in higher orders and is well known in meson theory. Hence the
lifetime of the roton is infinite, and the line width zero.

We expect the delta function to spread out if we can 
analyze the Hamiltonian (5’), including the momentum 
dependence of E(k). The width, presumably, will be 
the rate of scattering of rotons by phonons. We 
thought it worthwhile to extend our analysis to include 
this case, since it is not entirely obvious that the roton 
has a finite lifetime, in the sense required by the 
uncertainty principle, simply because it can scatter. 
Furthermore, the occurrence of delta functions is not 
clearly precluded by the uncertainty principle; a line 
form consisting of a delta function super-imposed on 
the center of (say) a Gaussian would have a finite 
energy spread. Therefore we continue the analysis. 

18 A roton cannot emit a real phonon, since the roton is the lowest
state of given momentum.

19 Another way to see that the delta function must spread out is to
note that $n(\mathbf{q},E)$ is the Fourier transform in space and
time of the “time-dependent pair distribution function”
$p(\mathbf{r},t)$, which is the probability of finding an atom at
$\mathbf{r}$ at time $t$, if there was an atom at the origin at $t=0$
[see L. Van Hove, Phys. Rev. 95, 249 (1954)]. If $n(\mathbf{q},E)$
contained a delta function for some $E$, it would follow that
$\int p(\mathbf{r},t)\exp(i\mathbf{q}\cdot\mathbf{r})\,d\mathbf{r}$
does not approach zero for large $t$. But $p(\mathbf{r},t)$ clearly
becomes independent of $\mathbf{r}$ and $t$ for large $t$, and the
integral must approach zero.

20 R. P. Feynman, Revs. Modern Phys. 20, 367 (1948).

The Lagrangian form of quantum mechanics²⁰ is useful here. In
reference 20 the problem of a particle interacting with an oscillator
has been studied. Suppose the total Hamiltonian is

$$H = H_{\mathrm{part}} + a^* a\,\epsilon + g(\mathbf{x},t)\,a + g^*(\mathbf{x},t)\,a^* = H_{\mathrm{part}} + H',$$

where $\mathbf{x}$ is the particle coordinate, and
$H_{\mathrm{part}}$ is derivable from a Lagrangian $L$. Then the
amplitude for the oscillator to go from state $n$ at time $t'$ to
state $m$ at time $t''$, while the particle goes from $\mathbf{x}$ to
$\mathbf{y}$, is given by the sum over all paths $\mathbf{x}(t)$ of
the functional

$$\exp\Big(i\int_{t'}^{t''} L[\mathbf{x}(t)]\,dt\Big)\, \Big\langle m\Big|\exp\Big(-i\int_{t'}^{t''} H'\,dt\Big)\Big|n\Big\rangle,$$

where the sum is taken only over paths such that
$\mathbf{x}(t')=\mathbf{x}$ and $\mathbf{x}(t'')=\mathbf{y}$. This
sum is denoted by

$$\int_{\mathbf{x}(t')=\mathbf{x}}^{\mathbf{x}(t'')=\mathbf{y}} \mathcal{D}\mathbf{x}(t).$$

$H'$ depends on the path $\mathbf{x}(t)$ through the forcing function
$g$. For any particular path, however, the matrix element is given by
(9'). Hence, if we save the sum over paths and the integration on
$\mathbf{x}$ and $\mathbf{y}$ till the end, the oscillator sums in
(6') can be carried out as before, and we obtain²¹


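$$f(\eta) = \Big(\frac{a^2\hbar k_i}{m}\Big) NS(q) \int d(\mathbf{y}-\mathbf{x})\, \exp[i\mathbf{q}\cdot(\mathbf{x}-\mathbf{y})] \int_{\mathbf{x}(0)=\mathbf{x}}^{\mathbf{x}(\eta)=\mathbf{y}} \mathcal{D}\mathbf{x}(t)\, \exp\Big(i\int_0^{\eta} L[\mathbf{x}(t)]\,dt\Big)\, \exp\{-\gamma A[\mathbf{x}(t)]\}, \tag{13'}$$

where

$$A[\mathbf{x}(t)] = \int d\mathbf{k}\,k \iint_{0<s<t<\eta} dt\,ds\; e^{i\mathbf{k}\cdot(\mathbf{x}_t-\mathbf{x}_s)} \Big[\Big(1+\frac{1}{e^{\beta\omega}-1}\Big)e^{-i\omega(t-s)} + \frac{1}{e^{\beta\omega}-1}\,e^{i\omega(t-s)}\Big]. \tag{14'}$$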


The Hamiltonian $\Delta+(p-p_0)^2/2\mu$ comes from the Lagrangian
$L=-\Delta+\tfrac{1}{2}\mu|d\mathbf{x}/dt|^2+|d\mathbf{x}/dt|\,p_0$.
As $\mu\to\infty$, $L$ becomes very large for all paths except the
one with $d\mathbf{x}/dt=0$. Consequently the main contribution to
(13') comes from paths whose end points $\mathbf{x}$ and $\mathbf{y}$
are very close to each other; furthermore, the entire path must stay
close to $\mathbf{x}$ and $\mathbf{y}$. If $\mu$ is actually
infinite, $\mathbf{x}_t-\mathbf{x}_s$ may be set equal to zero in
(14'), and then (13') reduces to (11'), which we henceforth call
$f_0(\eta)$. The actual value of $\mu$ is large; consequently, only
for large $\eta$ can the paths stray far enough from the origin to
make $f(\eta)$ appreciably different from $f_0(\eta)$. Hence the line
form is the same as that previously computed, except near the line
center.


21 We have replaced $\mathbf{q}$ by $-\mathbf{q}$. This clearly does
not affect $f(\eta)$.


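For any functional $M[\mathbf{x}(t)]$, we define

$$\langle M\rangle = \exp[iE(q)\eta] \int d(\mathbf{y}-\mathbf{x})\, \exp[i\mathbf{q}\cdot(\mathbf{x}-\mathbf{y})] \int_{\mathbf{x}(0)=\mathbf{x}}^{\mathbf{x}(\eta)=\mathbf{y}} \mathcal{D}\mathbf{x}(t)\, M[\mathbf{x}(t)]\, \exp\Big(i\int_0^{\eta} L[\mathbf{x}(t)]\,dt\Big).$$

The amplitude for a free particle to go from $\mathbf{x}$ to
$\mathbf{y}$ in time $\eta$ is

$$K_0(\mathbf{y},\mathbf{x},\eta) = \int_{\mathbf{x}(0)=\mathbf{x}}^{\mathbf{x}(\eta)=\mathbf{y}} \mathcal{D}\mathbf{x}(t)\, \exp\Big(i\int_0^{\eta} L[\mathbf{x}(t)]\,dt\Big) = (2\pi)^{-3}\int d\mathbf{k}\, \exp\{i[\mathbf{k}\cdot(\mathbf{y}-\mathbf{x}) - E(k)\eta]\}, \tag{15'}$$

and therefore $\langle 1\rangle=1$. The operation
$\langle\;\rangle$ may be regarded as a kind of average. We want to
find $\exp[-iE(q)\eta]\,\langle\exp(-\gamma A)\rangle$.


If we define $\langle\exp(-\gamma A)\rangle=\exp(-\varphi)$, then
$\varphi$ can be expanded as a power series in $\gamma$:

$$\varphi = \gamma\varphi_1 + \gamma^2\varphi_2 + \cdots,$$

$$\varphi_1 = \langle A\rangle, \qquad -\varphi_2 = \tfrac{1}{2}\big(\langle A^2\rangle - \langle A\rangle^2\big), \quad \text{etc.} \tag{16'}$$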


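If the integrations on $\mathbf{k}$, $t$, and $s$ are postponed, then
the kind of integral which must be done in evaluating
$\langle A\rangle$ is $\exp[iE(q)\eta]\,I$, where

$$I = \int d(\mathbf{y}-\mathbf{x})\, e^{i\mathbf{q}\cdot(\mathbf{x}-\mathbf{y})} \int_{\mathbf{x}(0)=\mathbf{x}}^{\mathbf{x}(\eta)=\mathbf{y}} \mathcal{D}\mathbf{x}(\tau)\; e^{i\mathbf{k}\cdot(\mathbf{x}_t-\mathbf{x}_s)}\, e^{-i\omega(t-s)}\, \exp\Big(i\int_0^{\eta} L[\mathbf{x}(\tau)]\,d\tau\Big)$$

$$= e^{-i\omega(t-s)} \iiint d(\mathbf{y}-\mathbf{x})\,d\mathbf{x}_1\,d\mathbf{x}_2\; e^{i\mathbf{q}\cdot(\mathbf{x}-\mathbf{y})} \int_{\mathbf{x}(t)=\mathbf{x}_2}^{\mathbf{x}(\eta)=\mathbf{y}} \mathcal{D}\mathbf{x}(\tau)\, \exp\Big(i\int_t^{\eta} L\,d\tau\Big)\, e^{i\mathbf{k}\cdot\mathbf{x}_2}$$
$$\times \int_{\mathbf{x}(s)=\mathbf{x}_1}^{\mathbf{x}(t)=\mathbf{x}_2} \mathcal{D}\mathbf{x}(\tau)\, \exp\Big(i\int_s^{t} L\,d\tau\Big)\, e^{-i\mathbf{k}\cdot\mathbf{x}_1} \int_{\mathbf{x}(0)=\mathbf{x}}^{\mathbf{x}(s)=\mathbf{x}_1} \mathcal{D}\mathbf{x}(\tau)\, \exp\Big(i\int_0^{s} L\,d\tau\Big)$$

$$= \iiint d(\mathbf{y}-\mathbf{x})\,d\mathbf{x}_1\,d\mathbf{x}_2\; e^{i\mathbf{q}\cdot(\mathbf{x}-\mathbf{y})}\, K_0(\mathbf{y},\mathbf{x}_2,\eta-t)\, e^{i\mathbf{k}\cdot(\mathbf{x}_2-\mathbf{x}_1)}\, K_0(\mathbf{x}_2,\mathbf{x}_1,t-s)\, e^{-i\omega(t-s)}\, K_0(\mathbf{x}_1,\mathbf{x},s)$$

$$= \exp\{-i[E(q)(\eta-t) + (E(\mathbf{q}-\mathbf{k})+\omega)(t-s) + E(q)s]\}.$$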
This integral clearly describes the emission and reabsorption of a
phonon of momentum $\hbar\mathbf{k}$ [Fig. 10(a)]. The second term in
$A$, corresponding to absorption and re-emission [Fig. 10(b)],
results in the integral

$$I' = \exp\{-i[E(q)(\eta-t) + (E(\mathbf{q}+\mathbf{k})-\omega)(t-s) + E(q)s]\}.$$

Fig. 10. Diagrams representing the two terms in $\langle A\rangle$.


Similarly, $\langle A^2\rangle$ gives rise to terms involving two
phonons. To calculate $\langle A^n\rangle$, one must evaluate integrals
of the form


J,= f cs fh dts -dtyexp(Zaits) 


OE ta <ty<y 


Defining new variables x1=t1, %2=te—t, «++, ty=ty 


—t)1, and introducing the function 


1,0<y<n 
fm= 


0,9>n 


t pr? sem*r—1 
=— ( eins, 
21d.» Ss 




we find 


ro ‘ fsx exp(tLeaidx,---dx, 


t 20 a | oO rc) 
=— ds- f Sid f 
2rd 5 0 ) 


explid (@:t+s)x]da1- + -dx, 


1 oo te | y 1 
great a rsa 8 Creer 
Qri? Sp Ss i= \Bi+s+ie; 


To make the next to last integral converge, we have 
added a small positive imaginary part ie; to each 
@;. All the e; will be taken as different, so that all poles 
are simple, even if some of the @; are equal. Closing the 
contour in the lower half-plane, we have finally 


1 efiBi- ein | 
Ce i =|( , ) IT 
vv Bitte: iAt 


. = ves a)h we 


The integral resulting from $I$ has the form $J_2$, with
$\beta_1=0$, $\beta_2=E(q)-E(\mathbf{q}-\mathbf{k})-\omega$. In
general, the $\beta_i$ are the energy differences between the initial
state and various intermediate states. The time integral of $I'$ is
similarly evaluated, and we find

$$-\gamma\langle A\rangle = -\gamma\int d\mathbf{k}\,k\, \Big\{\Big(1+\frac{1}{e^{\beta\omega}-1}\Big) \Big[\frac{i\eta}{E(q)-E(\mathbf{q}-\mathbf{k})-\omega} - \frac{e^{i[E(q)-E(\mathbf{q}-\mathbf{k})-\omega]\eta}-1}{[E(q)-E(\mathbf{q}-\mathbf{k})-\omega]^2}\Big]$$
$$+ \frac{1}{e^{\beta\omega}-1} \Big[\frac{i\eta}{E(q)-E(\mathbf{q}+\mathbf{k})+\omega} - \frac{e^{i[E(q)-E(\mathbf{q}+\mathbf{k})+\omega]\eta}-1}{[E(q)-E(\mathbf{q}+\mathbf{k})+\omega]^2}\Big]\Big\}. \tag{18'}$$

In the limit of infinite roton mass, (18') is the same as the
exponent in (11'). This is to be expected, since if there is only one
possible path, then
$\langle\exp(-\gamma A)\rangle=\exp(-\gamma\langle A\rangle)$. The
various terms in (18') have the same significance as in (11'), and
agree with the results of perturbation theory. None of the
denominators in (18') can vanish, since the roton is defined as the
lowest state of momentum $\mathbf{q}$. If we approximate
$\langle\exp(-\gamma A)\rangle$ by $\exp(-\gamma\langle A\rangle)$ in
our evaluation of $f(\eta)$, then the delta function in $n(E)$ still
persists.

One naturally thinks of going further with the series (16') and
replacing $\langle\exp(-\gamma A)\rangle$ by
$\exp[-(\gamma\langle A\rangle+\gamma^2\varphi_2)]$. This procedure is
not obviously valid, however, since we are interested in $f(\eta)$
for large $\eta$. Suppose, for instance, that $\varphi_2$ is linear in
$\eta$ for large $\eta$, but some subsequent term in the series (16')
involves higher powers of $\eta$. Then neglect of the subsequent
terms would make $f(\eta)$ entirely incorrect for large $\eta$.
However, by inspection of the diagrams which contribute to
$\varphi_n$, it is quite easy to see that $\varphi_n$ is always linear
in $\eta$ for large $\eta$. For instance, the “bubble” in Fig. 10(a)
contributes a term to $\langle A\rangle$ proportional to $\eta$
because it can occur anywhere on the solid line. Among the processes
contributing to $\langle A^2\rangle$ is the one shown in Fig. 11.
Since each of the bubbles can occur (almost) anywhere on the solid
line, Fig. 11 contributes a term proportional to $\eta^2$; but since
the two bubbles are independent if the separation between them is
large, the square of (10a) cancels the term in $\eta^2$. There is a
correction term in $\langle A^2\rangle$ proportional to $\eta$ because
the two bubbles may overlap. Hence $\varphi_2$ is linear in $\eta$;
similar arguments apply to $\varphi_n$. Another way to see that
$\varphi$ is asymptotically linear in $\eta$ is to observe that for
large $\eta_1$ and $\eta_2$, $f(\eta)$ obeys the functional equation
$f(\eta_1+\eta_2)\cong\mathrm{const}\times f(\eta_1)f(\eta_2)$. We
omit the proof.

Fig. 11. A diagram which contributes a term to $\langle A^2\rangle$
proportional to $\eta^2$. This term is canceled by the square of
(10a).

The only part of $\gamma^2\varphi_2$ which interests us is the part
proportional to $\eta$, since the rest is negligible compared with
$\gamma\varphi_1$. Furthermore, pieces of $\gamma^2\varphi_2$
proportional to $i\eta$ represent small corrections to the
self-energy and can be omitted. Hence we are interested only in the
piece of $\varphi_2$ (if any) which is proportional to $\eta$ with a
real coefficient. Such a piece would cause attenuation of $f(\eta)$
for large $\eta$, and would imply that the delta function is gone.

The diagrams of Fig. 12 (and no others in this order)
contribute the kind of terms we are looking for. For
(12a) we have β₁=0, β₂=β₄=E(q)+ω₁−E(q+k₁),
β₃=E(q)+ω₁−E(q+k₁−k₂)−ω₂, and


e-an—] 1 
—J,(12a) = (<-->. ) 
ter \o?(B3-+-t€3— 7€1) 


e(tBs-ea)n— J 1 1 
~-(——_—] (—~)+ 
Bsttes \—Bstier—ies/ \(B2—8s)” 


where the dots indicate that uninteresting terms have
been omitted. It is possible for β₃ to vanish; in fact the
condition β₃=0 states that phonon-roton scattering is
energetically possible. Since 1/(z+iε) = P(1/z) − iπδ(z),
the first term becomes” 


. 1 
=| P(—) +ina(00. 
62 Bs 


The principal value term is a correction to the self- 
energy and is omitted. Omitting uninteresting terms 
again, we find for the second term 


cos (83n)— 1+7 el _>( 1 ) ine] 
B3(B2—B3)? Bs 


7™ 77 
= ——48(63) +—8(63) +---=04+-:: 
62 B2 
The net contribution of (12a) is 
77 f 
Te ee ee 
2 


The contribution of (12b) has the same form, with
β₂=E(q)−E(q−k₁)−ω₁, β₃=E(q)+ω₂−E(q−k₁+k₂)−ω₁.
Similarly, (12c) contributes a term (πη/β₂β₄)δ(β₃),
with β₂=E(q)+ω₁−E(q+k₁), β₃=E(q)+ω₁−E(q+k₁−k₂)−ω₂,
β₄=E(q)−E(q−k₂)−ω₂; (12d) contributes


2 We are free to let « and e; approach zero in any order, so 
long as we are consistent, We let e;—>0 first. 
Fig. 12. Diagrams contributing attenuation to f(η).


the same type of term, with 


β₂=E(q)−E(q−k₁)−ω₁,
β₃=E(q)+ω₂−E(q+k₂−k₁)−ω₁,
β₄=E(q)+ω₂−E(q+k₂).


For fixed k₁ and k₂ we add the contributions of diagrams
(12a)-(12d) to the contributions of the corresponding
diagrams with k₁ and k₂ exchanged. Noting that
δ(β₃) = δ(−β₃), we find


—7 p= ty((A2)— (Ay) 
swf faasn(s) 


1 1 
a 
en — J E(q)+o:1— E(q+k1) 


1 2 
+] 
E(q)—w2— E(q—ke) 


hi ARID: XéCE(q+ki—kz)+w2—E(q)—o1] 


where R is just the rate of roton-phonon scattering 
(Fig. 8) as computed in perturbation theory. Finally 
we find 


Sf (n)==(a*hks/m)NS(q) expl— (iAnt+-y(A)+nhR/2)] 
= f₀(η) exp(−ηħR/2). (19′)


The only appreciable change in the line shape »(£) 
is the replacement of the delta function by 


\[ \frac{1}{2\pi}\,\frac{\hbar R}{(E-\Delta')^{2}+\hbar^{2}R^{2}/4}, \qquad (20') \]

where Δ′ is the corrected roton energy. The uncertainty
principle is evidently satisfied.
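
As a quick numerical illustration of the line shape (20′), the sketch below checks two properties that make it a reasonable stand-in for the delta function: its area is one, and its full width at half maximum is ħR, so a lifetime 1/R corresponds to an energy spread of order ħR. The function names and the numerical values of Δ′ and ħR are illustrative assumptions, not values from the paper.

```python
import math

def lorentzian(E, delta, hbar_R):
    """Line shape of the form (20'): (1/2pi) * hbar_R / ((E - delta)^2 + hbar_R^2 / 4)."""
    return hbar_R / (2.0 * math.pi * ((E - delta) ** 2 + hbar_R ** 2 / 4.0))

def integrate(f, lo, hi, steps=200_000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# Illustrative numbers in arbitrary energy units (assumed, not taken from the paper).
delta, hbar_R = 8.6, 0.5

# Area under the curve over a window much wider than hbar_R: close to 1.
area = integrate(lambda E: lorentzian(E, delta, hbar_R), delta - 2000.0, delta + 2000.0)

# Half of the peak height is reached at E = delta +/- hbar_R / 2, so the FWHM is hbar_R.
peak = lorentzian(delta, delta, hbar_R)
half = lorentzian(delta + hbar_R / 2.0, delta, hbar_R)

print(area, half / peak)
```

As R tends to zero the curve narrows while keeping unit area, which is how the delta function is recovered.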

As we remarked earlier, the real interaction between 
phonons and rotons is more complicated than the one 
we assumed. The line width is determined by the lifetime
for roton-roton collisions, rather than roton-phonon
collisions. Landau and Khalatnikov*" fitted the viscosity 
data by assuming a roton-roton interaction of the form
V₀δ(r₁−r₂) with V₀=0.5×10⁻³⁸ erg-cm³. The interaction
term in (1′), which we call H′, allows one roton to
emit a phonon which is absorbed by another roton. 
In computing the equivalent potential, it is permissible
to regard the rotons as distinguishable. If the initial
state is ψ_i = exp[i(k₁·r₁+k₂·r₂)] and the final state is
ψ_f = exp{i[(k₁+k)·r₁+(k₂−k)·r₂]}, then the amplitude
to go from i to f is

\[ \sum_n \frac{H'_{fn}H'_{ni}}{E_i - E_n} = V_k. \]

Roton 2 can emit a phonon k which is then absorbed
by 1, or roton 1 can emit a phonon −k which is then
absorbed by 2. Neglecting the dependence of roton
energy on the momentum, we find V_k = (∂Δ/∂ρ)²(ρ₀/c²).
The equivalent potential is Σ_k V_k exp(ik·r₁₂) = (∂Δ/∂ρ)²(ρ₀/c²)
× δ(r₁₂) (actually, the delta function is smeared out
by the high-momentum cutoff). Using the value of
Atkins and Edwards for ∂Δ/∂ρ, we find V₀=0.66
×10⁻³⁸ erg-cm³. The p·v coupling between rotons and
phonons (see next paragraph) gives rise to a velocity-dependent
roton-roton interaction which seems comparable
in strength to the one we have computed.
If the picture of a roton as a moving smoke ring is
correct, then the interaction between rotons would
depend strongly on their relative orientations. Accordingly,
even though the delta-function interaction can
be simply explained, we think the actual roton-roton
interaction is more complicated.

The interaction between phonons and rotons has
been discussed by Landau and Khalatnikov, and
involves other terms besides (∂Δ/∂ρ)(ρ−ρ₀). A phonon
induces a velocity field v(r) which can be represented in
terms of the a_k and a_k*. In the presence of such a field,
the energy of a roton of momentum p is E(p)+p·v.
Furthermore, there are terms in (ρ−ρ₀)², such as
(∂²Δ/∂ρ²)(ρ−ρ₀)², which are responsible for most of
the phonon-roton scattering. Nevertheless, a line
arising from roton production will still have the shape
pictured in Fig. 8, provided the delta function is
replaced by a “witch” of the form (20′), where R is the
rate of roton-roton scattering. As the temperature
approaches zero, the rate R approaches zero because no
other excitations are present to scatter the roton.
Furthermore, at T=0, (FE) is “one sided” because it is 
impossible to produce an excitation of momentum q 
with less energy than a roton. The curve hits the 
axis with finite slope because the rate of production
of rotons, plus a phonon of frequency ω, is proportional
to ω for small ω. At finite temperatures, the
background curve intersects the “witch” with finite
slope on the right, and zero slope on the left,” because 
the rate of phonon production is proportional to
ω[1+(e^{βω}−1)⁻¹] while the annihilation rate is proportional
to ω(e^{βω}−1)⁻¹. All these statements depend only
on the fact that the coupling is a power series in the
a_k and a_k*, with each creation or destruction operator
accompanied by a factor k.


% This is not exactly true. Processes involving several phonons 
can give rise to a small finite slope on the left.
V. Physics of Elementary Particles


Throughout his career, Feynman worked mainly in the area of physics denoted as “particles 
and fields,” and while his interests also roamed into wider pastures, his home base remained 
the study of the strong, electromagnetic and weak interactions of the so-called elementary 
particles. Thus it was inevitable that he would become a major player in the development of 
high energy physics, trying to explain the important experimental discoveries in cosmic rays 
of the decade following World War II, and continuing with the “particle explosion” resulting 
from the use of the great accelerators and detectors as they came on-line in the subsequent 
decades. 


V.A Weak Interactions (V-A Theory) 


Since Becquerel’s discovery of radioactivity in 1896, before the discovery of the electron,
the proton, or the atomic nucleus, the weak interactions were observed and studied in the
form of β-decay. Enrico Fermi formulated a theory of this process at the end of 1933, which
described the phenomenon and, while not perfect, gave a good measure of agreement with
experiment. Like QED, and the atomic and nuclear theories of the time, Fermi’s four-fermion
weak interaction theory obeyed the left-right symmetry known as parity. (The
parity operation, switching from a left-handed to a right-handed spatial coordinate system,
left the physics unchanged.) It was a shock, therefore, when a suspected violation of parity
in weak interactions (such as β-decay and particle decays) was experimentally confirmed in
1957, leading in a short time interval to theories and experiments that became known as the
“parity revolution.” It was a real revolution, for after this no symmetry principle was taken
as so sacrosanct that it was assumed to be valid without experimental confirmation.¹

Feynman played a significant role in that revolution, by publicly questioning the necessity
for left-right symmetry at the 1956 Rochester Conference,² by informally suggesting an early
version of the V−A interaction at the 1957 Rochester Conference, and by coauthoring paper
[41] with Murray Gell-Mann.

Paper [41] was not without controversy as concerned its collaboration, priority, etc. 
However, compared to its competition it uniquely emphasized certain features: the heuris- 
tic value of the two-component relativistic electron equation of second order in the time 
(Kramers equation),⁴ current-current interaction possibly mediated by heavy intermediate
bosons (see p. 197 of [41]), and the conserved vector current.⁵


¹ For a short discussion on the parity revolution, see Pions to Quarks, edited by L.M. Brown, M. Dresden,
and L. Hoddeson (Cambridge: 1989), pp. 23-29.

² The Rochester Conferences, organized originally by Robert Marshak, formed an important annual series
of international conferences on high energy physics. At the 1956 conference, Feynman famously repeated a
question asked of him by Martin Block, about whether parity might be violated in the decay of the K-meson.
For his suggestion of V−A at the 1957 conference, see note 3 of [41].

³ For various relevant accounts of the parity revolution and V−A by some of the participants, see Pions to
Quarks (note 1), pp. 434-496 and note 88, on p. 38. See also the sections on the V−A theory in a recent
biography of Murray Gell-Mann: George Johnson, Strange Beauty (New York: 1999).

⁴ See L.M. Brown, “Two-component fermion theory,” Phys. Rev. 111 (1958): 957-965.

⁵ V and A stand for vector and axial vector couplings, two of five interaction choices permitted by the special
theory of relativity. V−A denotes a particular mixture of the two interactions, which have opposite parity,
and it is their interference term in a weak interaction process that violates parity invariance.
Item [65] is a set of six pedagogical lectures on the weak interactions of hadrons (i.e.
baryons and mesons) delivered by Feynman in 1964 at the International School of Physics
“Ettore Majorana” in Erice, Sicily. The lectures discussed the consequences in β-decay and
weak particle decay of the conserved vector current (CVC) and partially conserved axial
current (PCAC). Feynman treated the decay of strange particles, based on Cabibbo’s theory,
and derived the Goldberger-Treiman relation between the strong meson-nucleon coupling
constant and the weak coupling constant. He discussed the consequences of the Gell-Mann-Ne’eman
symmetry SU(3) for the weak interactions of hadrons.


Selected Papers 

[41] With M. Gell-Mann. Theory of the Fermi interaction. Phys. Rev. 109 (1958): 193-198. 
[65] Consequences of SU(3) symmetry in weak interactions. In Symmetries in Elementary 
Particle Physics (Ettore Majorana). New York, Academic (1966), pp. 111-174. 
The representation of Fermi particles by two-component Pauli spinors satisfying a second order differential
equation and the suggestion that in β decay these spinors act without gradient couplings leads to an essentially
unique weak four-fermion coupling. It is equivalent to equal amounts of vector and axial vector coupling
with two-component neutrinos and conservation of leptons. (The relative sign is not determined
theoretically.) It is taken to be “universal”; the lifetime of the μ agrees to within the experimental errors of
2%. The vector part of the coupling is, by analogy with electric charge, assumed to be not renormalized by
virtual mesons. This requires, for example, that pions are also “charged” in the sense that there is a direct
interaction in which, say, a π⁰ goes to π⁻ and an electron goes to a neutrino. The weak decays of strange
particles will result qualitatively if the universality is extended to include a coupling involving a Λ or Σ fermion.
Parity is then not conserved even for those decays like K→2π or 3π which involve no neutrinos. The theory
is at variance with the measured angular correlation of electron and neutrino in He⁶, and with the fact that
fewer than 10⁻⁴ pion decay into electron and neutrino.


THE failure of the law of reflection symmetry for
weak decays has prompted Salam, Landau, and
Lee and Yang¹ to propose that the neutrino be described
by a two-component wave function. As a consequence
neutrinos emitted in β decay are fully polarized along
their direction of motion. The simplicity of this idea
makes it very appealing, and considerable experimental
evidence is in its favor. There still remains the question
of the determination of the coefficients of the scalar,
vector, etc., couplings.

There is another way to introduce a violation of
parity into weak decays which also has a certain
amount of theoretical raison d’être. It has to do with
the number of components used to describe the electron
in the Dirac equation,

\[ (i\slashed{\nabla} - \slashed{A})\psi = m\psi. \qquad (1) \]


Why must the wave function have four components? 
It is usually explained by pointing out that to describe 
the electron spin we must have two, and we must also 
represent the negative-energy states or positrons, 
requiring two more. Yet this argument is unsatisfactory. 
For a particle of spin zero we use a wave function of 
only one component. The sign of the energy is deter- 
mined by how the wave function varies in space and 
time. The Klein-Gordon equation is second order and 
we need both the function and its time derivative to 
predict the future. So instead of two components for 
spin zero we use one, but it satisfies a second order 
equation. Initial states require specification of that one 
and its time derivative. Thus for the case of spin ½ we
would expect to be able to use a simple two-component
spinor for the wave function, but have it satisfy a
second order differential equation. For example, the
wave function for a free particle would look like
U exp[−i(Et−P·x)], where U has just the two
components of a Pauli spinor and whether the particle
refers to electron or positron depends on the sign of E
in the four-vector p_μ = (E, P).

¹ A. Salam, Nuovo cimento 5, 299 (1957); L. Landau, Nuclear
Phys. 3, 127 (1957); T. D. Lee and C. N. Yang, Phys. Rev. 105,
1671 (1957).

In fact it is easy to do this. If we substitute


\[ \psi = \frac{1}{m}\,(i\slashed{\nabla} - \slashed{A} + m)\,\chi \qquad (2) \]

in the Dirac equation, we find that χ satisfies

\[ (i\slashed{\nabla} - \slashed{A})^{2}\chi = \left[(i\nabla_\mu - A_\mu)\cdot(i\nabla_\mu - A_\mu) - \tfrac{1}{2}\sigma_{\mu\nu}F_{\mu\nu}\right]\chi = m^{2}\chi, \qquad (3) \]

where F_μν = ∂A_ν/∂x_μ − ∂A_μ/∂x_ν and σ_μν = ½i(γ_μγ_ν − γ_νγ_μ).
Now we have a second order equation, but χ still has
four components and we have twice as many solutions
as we want. But the operator γ₅ = γ_xγ_yγ_zγ_t commutes
with σ_μν; therefore there are solutions of (3) for which
iγ₅χ = χ and solutions for iγ₅χ = −χ. We may select,
say, the first set. We always take

\[ i\gamma_5\chi = \chi. \qquad (4) \]

Then we can put the solutions of (3) into one-to-one
correspondence with the Dirac equation (1). For each
ψ there is a unique χ; in fact we find

\[ \chi = \tfrac{1}{2}(1 + i\gamma_5)\psi \qquad (5) \]

by multiplying (2) by 1+iγ₅ and using (4). The
function χ has really only two independent components.
The conventional ψ requires knowledge of both χ and
its time derivative [see Eq. (2)]. Further, the six σ_μν
in (3) can be reduced to just the three σ_xy, σ_yz, σ_zx. Since
σ_xt = iγ_xγ_t = iσ_yz·iγ₅, Eq. (4) shows that σ_xt may be
replaced by iσ_yz when operating on χ as it does in (3).
Let us use the representation


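\[ \gamma_t = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}, \quad i\gamma_5 = -\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \]

where σ_x, σ_y, σ_z are the Pauli matrices. If

\[ \psi = \begin{pmatrix} a \\ b \end{pmatrix}, \]

where a, b are two-component spinors, we find from
(5) that

\[ \chi = \begin{pmatrix} \varphi \\ -\varphi \end{pmatrix}, \]

where φ = ½(a−b). Our Eq. (3) for the two-component
spinor φ is

\[ \left[(i\nabla_\mu - A_\mu)^{2} + \boldsymbol{\sigma}\cdot(\mathbf{B} + i\mathbf{E})\right]\varphi = m^{2}\varphi, \qquad (6) \]

where B_x = F_yz, E_x = F_tx, etc., which is the equation
we are looking for.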
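
The algebra behind the reduction to two components can be checked numerically. The sketch below is only an illustration: it builds the 4×4 matrices in the representation quoted above using plain Python lists, then verifies that iγ₅ has the stated form, that σ_xt equals iσ_yz·iγ₅, and that χ = ½(1+iγ₅)ψ takes the form (φ, −φ) with φ = ½(a−b). The helper names and the test spinors a, b are assumptions introduced for the example.

```python
def matmul(a, b):
    """Matrix product of nested lists (works for 4x4 times 4x4 or 4x1)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = {"x": [[0, 1], [1, 0]], "y": [[0, -1j], [1j, 0]], "z": [[1, 0], [0, -1]]}

# Representation from the text: gamma_t = diag(1, -1), gamma = [[0, sigma], [-sigma, 0]].
GAMMA = {"t": block(I2, Z2, Z2, scale(-1, I2))}
for axis, s in PAULI.items():
    GAMMA[axis] = block(Z2, s, scale(-1, s), Z2)

I4 = block(I2, Z2, Z2, I2)
gamma5 = matmul(matmul(GAMMA["x"], GAMMA["y"]), matmul(GAMMA["z"], GAMMA["t"]))
i_gamma5 = scale(1j, gamma5)

def sigma(mu, nu):
    """sigma_{mu nu} = (i/2)(gamma_mu gamma_nu - gamma_nu gamma_mu)."""
    return scale(0.5j, add(matmul(GAMMA[mu], GAMMA[nu]),
                           scale(-1, matmul(GAMMA[nu], GAMMA[mu]))))

# (a) i*gamma5 = -[[0, 1], [1, 0]] in this representation.
igamma5_matches = close(i_gamma5, scale(-1, block(Z2, I2, I2, Z2)))

# (b) sigma_xt = i sigma_yz (i gamma5); on chi, where i gamma5 chi = chi, it acts as i sigma_yz.
sigma_identity = close(sigma("x", "t"), scale(1j, matmul(sigma("y", "z"), i_gamma5)))

# (c) chi = (1/2)(1 + i gamma5) psi equals (phi, -phi) with phi = (a - b)/2.
a, b = [0.3 + 0.1j, -1.2j], [2.0, 0.5 - 0.7j]   # arbitrary test spinors
psi = [[x] for x in a + b]                       # 4x1 column vector
chi = matmul(scale(0.5, add(I4, i_gamma5)), psi)
phi = [(x - y) / 2 for x, y in zip(a, b)]
chi_matches = close(chi, [[phi[0]], [phi[1]], [-phi[0]], [-phi[1]]])

print(igamma5_matches, sigma_identity, chi_matches)
```

Because the lower half of χ is just minus the upper half, all the information in χ is carried by the two-component φ, which is the point of Eq. (6).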

Rules of calculation for electrodynamics which 
involve only the algebra of the Pauli matrices can be 
worked out on the basis of (6). They, of course, give 
results exactly the same as those calculated with 
Dirac matrices. The details will perhaps be published 
later. 

One of the authors has always had a predilection for
this equation.² If one tries to represent relativistic
quantum mechanics by the method of path integrals, 
the Klein-Gordon equation is easily handled, but the 
Dirac equation is very hard to represent directly. 
Instead, one is led first to (3), or (6), and from there 
one must work back to (1). 

For this reason let us imagine that (6) had been 
discovered first, and (1) only deduced from it later. 
It would make no difference for any problem in electro- 
dynamics, where electrons are neither created nor 
destroyed (except with positrons). But what would we
do if we were trying to describe β decay, in which an
electron is created? Would we use a field operator ψ
directly in the Hamiltonian to represent the annihilation
of an electron, or would we use φ? Now everything
we can do one way, we can represent the other
way. Thus if ψ were used it could be replaced by

\[ \frac{1}{m}\,(i\slashed{\nabla} - \slashed{A} + m)\begin{pmatrix} \varphi \\ -\varphi \end{pmatrix}, \qquad (a) \]

while an expression in which φ was used could be
rewritten by substituting

\[ \tfrac{1}{2}(1 + i\gamma_5)\psi. \qquad (b) \]

If φ were really fundamental, however, we might be
prejudiced against (a) on the grounds that gradients
are involved. That is, an expression for β coupling which
does not involve gradients from the point of view of ψ,
does from the point of view of φ. So we are led to
suggest φ as the field annihilation operator to be used in
β decay without gradients. If φ is written as in (b), we
see this does not conserve parity, but now we know that
that is consistent with experiment.

For this reason one of us suggested the rule³ that the


² R. P. Feynman, Revs. Modern Phys. 20, 367 (1948); Phys.
Rev. 84, 108 (1951).

³ R. P. Feynman, Proceedings of the Seventh Annual Rochester
Conference on High-Energy Nuclear Physics, 1957 (Interscience
Publishers, Inc., New York, 1957).
electron in β decay is coupled directly through φ, or,
what amounts to the same thing, in the usual four-particle
coupling

\[ \sum_i C_i\,(\bar\psi_n O_i \psi_p)(\bar\psi_\nu O_i \psi_e), \qquad (7) \]

we always replace ψ_e by ½(1+iγ₅)ψ_e.

One direct consequence is that the electron emitted
in β decay will always be left-hand polarized (and the
positron right) with polarization approaching 100%
as v→c, irrespective of the kind of coupling. That is a
direct consequence of the projection operator

\[ a = \tfrac{1}{2}(1 + i\gamma_5). \]

A priori we could equally well have made the other
choice and used

\[ \bar a = \tfrac{1}{2}(1 - i\gamma_5); \]

electrons emitted would then be polarized to the right.
We appeal to experiment⁴ to determine the sign.
Notice that a² = a, āa = 0.

But now we go further, and suppose that the same rule
applies to the wave functions of all the particles entering
the interaction. We take for the β-decay interaction
the form

\[ \sum_i C_i\,(\overline{a\psi_n}\,O_i\,a\psi_p)(\overline{a\psi_\nu}\,O_i\,a\psi_e), \]

and we should like to discuss the consequences of this
hypothesis.

The coupling is now essentially completely determined.
Since $\overline{a\psi} = \bar\psi\bar a$, we have in each term expressions
like āO_i a. Now for S, T, and P we have O_i commuting
with γ₅ so that āO_i a = O_i āa = 0. For A and V we have
āO_i a = O_i a² = O_i a and the coupling survives. Furthermore,
for axial vector O_i = iγ_μγ₅, and since iγ₅a = a, we
find O_i a = γ_μ a; thus A leads to the same coupling as V:

\[ (8)^{1/2}\,G\,(\bar\psi_n\gamma_\mu a\psi_p)(\bar\psi_\nu\gamma_\mu a\psi_e), \qquad (8) \]

the most general β-decay interaction possible with our
hypothesis.⁵
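
The statements about which couplings survive the projection can be verified by direct matrix multiplication. The following sketch, which is an illustration rather than anything from the paper, uses the representation γ_t = diag(1, −1), γ = [[0, σ], [−σ, 0]] quoted earlier and checks that a² = a and āa = 0, that āO a vanishes for the S, P, and T operators, that āγ_μ a = γ_μ a, and that the axial operator iγ_μγ₅ acting on a gives the same result as γ_μ. All helper names are assumptions made for the example.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = {"x": [[0, 1], [1, 0]], "y": [[0, -1j], [1j, 0]], "z": [[1, 0], [0, -1]]}
GAMMA = {"t": block(I2, Z2, Z2, scale(-1, I2))}
for axis, s in PAULI.items():
    GAMMA[axis] = block(Z2, s, scale(-1, s), Z2)

AXES = ["x", "y", "z", "t"]
I4 = block(I2, Z2, Z2, I2)
ZERO = scale(0, I4)
gamma5 = matmul(matmul(GAMMA["x"], GAMMA["y"]), matmul(GAMMA["z"], GAMMA["t"]))
i_gamma5 = scale(1j, gamma5)

a_proj = scale(0.5, add(I4, i_gamma5))             # a    = (1 + i gamma5)/2
a_bar = scale(0.5, add(I4, scale(-1, i_gamma5)))   # abar = (1 - i gamma5)/2

def sigma(mu, nu):
    return scale(0.5j, add(matmul(GAMMA[mu], GAMMA[nu]),
                           scale(-1, matmul(GAMMA[nu], GAMMA[mu]))))

def sandwich(op):
    """abar * op * a, the combination that appears in each factor of the coupling."""
    return matmul(matmul(a_bar, op), a_proj)

# a is a projector and abar projects onto the complementary subspace.
projector_ok = close(matmul(a_proj, a_proj), a_proj) and close(matmul(a_bar, a_proj), ZERO)

# S (unit matrix), P (gamma5) and T (sigma_mu_nu) commute with gamma5, so they drop out.
spt_ops = [I4, gamma5] + [sigma(m, n) for m in AXES for n in AXES if m != n]
spt_vanish = all(close(sandwich(op), ZERO) for op in spt_ops)

# V survives: abar gamma_mu a = gamma_mu a.
v_survives = all(close(sandwich(GAMMA[m]), matmul(GAMMA[m], a_proj)) for m in AXES)

# A reduces to V: (i gamma_mu gamma5) a = gamma_mu a.
a_equals_v = all(
    close(matmul(scale(1j, matmul(GAMMA[m], gamma5)), a_proj), matmul(GAMMA[m], a_proj))
    for m in AXES
)

print(projector_ok, spt_vanish, v_survives, a_equals_v)
```

The check is representation-specific as written, but the identities rest only on (iγ₅)² = 1 and on γ₅ anticommuting with each γ_μ.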

This coupling is not yet completely unique, because
our hypothesis could be varied in one respect. Instead
of dealing with the neutron and proton, we could have
made use of the antineutron and antiproton, considering
them as the “true particles.” Then it would be
the wave function ψ_n̄ of the antineutron that enters
with the factor a. We would be led to

\[ (8)^{1/2}\,G\,(\bar\psi_{\bar p}\gamma_\mu a\psi_{\bar n})(\bar\psi_\nu\gamma_\mu a\psi_e). \qquad (9) \]

This amounts to the same thing as

\[ (8)^{1/2}\,G\,(\bar\psi_n\gamma_\mu \bar a\,\psi_p)(\bar\psi_\nu\gamma_\mu a\psi_e), \qquad (9') \]

and from the a priori theoretical standpoint is just as
good a choice as (8).
We have assumed that the neutron and proton are 


⁴ See, for example, Boehm, Novey, Barnes, and Stech, Phys.
Rev. 108, 1497 (1957).

⁵ A universal V, A interaction has also been proposed by E. C. G.
Sudarshan and R. E. Marshak (to be published).
either both “particles” or both “antiparticles.” We
have defined the electron to be a “particle” and the 
neutrino must then be a particle too. 

We shall further assume the interaction “universal,” 
so for example it is 


\[ (8)^{1/2}\,G\,(\bar\psi_\mu\gamma_\mu a\psi_\nu)(\bar\psi_\nu\gamma_\mu a\psi_e) \qquad (10) \]

for μ decay, as currently supposed; the μ⁻ is then a
particle. Here the other choice, that the μ⁻ is an antiparticle,
leads to $(8)^{1/2}G(\bar\psi_\nu\gamma_\mu a\psi_\mu)(\bar\psi_\nu\gamma_\mu a\psi_e)$, which is
excluded by experiment since it leads to a spectrum
falling off at high energy (Michel’s ρ=0).

Since the neutrino function always appears in the
form aψ_ν, only neutrinos with left-hand spin can exist.
That is, the two-component neutrino theory with
conservation of leptons is valid. Our neutrinos spin
oppositely to those of Lee and Yang.⁶ For example, a
β particle is a lepton and spins to the left; emitted with
it is an antineutrino which is an antilepton and spins
to the right. In a transition with ΔJ=0 they tend to go
parallel to cancel angular momentum. This is the
angular correlation typical of vector coupling.

We have conservation of leptons and double β
decay is excluded.

There is a symmetry in that the incoming particles 
can be exchanged without affecting the coupling. Thus 
if we define the symbol 


\[ (AB)(CD) = (\bar\psi_A\gamma_\mu a\psi_B)(\bar\psi_C\gamma_\mu a\psi_D), \]

we have (AB)(CD) = (CB)(AD). (We have used anticommuting
ψ’s; for C-number ψ’s the interchange gives
a minus sign.⁷)

The capture of muons by nucleons results from a
coupling (n̄p)(ν̄μ). It is already known that this
capture is fitted very well if the coupling constant and
coupling are the same as in β decay.⁸

If we postulate that the universality extends also
to the strange particles, we may have couplings such
as (Λ̄⁰p)(ν̄μ), (Λ̄⁰p)(ν̄e), and (Λ̄⁰p)(p̄n). The (Λ̄⁰p)
might be replaced by (Σ̄⁻n), etc. At any rate the
existence of such couplings would account qualitatively
for the existence of all the weak decays.
Consider, for example, the decay of the K⁺. It can go
virtually into an anti-Λ⁰ and a proton by the fairly strong
coupling of strange particle production. This by the
weak decay (Λ̄⁰p)(p̄n) becomes a virtual antineutron
and proton. These become, on annihilating, two or
three pions. The parity is not conserved because of the

⁶ This is only because they used S and T couplings in β decay;
had they used V and A, their theory would be similar to ours, with
left-handed neutrinos.

⁷ We can express (AB)(CD) directly in terms of the two-component
spinors φ: (AB)(CD) = 4(φ_A*φ_B)(φ_C*φ_D) − 4(φ_A*σφ_B)·(φ_C*σφ_D).
If we put φ_A = (A₁, A₂), etc., where A₁ and A₂ are complex
numbers, we obtain 8(A₁*C₂* − A₂*C₁*)(B₁D₂ − B₂D₁) and
the symmetry is evident.

⁸ See, for example, J. L. Lopes, Phys. Rev. (to be published);
L. Michel, Progress in Cosmic-Ray Physics, edited by J. G. Wilson
(Interscience Publishers, Inc., New York, 1952), Vol. 1, p. 125.
a in front of the nucleons in the virtual transition. The
theory in which only the neutrino carries the a cannot
explain the parity failure for decays not involving
neutrinos (the τ-θ puzzle). Here we turn the argument
around; both the lack of parity conservation for the
K and the fact that neutrinos are always fully polarized
are consequences of the same universal weak coupling.

For β decay the expression (8) will be recognized as
that for the two-component neutrino theory with
couplings V and A with equal coefficients and opposite
signs [expression (9) or (9′) makes the coupling
V+A]. The coupling constant of the Fermi (V) part
is equal to G. This constant has been determined⁹ from
the decay of O¹⁴ to be (1.41±0.01)×10⁻⁴⁹ erg cm³.
In units where ħ=c=1, and M is the mass of the
proton, this is

\[ G = (1.01 \pm 0.01)\times 10^{-5}/M^{2}. \qquad (11) \]


At the present time several β-decay experiments seem
to be in disagreement with one another. Limiting
ourselves to those that are well established, we find
that the most serious disagreement with our theory is
the recoil experiment in He⁶ of Rustad and Ruby¹⁰
indicating that the T interaction is more likely than
the A. Further check on this is obviously very desirable.
Any experiment indicating that the electron is not
100% left polarized as v→c for any transition allowed
or forbidden would mean that (8) and (9) are incorrect.
An interesting experiment is the angular distribution
of electrons from polarized neutrons for here there is an
interference between the V and A contributions such
that if the coupling is V−A there is no asymmetry,
while if it is V+A there is a maximal asymmetry. This
would permit us to choose between the alternatives (8)
and (9). The present experimental results¹¹ agree with
neither alternative.

We now look at the muon decay. The fact that the
two neutrinos spin oppositely and the ρ parameter is ¾
permitted us to decide that the μ⁻ is a lepton if the
electron is, and determines the order of (μ̄,ν) which
we write in (10). But now we can predict the direction
of the electron in the π⁻→μ⁻+ν̄→e⁻+ν+ν̄ sequence.
Since the muon comes out with an antineutrino which
spins to the right, the muon must also be spinning to
the right (all senses of spin are taken looking down the
direction of motion of the particle in question). When
the muon disintegrates with a high-energy electron the
two neutrinos are emitted in the opposite direction.
They have spins opposed. The electron emitted must
spin to the left, but must carry off the angular momentum
of the muon, so it must proceed in the direction
opposite to that of the muon. This direction agrees with
experiment. The proposal of Lee and Yang predicted


⁹ Bromley, Almquist, Gove, Litherland, Paul, and Ferguson,
Phys. Rev. 105, 957 (1957).
¹⁰ B. M. Rustad and S. L. Ruby, Phys. Rev. 97, 991 (1955).
¹¹ Burgy, Epstein, Krohn, Novey, Raboy, Ringo, and Telegdi,
Phys. Rev. 107, 1731 (1957).




the electron spin here to be opposite to that in the case
of β decay. Our β-decay coupling is V, A instead of S, T
and this reverses the sign. That the electron have the
same spin polarization in all decays (β, muon, or
strange particles) is a consequence of putting aψ_e in
the coupling for this particle. It would be interesting
to test this for the muon decay.

Finally we can calculate the lifetime of the muon, 
which comes out 


\[ \tau = 192\pi^{3}/G^{2}\mu^{5} = (2.26 \pm 0.04)\times 10^{-6}\ \text{sec} \]

using the value (11) of G. This agrees with the experimental
lifetime¹² (2.22±0.02)×10⁻⁶ sec.
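
The arithmetic behind this lifetime is easy to reproduce. The sketch below evaluates τ = 192π³/(G²μ⁵) in units where ħ=c=1 and converts to seconds; the numerical values of ħ, the proton mass, and the muon mass are present-day reference values supplied here as assumptions (the paper does not list the inputs it used), and the function name is illustrative.

```python
import math

# Assumed present-day reference values (not quoted in the paper).
HBAR_MEV_S = 6.582119569e-22   # hbar in MeV * s
M_PROTON_MEV = 938.272         # proton mass M in MeV
M_MUON_MEV = 105.658           # muon mass mu in MeV

def muon_lifetime_seconds(g_times_m2, m_proton=M_PROTON_MEV, m_muon=M_MUON_MEV):
    """tau = 192 pi^3 / (G^2 mu^5), with G = g_times_m2 / M^2 as in Eq. (11)."""
    G = g_times_m2 / m_proton ** 2                         # MeV^-2
    rate = G ** 2 * m_muon ** 5 / (192.0 * math.pi ** 3)   # decay rate in MeV
    return HBAR_MEV_S / rate                               # convert MeV^-1 to seconds

tau = muon_lifetime_seconds(1.01e-5)
tau_low_G = muon_lifetime_seconds(1.00e-5)    # G at the lower end of its quoted error
tau_high_G = muon_lifetime_seconds(1.02e-5)   # G at the upper end of its quoted error

print(tau, tau_low_G, tau_high_G)   # tau is about 2.26e-06 s
```

Since τ scales as 1/G², the 1% uncertainty in G translates into roughly 2% in τ, consistent with the ±0.04 quoted above.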

It might be asked why this agreement should be so 
good. Because nucleons can emit virtual pions there 
might be expected to be a renormalization of the 
effective coupling constant. On the other hand, if 
there is some truth in the idea of an interaction with a 
universal constant strength it may be that the other 
interactions are so arranged as not to destroy this
constant. We have an example in electrodynamics. Here
the coupling constant e to the electromagnetic field 
is the same for all particles coupled. Yet the virtual 
mesons do not disturb the value of this coupling 
constant. Of course the distribution of charge is altered, 
so the coupling for high-energy fields is apparently 
reduced (as evidenced by the scattering of fast electrons 
by protons), but the coupling in the low-energy limit, 
which we call the total charge, is not changed. 

Using this analogy to electrodynamics, we can see
immediately how the Fermi part, at least, can be made
to have no renormalization. For the sake of this discussion
imagine that the interaction is due to some
intermediate (electrically charged) vector meson of
very high mass M₀. If this meson is coupled to the
“current” (ψ̄_p γ_μ a ψ_n) and (ψ̄_ν γ_μ a ψ_e) by a coupling
(4πf²)^{1/2}, then the interaction of the two “currents”
would result from the exchange of this “meson” if
4πf²M₀⁻² = (8)^{1/2}G. Now we must arrange that the total
current

\[ J_\mu = (\bar\psi_p\gamma_\mu a\psi_n) + (\bar\psi_\nu\gamma_\mu a\psi_e) + (\bar\psi_\nu\gamma_\mu a\psi_\mu) + \cdots \qquad (12) \]

be not renormalized. There are no known large interaction
terms to renormalize the (ν̄e) or (ν̄μ), so let us
concentrate on the nucleon term. This current can be
split into two: J_μ = ½(J_μ^V + J_μ^A), where J_μ^V = ψ̄_p γ_μ ψ_n and
J_μ^A = ψ̄_p iγ_μγ₅ ψ_n. The term J_μ^V = ψ̄γ_μτ₊ψ, in isotopic spin
notation, is just like the electric current. The electric
current is

\[ J_\mu^{\rm el} = \bar\psi\gamma_\mu(\tfrac{1}{2} + \tau_z)\psi. \]

The term ½ψ̄γ_μψ is conserved, but the term ψ̄γ_μτ_zψ is
not, unless we add the current of pions, i[φ*T_z∇_μφ
− (∇_μφ*)T_zφ], because the pions are charged. Likewise
ψ̄γ_μτ₊ψ is not conserved but the sum

\[ J_\mu^{V} = \bar\psi\gamma_\mu\tau_+\psi + i[\varphi^{*}T_+\nabla_\mu\varphi - (\nabla_\mu\varphi)^{*}T_+\varphi] \qquad (13) \]

¹² W. E. Bell and E. P. Hincks, Phys. Rev. 84, 1243 (1951).
is conserved, and, like electricity, leads to a quantity
whose value (for low-energy transitions) is unchanged
by the interaction of pions and nucleons. If we include
interactions with hyperons and K particles, further
terms must be added to obtain the conserved quantity.

We therefore suppose that this conserved quantity
be substituted for the vector part of the first term in
(12). Then the Fermi coupling constant will be strictly
universal, except for small electromagnetic corrections.
That is, the constant G from the μ decay, which is
accurately V−A, should be also the exact coupling
constant for at least the vector part of the β decay.
(Since the energies involved are so low, the spread in
space of J_μ^V due to the meson couplings is not
important, only the total “charge.”) It is just this part
which is determined by the experiment with O¹⁴, and
that is why the agreement should be so close.

The existence of the extra term in (13) means that
other weak processes must be predicted. In this case
there is, for example, a coupling

\[ (8)^{1/2}\,G\,i\,[\varphi^{*}T_+\nabla_\mu\varphi - (\nabla_\mu\varphi)^{*}T_+\varphi]\,(\bar\psi_\nu\gamma_\mu a\psi_e) \]

by which a π⁻ can go to a π⁰ with emission of ν̄ and e.
The amplitude is

\[ 4G\,(p_\mu^{-} + p_\mu^{0})\,(\bar\psi_\nu\gamma_\mu a\psi_e), \]

where p⁻, p⁰ are the four-momenta of π⁻ and π⁰.
Because of the low energies involved, the probability
of the disintegration is too low to be observable. To
be sure, the process π⁻→π⁰+e+ν̄ could be understood
to be qualitatively necessary just from the existence of
β decay. For the π⁻ may become virtually an antiproton
and neutron, the neutron decay virtually to a
proton, e, and ν̄ by β decay and the protons annihilate
forming the π⁰. But the point is that by our principle
of a universal coupling whose vector part requires no
renormalization we can calculate the rate directly
without being involved in closed loops, strong couplings,
and divergent integrals.

For any transition in which strangeness doesn’t
change, the current J_μ^V is the total current density of
isotopic spin T₊. Thus the vector part gives transitions
ΔT=0 with square matrix element T(T+1)−T_zT_z′
if we can neglect the energy release relative to the rest
mass of the particle decaying. For the nucleon and
K⁻→K̄⁰+e+ν̄ the square of the matrix element is 1,
for the pion and Σ⁻→Σ⁰+e+ν̄ it is 2. The axial coupling
in the low-energy limit is zero between states of zero
angular momentum like the π meson or O¹⁴, so for both
of these we can compute the lifetime knowing only the
vector part. Thus the π⁻→π⁰+e+ν̄ decay should have
the same ft value as O¹⁴. Unfortunately because of the
very small energies involved (because isotopic spin is
such a good quantum number) none of these decays
of mesons or hyperons are fast enough to observe in
competition to other decay processes in which T or
strangeness changes.
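
The quoted values 1 and 2 follow directly from the formula T(T+1)−T_zT_z′. The short sketch below evaluates it with exact fractions for the four transitions named in the text; the isospin assignments (T, T_z) are the standard ones and are supplied here as assumptions, as is the function name.

```python
from fractions import Fraction

def squared_matrix_element(T, Tz_initial, Tz_final):
    """T(T+1) - Tz*Tz' for a Delta T = 0 transition in which Tz changes by one unit."""
    return T * (T + 1) - Tz_initial * Tz_final

half = Fraction(1, 2)
one = Fraction(1)

# (T, initial Tz, final Tz); standard isospin assignments, assumed for illustration.
transitions = {
    "neutron -> proton": (half, -half, half),
    "K- -> K0bar": (half, -half, half),
    "pi- -> pi0": (one, -one, Fraction(0)),
    "Sigma- -> Sigma0": (one, -one, Fraction(0)),
}

results = {name: squared_matrix_element(*args) for name, args in transitions.items()}
for name, value in results.items():
    print(name, value)
```

With T_z′ = T_z+1 the expression equals T(T+1)−T_z(T_z+1), the usual squared matrix element of the isospin raising operator.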
This principle, that the vector part is not renormalized,
may be useful in deducing some relations among
the decays of the strange particles.

Now with present knowledge it is not so easy to say
whether or not a pseudovector current like ψ̄ iγ₅γ_μτ₊ψ
can be arranged to be not renormalized. The present
experiments¹³ in β decay indicate that the ratio of the
coupling constant squared for Gamow-Teller and Fermi
is about 1.3±0.1. This departure from 1 might be a
renormalization effect. On the other hand, an interesting
theoretical possibility is that it is exactly unity
and that the various interactions in nature are so
arranged that it need not be renormalized (just as for
V). It might be profitable to try to work out a way of
doing this. Experimentally it is not excluded. One
would have to say that the ft value of 1220±150
measured¹⁵ for the neutron was really 1520, and that
some uncertain matrix elements in the β decay of the
mirror nuclei were incorrectly estimated.

The decay of the $\pi^-$ into a $\mu^-$ and $\bar\nu$ might be understood 
as a result of a virtual process in which the $\pi$ 
becomes a nucleon loop which decays into the $\mu + \bar\nu$. 
In any event one would expect a decay into $e + \bar\nu$ also. 
The ratio of the rates of the two processes can be 
calculated without knowledge of the character of the 
closed loops. It is $(m_e/m_\mu)^2 (1 - m_\mu^2/m_\pi^2)^{-2} = 13.6 \times 10^{-5}$. 
Experimentally¹⁶ no $\pi \to e + \nu$ have been found, indicating 
that the ratio is less than $10^{-5}$. This is a very 
serious discrepancy. The authors have no idea on how 
it can be resolved. 
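
As a rough numerical check of the ratio $(m_e/m_\mu)^2 (1 - m_\mu^2/m_\pi^2)^{-2}$, one can simply evaluate it. The sketch below uses present-day mass values, which are an assumption and not the figures available to the authors, so the result is close to the quoted number without reproducing it exactly; the function name is hypothetical.

```python
# Masses in MeV. These are modern values, assumed for illustration only.
M_E = 0.511
M_MU = 105.66
M_PI = 139.57

def pi_e_to_pi_mu_ratio(m_e=M_E, m_mu=M_MU, m_pi=M_PI):
    """Evaluate (m_e/m_mu)^2 * (1 - m_mu^2/m_pi^2)^(-2)."""
    mass_factor = (m_e / m_mu) ** 2
    kinematic_factor = (1.0 - (m_mu / m_pi) ** 2) ** -2
    return mass_factor * kinematic_factor

ratio = pi_e_to_pi_mu_ratio()
print(f"{ratio:.2e}")
```

The first factor is tiny because the electron is so much lighter than the muon; the second factor, of order five, is what lifts the ratio to the $10^{-4}$ range.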

We have adopted the point of view that the weak 
interactions all arise from the interaction of a current 
$J_\mu$ with itself, possibly via an intermediate charged 
vector meson of high mass. This has the consequence 
that any term in the current must interact with all the 
rest of the terms and with itself. To account for $\beta$ decay 
and $\mu$ decay we have to introduce the terms in (12) into 
the current; the phenomenon of $\mu$ capture must then 
also occur. In addition, however, the pairs $e\nu$, $\mu\nu$, and $pn$ 
must interact with themselves. In the case of the 
$(\bar e\nu)(\bar\nu e)$ coupling, experimental detection of electron-neutrino 
scattering might some day be possible if 
electron recoils are looked for in materials exposed to 
pile neutrinos; the cross section¹⁷ with our universal 
coupling is of the order of $10^{-45}$ cm². 
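
The size of this cross section follows from the scale $\sigma_0 = 2G^2m^2/\pi$ given in the accompanying footnote. The sketch below converts it to cm², assuming $GM^2 = 1.01 \times 10^{-5}$ with $M$ the proton mass (the value used elsewhere in this volume) and modern values for the mass ratio and the electron Compton wavelength; all names in the code are hypothetical.

```python
import math

G_TIMES_MP2 = 1.01e-5             # G * M_p^2, dimensionless (assumed input)
PROTON_TO_ELECTRON = 1836.15      # M_p / m_e (modern value, assumed)
ELECTRON_COMPTON_CM = 3.8616e-11  # hbar / (m_e c) in cm (modern value, assumed)

def sigma0_cm2():
    """sigma_0 = 2 G^2 m^2 / pi, converted from natural units to cm^2."""
    g_me2 = G_TIMES_MP2 / PROTON_TO_ELECTRON ** 2  # G * m_e^2, dimensionless
    return 2.0 * g_me2 ** 2 / math.pi * ELECTRON_COMPTON_CM ** 2

def neutrino_cross_section(omega):
    """Total cross section for neutrinos of energy omega (units of m_e)."""
    return sigma0_cm2() * omega ** 2 / (1.0 + 2.0 * omega)

def antineutrino_cross_section(omega):
    """Total cross section for antineutrinos of energy omega (units of m_e)."""
    return sigma0_cm2() * (omega / 6.0) * (1.0 - (1.0 + 2.0 * omega) ** -3)

print(f"sigma_0 = {sigma0_cm2():.1e} cm^2")
```

With these formulas the two cross sections coincide at low energy, and the antineutrino one tends to a third of the neutrino one at high energy.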


13 A. Winther and O. Kofoed-Hansen, Kgl. Danske Videnskab. 
Selskab, Mat.-fys. Medd. (to be published). 

14 This slight inequality of Fermi and Gamow-Teller coupling 
constants is not enough to account for the experimental results 
of reference 11 on the electron asymmetry in polarized neutron 
decay. 

15 Spivac, Sosnovsky, Prokofiev, and Sokolov, Proceedings of the 
International Conference on the Peaceful Uses of Atomic Energy, 
Geneva, 1955 (United Nations, New York, 1956), A/Conf. 8/p/650. 

16 C. Lattes and H. L. Anderson, Nuovo cimento (to be 
published). 

17 For neutrinos of energy $\omega$ (in units of the electron mass $m$) the 
total cross section is $\sigma_0\,\omega^2/(1+2\omega)$, and the spectrum of recoil 
energies $\epsilon$ of the electron is uniform $d\epsilon$. For antineutrinos it is 
$\sigma_0(\omega/6)[1-(1+2\omega)^{-3}]$ with a recoil spectrum varying as 
$(1+\omega-\epsilon)^2$. Here $\sigma_0 = 2G^2m^2/\pi = 8.3 \times 10^{-45}$ cm². 
To account for all observed strange particle decays 
it is sufficient to add to the current a term like $(\bar p\Lambda^0)$, 
$(\bar p\Sigma^0)$, or $(\bar\Sigma^- n)$, in which strangeness is increased by 
one as charge is increased by one. For instance, $(\bar p\Lambda^0)$ 
gives us the couplings $(\bar p\Lambda^0)(\bar e\nu)$, $(\bar p\Lambda^0)(\bar\mu\nu)$, and 
$(\bar p\Lambda^0)(\bar n p)$. A direct consequence of the coupling 
$(\bar p\Lambda^0)(\bar e\nu)$ would be the reaction 

$$\Lambda^0 \to p + e + \bar\nu \qquad (14)$$

at a rate $5.3 \times 10^{7}$ sec⁻¹, assuming no renormalization 
of the constants.¹⁸ Since the observed lifetime of the $\Lambda^0$ 
(for disintegration into other products, like $p + \pi^-$, 
$n + \pi^0$) is about $3 \times 10^{-10}$ sec, we should observe process 
(14) in about 1.6% of the disintegrations. This is not 
excluded by experiments. If a term like $(\bar\Sigma^- n)$ appears, 
the decay $\Sigma^- \to n + e^- + \bar\nu$ is possible at a predicted rate 
$3.5 \times 10^{8}$ sec⁻¹ and should occur (for $\tau_{\Sigma^-} = 1.6 \times 10^{-10}$ 
sec) in about 5.6% of the disintegrations of the $\Sigma^-$. 
Decays with $\mu$ replacing the electron are still less 
frequent. That such disintegrations actually occur at 
the above rates is not excluded by present experiments. 
It would be very interesting to look for them and to 
measure their rates. 

These rates were calculated from the formula 
Rate $= 2G^2W^5c/30\pi^3$, derived with neglect of the 
electron mass. Here $W = (M_\Lambda^2 - M_p^2)/2M_\Lambda$ is the 
maximum electron energy possible and $c$ is a correction 
factor for recoil. If $x = W/M_\Lambda$ it is 

$$c = -\tfrac{15}{16}\,x^{-5}(1-2x)^2 \ln(1-2x) - \tfrac{5}{8}\,x^{-4}(1-x)(3-6x-2x^2),$$

and equals 1 for small $x$, about 1.25 for the $\Sigma$ decay, and 
2.5 for $M_p = 0$. 
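
The recoil factor and the rate formula are easy to evaluate numerically. The sketch below assumes the form $c = -\tfrac{15}{16}x^{-5}(1-2x)^2\ln(1-2x) - \tfrac{5}{8}x^{-4}(1-x)(3-6x-2x^2)$, whose coefficients are the ones fixed by the stated limits ($c \to 1$ for small $x$, $c = 2.5$ at $x = 1/2$). The masses, $\hbar$, and the choice $G = 1.01 \times 10^{-5}/M_p^2$ are modern or assumed inputs, and the function names are hypothetical.

```python
import math

def recoil_correction(x):
    """c(x) = -(15/16) x^-5 (1-2x)^2 ln(1-2x) - (5/8) x^-4 (1-x)(3-6x-2x^2)."""
    if not 0.0 < x <= 0.5:
        raise ValueError("x = W/M must lie in (0, 1/2]")
    # (1-2x)^2 ln(1-2x) tends to 0 as x -> 1/2, so handle that endpoint explicitly.
    log_term = 0.0 if x == 0.5 else (1 - 2 * x) ** 2 * math.log(1 - 2 * x)
    return (-(15 / 16) * log_term / x ** 5
            - (5 / 8) * (1 - x) * (3 - 6 * x - 2 * x ** 2) / x ** 4)

def semileptonic_rate(m_parent, m_daughter, g_mev=1.01e-5 / 938.27 ** 2):
    """Rate = 2 G^2 W^5 c / (30 pi^3) in sec^-1; masses in MeV, G in MeV^-2."""
    hbar_mev_s = 6.582e-22  # converts an energy width in MeV to a rate in sec^-1
    w = (m_parent ** 2 - m_daughter ** 2) / (2 * m_parent)
    c = recoil_correction(w / m_parent)
    return 2 * g_mev ** 2 * w ** 5 * c / (30 * math.pi ** 3) / hbar_mev_s

lambda_rate = semileptonic_rate(1115.68, 938.27)  # Lambda -> p e nu
sigma_rate = semileptonic_rate(1197.45, 939.57)   # Sigma- -> n e nu
print(f"{lambda_rate:.1e} {sigma_rate:.1e}")
```

With these inputs the two rates come out at the same order of magnitude as the values quoted in the text; exact agreement should not be expected, since the masses used in 1957 differ slightly.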

It should be noted that decays like $\Sigma^+ \to n + e^+ + \nu$ 
are forbidden if we add to the current only terms for 
which $\Delta S = +1$ when $\Delta Q = +1$. In order to cause such 
a decay, the current would have to contain a term with 
$\Delta S = -1$ when $\Delta Q = +1$, for example $(\bar\Sigma^+ n)$. Such a 
term would then be coupled not only to $(\bar\nu e)$, but also 
to all the others, including one like $(\bar p\Lambda^0)$. But a coupling 
of the form $(\bar\Sigma^+ n)(\bar\Lambda^0 p)$ leads to strange particle decays 
with $\Delta S = \pm 2$, violating the proposed rule $\Delta S = \pm 1$. 
It is important to know whether this rule really holds; 
there is evidence for it in the apparent absence of the 
decay $\Xi^- \to \pi^- + n$, but so few $\Xi$ particles have been 
seen that this is not really conclusive. We are not sure, 
therefore, whether terms like $(\bar\Sigma^+ n)$ are excluded from 
the current. 

We deliberately ignore the possibility of a neutral 
current, containing terms like $(\bar e e)$, $(\bar\mu e)$, $(\bar n n)$, etc., 
and possibly coupled to a neutral intermediate field. 
No weak coupling is known that requires the existence 
of such an interaction. Moreover, some of these 
couplings, like $(\bar e e)(\bar\mu e)$, leading to the decay of a muon 
into three electrons, are excluded by experiment. 

It is amusing that this interaction satisfies simultaneously 
almost all the principles that have been 
proposed on simple theoretical grounds to limit the 
possible $\beta$ couplings. It is universal, it is symmetric, it 
produces two-component neutrinos, it conserves leptons, 
it preserves invariance under CP and T, and it is the 
simplest possibility from a certain point of view (that 
of two-component wave functions emphasized in this 
paper). 


18 R. E. Behrends and C. Fronsdal, Phys. Rev. 106, 345 (1957). 

These theoretical arguments seem to the authors to be 
strong enough to suggest that the disagreement with 
the He⁶ recoil experiment and with some other less 
accurate experiments indicates that these experiments 
are wrong. The $\pi \to e + \bar\nu$ problem may have a more 
subtle solution. 

After all, the theory also has a number of successes. 
It yields the rate of $\mu$ decay to 2% and the asymmetry 
in direction in the $\pi \to \mu \to e$ chain. For $\beta$ decay, it agrees 
with the recoil experiments¹⁹ in A³⁵ indicating a vector 
coupling, the absence of Fierz terms distorting the 
allowed spectra, and the more recent electron spin 
polarization* measurements in $\beta$ decay. 


19 Herrmansfeldt, Maxson, Stahelin, and Allen, Phys. Rev. 107, 
641 (1957). 


FEYNMAN AND M. GELL-MANN 


Besides the various experiments which this theory 
suggests be done or rechecked, there are a number of 
directions indicated for theoretical study. First it is 
suggested that all the various theories, such as meson 
theory, be recast in the form with the two-component 
wave functions to see if new possibilities of coupling, 
etc., are suggested. Second, it may be fruitful to analyze 
further the idea that the vector part of the weak 
coupling is not renormalized; to see if a set of couplings 
could be arranged so that the axial part is also not 
renormalized; and to study the meaning of the trans- 
formation groups which are involved. Finally, attempts 
to understand the strange particle decays should be 
made assuming that they are related to this universal 
interaction of definite form. 
CONSEQUENCES OF SU₃ SYMMETRY IN WEAK INTERACTIONS 


R.P. Feynman, 


California Institute of Technology. 


Introduction 1st LECTURE 


These lectures will cover the relationship of SU₃ and the weak 
interactions. The lectures will be geared for experimental people so that they 
may get an idea of how our theoretical predictions arise. At first, I will 
speak a little bit about how calculations are made for the weak decays, then 
we shall consider SU₃, and finally the effects of unitary symmetry upon the 
weak interactions. The subject matter is split in this way so that one may 
get a clearer idea of the origins of the various problems in the theory of 
weak interactions. Not all of our difficulties arise from SU₃ nor do all 
of the successes, and it is important to realize this. 


The theory of weak decays is very unsatisfactory except that it 
agrees with experiment. To understand that remark let us consider the muon. 
A muon is a particle which has exactly the same properties as the electron 
except that its mass is 207 times the mass of the electron. This statement 
completely describes our experiments with the muon, but such a comment is also 
unsatisfactory for a true theorist. Experimentalists find a beautiful and 
simple thing which is easy for the theorists to describe. Nevertheless, we 
must be unhappy about this situation because we have no idea of why this 
particle exists. Similarly, the theory of weak decay, up to the point where 
we encounter strangeness changing interactions, is accurate but unsatisfactory. 
There are various mysterious properties which I shall mention as I go on, but 
I should like to remind you that the most mysterious aspect of the weak decays 
is that they exist at all. It seems so much simpler just to forget them. 
There is no clue from electromagnetism, from gravity, or from nuclear forces 
that the weak interactions must exist. They seem to have no connection 


with the rest of the world. 
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Form of the four-fermion weak interaction 


The theory of the weak decays was originated by Fermi. Fermi tried 
to describe mathematically an idea of Pauli's concerning the neutrino. For 
example, in K capture an orbital electron in the K shell is eaten by a proton 
in the nucleus with the result that a neutron plus a neutrino is created. 
Fermi attempted to write down an abstract description of this process. He 
assumed that there was a Hamiltonian which contained a term of the form 

$$a_n^{*}\, a_p\, a_\nu^{*}\, a_e .$$

In this form $a_n^{*}$ is an operator which creates a neutron, $a_p$ is an operator 
which destroys a proton, $a_\nu^{*}$ creates a neutrino, and $a_e$ destroys an electron. 
Such a term is present in a Hamiltonian of the world with a certain strength 
characterized by a coupling constant G, and gives rise to a certain amplitude 
for the K-capture reaction, but no mechanism is described by it. Another 
way of describing the reaction is to draw a diagram with a four-point coupling 
and to associate with that diagram the strength constant G, together with 
certain well-defined rules for calculating the amplitude. The decay amplitude 
is proportional to G and the amplitude for finding the proton and the electron 
together. Given the amplitude for the process, the rate at which the reaction 
proceeds is given by a Golden Rule of the form 

Rate = $2\pi\,|\mathrm{Amp}|^2$ (Density of final states). 


Actually things are a bit more complicated. A fermion, described by the 
Dirac equation, is represented by four amplitudes, two spin possibilities 
times two charge possibilities. The product of four such $a$'s will produce $4^4$ 
possible terms, each of which might have its own characteristic coupling 
constant. Actually such a horrible mess is simplified by requiring relativistic 
invariance of the amplitude. This reduces the number of possible 
coefficients to ten. Mathematically, if we represent the four-component 
spinor wave function of a particle by its name, the possible forms for the 
coupling can be represented as 

$$(\bar\nu\,1\,e)(\bar n\,1\,p), \quad (\bar\nu\gamma_\mu e)(\bar n\gamma_\mu p), \quad (\bar\nu\,1\,e)(\bar n\gamma_5 p),$$

and so on. Fermi, however, was not satisfied with writing down all the 
possibilities. He made a guess that the interaction was vector 

$$(\bar\nu\gamma_\mu e)(\bar n\gamma_\mu p).$$

Having limited the form of the amplitude by this guess, he was then able to 
calculate the properties of the beta decay. 


Parity non-conservation 

The most prominent feature is the shape of the spectrum, which 
depends almost wholly on the density of final states. Early experiments 
failed to confirm this spectrum, but those experiments were incorrect because 
of rescattering in the foils. The subsequent history of beta-decay investiga- 
tions is beclouded by two decades of inaccurate observations and poor theoretical 
suggestions, which I shall not discuss. The final resolution of the puzzle 


is simply to multiply each wave function of a particle by the left-handed 
helicity operator, 

$$a = (1 + i\gamma_5)/2 .$$

This prescription reduces to one the number of independent coupling constants 
in beta decay. 


The helicity operator has certain properties, namely 


Using this helicity operator in the interaction also results in a 
violation of parity conservation in the weak interaction, which was found in 
1957. The final form for the beta-decay coupling in the Lagrangian may then be 
written as 

$$\sqrt{8}\,G\,(\overline{a\nu}\,\gamma_\mu\,ae)(\overline{an}\,\gamma_\mu\,ap).$$


With such a coupling, the particles are emitted polarized along the direction 
of their motion. They are spinning to the left with a probability of (1+ v)/2, 
and to the right with a probability of (1 - v)/2, where v is the particle 
velocity in units of c. The opposite polarization holds for the antiparticles. 


For beta decay the above Lagrangian may be written in the form 

$$\frac{G}{\sqrt{2}}\,[\bar\nu\gamma_\mu(1+i\gamma_5)e]\,[\bar n\gamma_\mu(1+i\gamma_5)p],$$

and it is common today to identify the various terms in this formula with 
current-density operators. People like to define the vector current of the 
leptons to be $J_\mu^V = (\bar\nu\gamma_\mu e)$, and the axial vector current of the leptons as 
$J_\mu^A = (\bar\nu\gamma_\mu i\gamma_5 e)$, together with similar definitions of the nuclear vector and 
axial vector currents. With this symbolism the beta-decay interaction takes 
the form 

$$\frac{G}{\sqrt{2}}\,(J_\mu^V + J_\mu^A)_{\text{leptons}}\,(J_\mu^V + J_\mu^A)_{\text{nucleons}} .$$


To discuss the decay of the muon we need only replace the proton spinor by a 
neutrino spinor and the neutron spinor by a muon spinor in the above coupling. 
I shall not go through the calculation of the spectrum and polarization 
properties of the decay on the basis of this proposed theory, but shall only comment 
on the results. This spectrum agrees very well with the experiments which I 
have seen. In fact the agreement between experiment and theory today is so 
detailed that one has to take into account the radiative corrections to the 
spectrum. The absolute value of the coupling constant $G$ can be determined 
from the rate of the decay of the muon, and the result is $GM^2 = 1.01 \times 10^{-5}$. 
(I use natural units in which $\hbar = c = 1$.) 
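
The quoted size of the coupling constant can be recovered from the muon lifetime. The sketch below uses the standard lowest-order formula Rate $= G^2 m_\mu^5/192\pi^3$, which is not written out in these lectures and is brought in here as an assumption, takes $M$ to be the proton mass, ignores radiative corrections, and uses modern values for the constants. The function name is hypothetical.

```python
import math

HBAR_MEV_S = 6.582e-22     # hbar in MeV * s (modern value, assumed)
MUON_MASS_MEV = 105.66     # modern value, assumed
MUON_LIFETIME_S = 2.197e-6 # modern value, assumed
PROTON_MASS_MEV = 938.27   # modern value, assumed

def fermi_constant_from_muon():
    """G in MeV^-2 from Rate = G^2 m_mu^5 / (192 pi^3), lowest order only."""
    rate_mev = HBAR_MEV_S / MUON_LIFETIME_S  # decay rate expressed as an energy
    return math.sqrt(192 * math.pi ** 3 * rate_mev / MUON_MASS_MEV ** 5)

g = fermi_constant_from_muon()
g_mp2 = g * PROTON_MASS_MEV ** 2  # dimensionless combination G * M^2
print(f"G M^2 = {g_mp2:.3e}")
```

Multiplying by the square of the proton mass makes the constant dimensionless, which is why the result is usually quoted as $GM^2$.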


Strong interaction modification of weak decay matrix 
elements, and the conserved vector current theory 


Let us return to the beta decay of the neutron. In this case we 
have two kinds of currents, vector and axial vector, which couple to the 


leptons. In beta decay the nucleons are non-relativistic and we can simplify 
the calculation of matrix elements. The four-dimensional vector current for 
a non-relativistic particle is dominated by its time component, which is just 
the operator 1, while the axial vector current is dominated by its space-like 
terms, which are just the spin operators. The vector term does not change 
the spin of the nucleon and is called the Fermi coupling, while the axial 
vector term, called Gamow-Teller coupling, flips the spin of the nucleon. 
Because each type of coupling leads to distinctive selection rules it was 
known quite early that both types of coupling are present. In the muon decay 
the ratio of the axial-vector term to the vector term is unity, but from 
experiments on the disintegration of the neutron and O¹⁴ it was shown that 
the ratio in nuclear-beta decay differs substantially from unity. If we 


calculate the properties of neutron decay using the coupling 

$$[\bar\nu\gamma_\mu(1+i\gamma_5)e]\,[\bar n(G_V\gamma_\mu - iG_A\gamma_\mu\gamma_5)p],$$


we come to the conclusion that for polarized neutrons, if $G_A = -G_V$, the 
electron is emitted isotropically, but the neutrino is much more likely to 
come out along the direction of the neutron spin than opposite. The recoil 
proton is also anisotropic, but the electron is not. Actually the electron 
is slightly unsymmetric in its emission direction. From measurements on the 
decay of polarized neutrons one can then find that the ratio of the coupling 
constants is negative and approximately 1.2. What a destruction of the 
beautifully simple theory! $(1 + i\gamma_5)/2$ is very pretty because its square 
equals itself, and it projects out a certain helicity component, but 
$(1 + 1.2\,i\gamma_5)/2$ is dirty. However, this situation is really not unexpected, 
because the nucleons are fairly complicated particles due to strong 
interactions about which we know little. Indeed, in the sense of perturbation 
theory, it would be expected that a fundamental simple type of coupling 
would no longer be simple. In other words, it is still possible that 
in the deep heart of matter the coupling formula for the nucleons involves 
the helicity projection operator, but that the spinors which entered 
do not represent the real proton and neutron but rather some kind of 
idealized p and n. In calculating matrix elements for the real nucleons, 
corrections would then have to be made for pions and other strongly interacting 
particles which would renormalize the relative coefficient of the vector 
and axial vector current. 


Strong interactions in fact would be expected to renormalize not 
only the axial current but also the vector current. After some mental effort 
one can see, however, that it is quite possible that the vector current need 
not be renormalized at all, as experimental clues hinted, when it was found 
that the muon coupling constant was the same as the Fermi coupling constant 
to within a few per cent. The extraction of the Fermi constant from the rate 
of O¹⁴ beta decay involves an additional assumption about nuclear forces 
unless the vector current for the weak interactions is not renormalized. In 
the decay the parent nucleus contains eight protons and six neutrons, while 
the daughter nucleus contains seven protons and seven neutrons. Now, there 
is a state in N¹⁴ which is the same as the ground state in O¹⁴ in the sense 
of isotopic spin. The nuclear forces are independent of whether the nucleons 
are protons or neutrons, and therefore in every system of seven protons and 
seven neutrons there is a state which has essentially the same character as 
any state that exists for eight protons and six neutrons. (The inverse is 
not true because of the exclusion principle.) The sister state in N¹⁴ of the 
ground state of O¹⁴, which has isotopic spin 1, lies lower in energy than that 
state, and so the oxygen nucleus does decay to its sister state. Since the 
kinematic features of the wave functions of the initial and final states are the 
same, the matrix element of the integral of the beta-decay charge density 
involves simply an isotopic spin Clebsch-Gordan coefficient, which is just $\sqrt{2}$. 
The calculation is not exact because the violation of isotopic spin invariance 
by electromagnetism destroys the perfect overlap of the wave functions, but 
this effect is at most a half per cent. 


This matrix element is almost the sole example of an accurate calcula- 
tion in nuclear physics, because one does not have to know any nuclear physics 
to calculate it. This trick is possible only at very low momentum transfer 


where the integral of the fourth component of the charge density becomes the 
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operator for the total charge. Since the isotopic spin is conserved by the 


strong interactions we have therefore no renormalization of such matrix elements. 


The assumption that the vector part of the beta-decay current is a 
component of the isotopic-spin current implies also that two pions must have 
a well-defined weak coupling to two leptons. It is quite easy to see how 
strong the coupling is by using our electricity analogue. Since the isotopic 
spin of the pion is the same as that for O¹⁴, namely one, we again get a factor 
of $\sqrt{2}$ for the matrix element. Similarly, there will be matrix elements of the 
currents between two kaons, two $\Sigma$ hyperons, and so on. We propose that this, 
in fact, is the way the world works and, consequently, that the vector part of 
the beta-decay coupling which conserves hypercharge is not renormalized. 


Electromagnetic corrections to weak 
interaction matrix elements 


When the coupling constants for the Fermi beta decay and that for 
the muon beta decay are determined in the manner indicated, they turn out not 
to be equal as we had expected, but to differ by a few per cent. To be more 
accurate, the vector coupling constant determined from the rate of decay of O¹⁴ 
is 0.985 times the muon beta-decay constant, whereas that determined from the 
decay of Al²⁶ is 0.975. In doing calculations we have a problem with certain 
relativistic electromagnetic corrections. We are trying to investigate a 
difference of a few per cent and the order of magnitude of electromagnetic effects 
is the same. Because the agreement between the two coupling constants is so 
close, one is reluctant to guess that the idea of equality is wrong. One 
would prefer to speculate that the discrepancy may be due to something else, such 
as a misunderstanding in the calculation of the electromagnetic corrections. 
It is therefore worth while to discuss the question of how accurately we know 
such corrections. In computing them for the decay of the muon we have no 
problem; the electromagnetic structure of the muon is well known and it has no 
other anomalous moments. The calculation goes through straightforwardly. 
However, in calculating proton corrections to the disintegration of the neutron 
we get into difficulties because the integrals diverge. The divergence is due 


to the anomalous magnetic moment of the nucleons. Now, when we have to compute 
such integrals, frankly we do not get completely satisfactory answers; depending 
on where one puts the cut-off one might at first be able to account for any 
discrepancy. However, in spite of this apparent uncertainty, it turns out that 
the answer is not very sensitive to the cut-off. In fact, the analysis is 


not uncertain by more than about + ‘4% per cent. Even though the electromagnetic 


effects are larger than this, the biggest part of them can be understood without 
serious ambiguity. 


The uncertainty in the radiative correction is expected to be quite 
small because it is really a correction to a correction. Let me explain the 
origin of the major part of the correction by referring to the oxygen decay. 


In this case we would have the following types of diagrams 


In these diagrams we see that the virtual photon interacts both with an object 


of charge 7 and one of charge 8. The usual procedure would be to use a Coulomb 


wave function for the outgoing positrons (that is, for a field of a nucleus with 
charge 7). This results in a well-known correction to the f value. But the 
second diagram is not included correctly in this procedure. The approximation 
actually includes the following diagram 
in which, outside the nucleus, an electron-positron pair is created by the 
virtual photon which interacts with a charge of seven units. These two 
diagrams A and B have to be added, and are comparable in strength for a 
relativistic positron. In fact, $f$ accounts for the sum of A and C. The 
error is the difference of diagram B and C, which is a diagram like C but with 
charge 1 on the nucleus. When this is calculated, disregarding recoil of the 
nucleus, one obtains $(e^2/2\pi)\ln(\Lambda/E)$, where $E$ is the positron energy and $\Lambda$ 
is $\infty$. Actually, the result should be modified for recoil and magnetic-moment 
corrections etc., the result being to replace $\Lambda$ by some unknown of the 
order $M$. If I stop the integral at about $M$ I will then have to add an 
unknown amount to account for the high energy piece of the integral. It is 
this unknown amount that is estimated to be of the order of % per cent. 

From my experience with such calculations I do not believe that this correction 
could be significantly larger than % per cent. The point is that the major 
part of the radiative correction is well known (the log is large because the 
lower limit $E$ is so much smaller than $M$), and that the remaining part is very 
unlikely to account for the discrepancy between the nuclear and muon weak 
decay constants, even though the calculation involves some uncertainty. 


Can the radiative correction calculation ever be made more precisely? 
I believe that it can and would like to propose a programme for theorists. 
What is really needed is an evaluation of the accuracy of the approximation of 
keeping only a one-particle intermediate state in the calculation of matrix 
elements for the product of two currents. Many theorists profess to believe 
that such an approximation is quite good, but they really lack any reason to 
support their stand. It is not possible to compute the degree of validity of 
this approximation, but if we analyse suitable experiments, we ought to be able 
to obtain a very good answer to our problem. One should conduct a careful 
investigation of high-energy Compton scattering off a proton, and of the hyperfine 
structure of hydrogen. The amplitudes involved in these phenomena are 
matrix elements of the square of the electromagnetic current. Now Platzman 
and Iddings have computed the hydrogen hyperfine splitting, keeping only the 
proton intermediate state and using the Hofstadter form factors at each vertex, 
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and their result disagrees with experiment. However, their work is valuable 
because it provides an evaluation of the accuracy of this type of approximation, 
and it is on this basis that I believe our estimate of the radiative correction 
to nuclear beta decay is good to % per cent. It would be useful to have 
another check on this approximation by studying Compton scattering, and it is 
likely that such a study will provide additional substantiation for our estimate 


of the % per cent accuracy. 


Strangeness changing weak decays 2nd LECTURE 


The beta decay of nucleons and the decay of the muon are only two 
examples of the weak decays. There are many more and it is now our task to 
outline the possible terms which are required in the weak interaction coupling 
in order that all types of decays may be accounted for, at least qualitatively. 
To do this let us use a more convenient symbolism. Let us abbreviate the 
V-A current term $[\bar A\gamma_\mu(1 + i\gamma_5)B]$ by $(\bar A B)$. In this language, to describe 
nuclear beta decay, we need a coupling term of the form $(\bar\nu e)(\bar n p)$, together 
with its Hermitian conjugate. In our discussion of neutron decay we have seen 
that such a form is correct only qualitatively. Indications are that the 
vector current is not renormalized and, consequently, that additional terms 
for other strongly interacting particles such as pions, K's, etc. must be present. 


We have speculated that the vector beta-decay current which is 
strangeness preserving is a suitable component of the isotopic spin current, 
which is conserved. For that reason the theory we have discussed is called 
the conserved vector current (CVC) theory. (It is well to remind ourselves 
that our main requirement was that the vector coupling constant should not be 
modified by the strong interactions. An easy way to guarantee this is to 
propose that the vector current is conserved, but that may not be the only way.) 
The axial vector coupling constant differs from its ideal value [1.26 instead 
of 1 *)]. Some people consider this a small deviation so that they are tempted 


*) Bernardini has told me that Miss Wu now finds the ratio of the axial vector 
to the vector nuclear beta-decay coupling constants to be 1.16. 
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to supply some reason for such a relatively small renormalization. Many 
attempts have been made to construct a conserved axial current, but all have 
failed because of mass terms. However, it is by no means self-evident that we 
cannot set up the axial current so that its coupling is essentially unchanged 


by renormalization, even though such a current will not be conserved. 


The decay of the muon shows that the weak coupling must contain a 
term of the form $(\bar\nu e)(\bar\mu\nu_\mu)$, and today it is known definitely that $\nu_\mu$, which 
we might call the neutretto, is distinct from $\nu$, the neutrino. Since bound 
muons disappear faster than free ones, we also know that there must be a term 
like $(\bar\nu_\mu\mu)(\bar n p)$. By measuring the degree to which the muon lifetime is 
shortened in various nuclei, we can get a pretty fair idea of the strength of 
the muonic lepton coupling to nucleons. The experimental evidence allows one 
to assume that the strength of this coupling is the same as the electron 
beta-decay coupling strength, which is also the same as G to a few per cent. 


These three terms are sufficient to give rise to many other observed 
decays. For example, the modes $\pi \to \mu + \nu$ and $\pi \to e + \nu$ would be expected 
qualitatively because they can proceed through an intermediate $N\bar N$ state which 
is coupled to leptons.*) In fact, all weak decays which conserve hypercharge 
are predicted by the three types of weak coupling already considered. But to 
account for the observed strangeness changing decays we need to postulate at 
least three more types of weak coupling. Consider the following three decays 
of the K meson: $K \to \mu + \nu$, $K \to \pi + e + \nu$, and $K \to \pi + \pi$. We see that we need 
a strangeness changing term, which, without prejudice as to its true nature, 
we shall abbreviate by $(\bar\Lambda p)$, coupled to both types of the leptons, and to 
strongly interacting particles with zero strangeness: $(\bar\nu_\mu\mu)(\bar\Lambda p)$, $(\bar\nu e)(\bar\Lambda p)$, 
and $(\bar p n)(\bar\Lambda p)$. These three terms are sufficient because all the strangeness 
changing weak decays, about which there is no dispute experimentally, obey the 
selection rule that the change in strangeness equals the change in the charge, 
$\Delta S/\Delta Q = +1$. Now one would guess that the strength of the last three types of 


*) For example, $\pi^+$ goes virtually to p and antineutron via strong interactions. 
The proton then goes to n, $e^+$ and $\nu$, the neutron and antineutron annihilating. 




weak coupling might be the same as the first three which conserve hypercharge, 
but it turns out that this is not the case. The strangeness changing decay 
rates are weaker by an order of magnitude from what would be expected if the 


strength G were universal. 
The current-current theory 


These six terms are somewhat messy and the question is whether things 
can be organized in a more pleasing way. One idea that pops into view is to 
combine the four currents into one grand weak interaction current 

$$J_\mu = (\bar\nu e) + (\bar\nu_\mu\mu) + a(\bar p n) + b(\bar p\Lambda),$$

with the coefficients, a and b, to be determined by some symmetry principles, 
and to suggest that the weak coupling is simply a current-current interaction 
$(1/\sqrt{2})\,G\,J_\mu J_\mu^{\dagger}$. In the cross-products one finds the six types of terms that are 
required experimentally. Such a proposal automatically eliminates neutral 
lepton currents, for which there is no evidence experimentally. One may ask 
why we write charged currents. The answer is that if we rewrite for example 
$(\bar\nu e)(\bar\mu\nu_\mu)$ as $-(\bar\nu\nu_\mu)(\bar\mu e)$, and then pursue the idea of a current-current coupling, 
the decay $K^0 \to \mu + e$ would be predicted. Since such a decay is definitely not 
seen, we feel that the charged current hypothesis is much to be preferred. 


The current-current interaction not only leads to the six desired 
cross terms from the four basic types of current, but also predicts four new 
types of parity non-conserving interactions, which are the diagonal terms in 
the product. Three of them will be extremely difficult to observe; for 
example, the diagonal term $(\bar\nu e)(\bar e\nu)$ leads to a direct parity non-conserving 
scattering cross-section between neutrinos and electrons, but since the 
cross-section is so tiny and neutrinos are so hard to detect, the existence of such a 
term has not yet been verified experimentally. However, experiments in nuclear 
physics can be designed which are extremely sensitive to the existence of a 
parity-violating contribution to the nuclear forces. Recently, Felix Boehm 
at Caltech has measured the circular polarization of the gamma rays produced 
in a partially forbidden M1 transition in a heavy ellipsoidal nucleus. This 
polarization can be non-zero only through an admixture of the parity forbidden 
E1 matrix element. Boehm found good evidence for a small parity violation in 
the nuclear forces. Such an effect is predicted by the $(\bar n p)(\bar p n)$ diagonal 
term in the current-current theory. Because of strong renormalizations which 
are not calculable, it is impossible to state whether or not the size and sign 
of the experimental effect agree with the prediction of the current-current 
theory of weak interactions. The order of magnitude of the effect, nevertheless, 
is correct, and hence this does constitute a qualitative verification of the 
hypothesis. 


If the weak interaction does have the current-current form, an appealing
theoretical possibility is that a new vector meson exists, which mediates
the interaction in the same way that the photon mediates the interaction between
two charge currents. Such a new field will give rise through the following
type of diagram


to a weak interaction between two currents of the form

$4\pi e_W^2\,\bar J_\alpha\,(\delta_{\alpha\beta} - q_\alpha q_\beta/m_W^2)\,J_\beta\,/\,(q^2 - m_W^2)$.

In this formula, $m_W$ is the mass of the intermediate boson, and $e_W$ is its
coupling to the weak interaction current. For $q \ll m_W$ the interaction reduces
to our current-current form once we identify $4\pi e_W^2/m_W^2$ with $G/\sqrt{2}$.


The charged vector meson theory, like the four-fermion point interaction
theory, is not renormalizable, but this never bothers me. All of our
theories are wrong at high energies, and renormalizability only refers to
whether this universal disease can be swept under the rug for the analysis of
low-energy phenomena. I personally have felt no particular favour for
renormalizable theories, and will just as well accept one that is unrenormalizable
as one that is renormalizable. It has never been proved that renormalizable
theories are superior because they are consistent; indeed, it seems to
me likely that renormalizable theories suffer from ghost difficulties at high
energies. I make a point of this because in my later analysis I shall continue
to discuss non-renormalizable theories, thus opposing the conventional
practice, and shall make no apology for doing so.


From the size of the weak-boson coupling constant $e_W$, one can
estimate the production rate of these particles. Once threshold energy has
been passed sufficiently, they should be produced copiously enough to be seen
readily if they exist at all.


Pion decay 


Let us now turn to the question of the extension of the idea of
non-renormalization from the vector current to the axial current. At the same
time we will study the treatment of another prominent non-strangeness changing
decay, the decay of the pion. To clarify the ideas involved, we stick to a
model of the universe in which nucleons and pions are the only strongly
interacting particles. As already explained, one expects the pion to decay into
leptons just because of the existence of the following type of diagram:


In the coupling of the pion to the nucleon loop one usually sees the
Dirac matrix $\gamma_5$. However, we shall come back later and discuss an alternate
form for the vertex, which I prefer. The nucleon loop is coupled to leptons
by both a $\gamma_\mu$ matrix and a $\gamma_\mu\gamma_5$ matrix. For this problem only
the $\gamma_\mu\gamma_5$ term contributes, since the pion is a pseudoscalar particle.
This loop cannot be calculated because it is divergent. Even if it were not, one
would not believe the answer because pion corrections to it would be very important.
However, the form it gives for the matrix element

$\sqrt{8\pi}\,F_\pi\,q_\mu\,(\bar\mu\gamma_\mu a\nu)$,

where q is the momentum of the pion, is the only possible invariant form for
the amplitude. The number $F_\pi$ is the result of all possible diagrams. We
do not know it nor do we know how to calculate it.


Given this form for the amplitude we can calculate the decay rate
using the Golden Rule given in the first lecture. It turns out to be

$\frac{1}{\tau} = \frac{1}{2}\,F_\pi^2\,m_\mu^2\,m_\pi\left(1 - \frac{m_\mu^2}{m_\pi^2}\right)^2$.

Using the known decay rate of the charged pion, one finds that

$F_\pi m_\pi = 5.95\times 10^{-8}$.

Incidentally, the same mathematics hold for the decay of kaons into leptons,
in which case $F_K m_K = 5.55\times 10^{-8}$.
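
The arithmetic behind these numbers can be sketched as follows. This is only a rough check, assuming the rate has the form $1/\tau = \tfrac12 F^2 m_\mu^2 m (1 - m_\mu^2/m^2)^2$ (the factor ½ depends on the normalization of the lepton projection a and is an assumption here). The masses, lifetimes and the kaon branching fraction are approximate present-day values, not figures taken from the lecture.

```python
import math

HBAR_MEV_S = 6.582e-22  # hbar in MeV * s (approximate modern value)

def f_times_mass(lifetime_s, branching, m_meson, m_lepton):
    """Dimensionless F * m_meson from the assumed rate formula
    rate = (1/2) F^2 m_lepton^2 m_meson (1 - m_lepton^2 / m_meson^2)^2."""
    rate = branching * HBAR_MEV_S / lifetime_s            # partial width in MeV
    phase = (1.0 - (m_lepton / m_meson) ** 2) ** 2
    f_squared = 2.0 * rate / (m_lepton ** 2 * m_meson * phase)  # MeV^-2
    return math.sqrt(f_squared) * m_meson

# Approximate inputs in MeV and seconds (assumptions, not from the lecture).
M_PI, M_K, M_MU = 139.57, 493.68, 105.66
f_pi = f_times_mass(2.60e-8, 1.0, M_PI, M_MU)
f_k = f_times_mass(1.24e-8, 0.635, M_K, M_MU)

print(f"F_pi * m_pi = {f_pi:.2e}")
print(f"F_K  * m_K  = {f_k:.2e}")
```

Both products come out at a few times $10^{-8}$, which is the order of magnitude quoted in the text; the exact digits depend on the lifetimes and branching fraction used.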


To describe the decay $\pi \to e + \nu$, the amplitude would have the same
invariant structure. You will note that if the coupling constants are of the
same order for both $(e\nu)$ and $(\mu\nu)$ the rate for decay into $(e\nu)$ is severely
inhibited. The inhibition is due to the fact that the charged lepton must
come out with its spin aligned along its direction of motion. This is very
difficult for electrons since their velocity is quite high and, in the weak
interactions, particles prefer to be polarized in the opposite sense. The
probability of right-hand polarization goes as $1 - v/c$. The muons, on the
other hand, are not highly relativistic and, consequently, not so greatly
inhibited. Theoretically one finds the ratio:

$\frac{\Gamma(\pi \to e + \nu)}{\Gamma(\pi \to \mu + \nu)} = 1.36\times 10^{-4}$,

where this theoretical estimate includes some radiative corrections.
The theoretical estimate for the ratio of the decay rates assumes
that the coupling of $(e\nu)$ and $(\mu\nu)$ is the same for weak interactions.
Experiment agrees with that result to within two per cent; that means that within
one per cent the electron and muon couple in the same way in the weak
interactions. The pion decay experiment thus provides the best check of the
hypothesis that the strength of the muon coupling is the same as the electron
coupling in the weak interactions. This is a much better result than is
obtained from studying $\mu$ capture in complicated nuclei. Note that this check
is entirely independent of the value of $F_\pi$, which is not calculable.
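
The size of the inhibition can be estimated with a few lines of arithmetic. The sketch below assumes equal couplings and a rate proportional to $m_\ell^2 (1 - m_\ell^2/m_\pi^2)^2$ for each charged lepton, and leaves out radiative corrections entirely. The masses are approximate present-day values, not numbers from the lecture.

```python
def helicity_suppressed_ratio(m_e, m_mu, m_pi):
    """Ratio of pi -> e nu to pi -> mu nu for equal couplings,
    without radiative corrections."""
    mass_factor = (m_e / m_mu) ** 2
    phase_factor = ((m_pi ** 2 - m_e ** 2) / (m_pi ** 2 - m_mu ** 2)) ** 2
    return mass_factor * phase_factor

# Approximate masses in MeV (assumptions).
ratio = helicity_suppressed_ratio(0.511, 105.66, 139.57)
print(f"uncorrected ratio = {ratio:.3e}")
```

The unknown constant $F_\pi$ cancels in the ratio, which is why the comparison with experiment tests only the equality of the electron and muon couplings.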


Is there some theoretical argument for determining the pion decay
amplitude $F_\pi$? It is a delightful problem because it is apparently hopeless.
However, there appeared a paper by Goldberger and Treiman, who found a formula
for $F_\pi$. The argument given by Goldberger and Treiman is inadequate since they
ignored terms of the same order as the ones that were kept. Similar formulae
had also been discovered before by several other people who had not pursued
them because they did not agree well with experiment. Goldberger and Treiman's
contribution was to put renormalized constants into the formula, thus obtaining
much better agreement with experiment.


In this lecture I should like to describe my first approach to a
derivation of the Goldberger-Treiman relation. In the next lecture I shall
discuss Gell-Mann's refinement of this approach.


To understand the ideas involved in this derivation, let us recall
our attempt to construct the vector beta-decay current in such a way that it
was not modified by renormalization. In our model universe of pions and
nucleons one can show that if the coefficient $\lambda$ of the pion current, in the
combination $(\bar p\gamma_\mu n) + \lambda(\pi^-\partial_\mu\pi^0 - \pi^0\partial_\mu\pi^-)$,
is suitably chosen, the renormalized
vector beta-decay coupling constant is the same as the bare vector-coupling
constant. An obvious question is whether or not one can write the axial vector
current in such a way that there is no renormalization. The answer to that,
generally speaking, is that we cannot. Furthermore, there is some renormalization
effect. Nevertheless, because the renormalization is small, one is
tempted to construct the axial vector current in such a way that only a small
renormalization would result. To study this question let us write a possible
Lagrangian for our model universe

$L = \tfrac12[(\partial_\mu\phi)^2 - m_0^2\phi^2] + \bar\psi\,i\gamma_\mu\partial_\mu\psi - \bar\psi M_0\psi$

$\quad + a_0\,\bar\psi\gamma_\mu\gamma_5(\partial_\mu\phi)\psi + e_W\,\bar\psi\gamma_\mu(\gamma_5 + 1)\psi\,W_\mu$.


For the purpose of this lecture we neglect complications due to isotopic spin.
The essential point is to get an idea, with the aid of a single model, which
might be true in a more realistic description of nature. The first three terms
in the above Lagrangian are those characterizing free meson and nucleon fields.
The $a_0$ term describes the coupling of the nucleon field to the pseudoscalar
mesons. The final term describes the coupling of the nucleon fields to
intermediate vector bosons. It may be noted that in the pseudoscalar meson-nucleon
coupling I have used a gradient form, which is called pseudovector coupling.
For single pion interactions at low pion momenta, this form is equivalent to
the conventional pseudoscalar coupling $\gamma_5\phi$, providing that one identifies the
coefficient $2Ma_0$ with the conventional coupling constant $g_0$. For two pion
interactions there is a great distinction between pseudovector and pseudoscalar
coupling. The absence of low-energy s wave scattering is compatible with the
prediction of pseudovector coupling and very difficult to explain with
pseudoscalar coupling. For this and other reasons I prefer the pseudovector form,
and I do not care in the least that the pseudovector form belongs to the class
of interactions called unrenormalizable. It is very difficult to check which
of these couplings is more correct because calculations cannot be made for strong
interactions, except possibly by noting the following fact. If, in any amplitude,
we can extrapolate the pion four-momentum off the mass shell down to the
point where it is zero (zero momentum and zero energy), then that amplitude
should vanish in the case of pseudovector coupling. This is an interesting
principle which should aid in developing good trial formulae for pion
interactions, but it has not been used very much up to now.

In electricity we have the coupling of charged particles to the
electromagnetic field $A_\mu$ in such a way that if one adds to $A_\mu$ a pure gradient

$A_\mu \to A_\mu + \partial_\mu\chi$

it makes no difference. Let us pursue the idea of constructing the coupling
for the weak interaction such that if we add a pure gradient $\partial_\mu\chi$ to the
vector boson field $W_\mu$ then there is also no effect. For the vector part of the
coupling there is no change if we have the W meson coupled to a current which
is conserved, as we have already assumed. For the axial vector term the
addition of the gradient leads to an extra term in the Lagrangian of the form

$e_W[\bar\psi\gamma_\mu\gamma_5\psi\,\partial_\mu\chi]$.


Can we cancel this? It turns out that we can, partially, if we use the
pseudovector coupling. Suppose that when we change the W field by adding the
gradient, we also change the pion field by

$\phi \to \phi - (e_W/a_0)\,\chi$.

In that case the $a_0[\bar\psi\gamma_\mu\gamma_5\psi\,\partial_\mu\phi]$ term will be modified
in a way that exactly compensates the change in the weak interaction Lagrangian.
The compensation of the full Lagrangian is not exact because of the mass term in
the pion field, and it is impossible to produce an exact compensation. However,
having lived under the influence of Gell-Mann, who likes to suggest that if all
the masses were zero there would be much greater symmetry in the Lagrangian, I
shall disregard the mass term. Thus, neglecting the mass squared term in the
free-pion field Lagrangian, the change is $-(e_W/a_0)\,\partial_\mu\phi\,\partial_\mu\chi$
to the first order and this is compensated if we add a term
$(e_W/a_0)\,\partial_\mu\phi\,W_\mu$ to the Lagrangian. It is in this way that we obtain
a definite prescription for the direct coupling of the pion to the weak-vector
boson, or equivalently to lepton currents. Combining terms, it is seen that
the axial vector current which is coupled to the W field is
$e_W(\bar\psi\gamma_\mu\gamma_5\psi + (1/a_0)\,\partial_\mu\phi)$.

The result of these considerations is to predict that there is a
direct bare coupling of pions to leptons with a strength

$F_\pi(\text{theoretical}) = \frac{G}{\sqrt{8\pi}\,a_0} = \frac{MG}{\sqrt{2\pi}\,g_0}$.

This would be the famous Goldberger-Treiman relation if G and $g_0$ were renormalized.
G and $g_0$, which are coupling constants before renormalization, are unknown.

Thus the problem is: what can be done with this formula? If, without
justification, at this point one inserts for G the axial vector coupling constant for
nuclear beta decay and for $g_0$ the experimental pion-nucleon coupling constant,
then one finds for $F_\pi$ (theoretical) the value $5.5\times 10^{-8}$. This value is not
very different from the experimental result ($5.95\times 10^{-8}$). The difference may
have something to do with inaccurate analysis of the effects of renormalization.
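
A rough numerical version of this substitution is sketched below. Every input is an assumption rather than a figure from the lecture: the vector constant is taken as $G M^2 \approx 1.02\times10^{-5}$, the axial constant as 1.2 times the vector one, and the pion-nucleon constant as $g^2/4\pi \approx 14.6$. The formula used is $F = MG_A/(\sqrt{2\pi}\,g)$, quoted in units of $1/m_\pi$.

```python
import math

M_NUCLEON = 938.27  # MeV, approximate
M_PION = 139.57     # MeV, approximate

def goldberger_treiman_f(g_axial, g_pi_nucleon, m_nucleon):
    """F = M * G_A / (sqrt(2 pi) * g), in MeV^-1 when G_A is in MeV^-2."""
    return m_nucleon * g_axial / (math.sqrt(2.0 * math.pi) * g_pi_nucleon)

# Assumed experimental inputs (not taken from the lecture).
g_vector = 1.02e-5 / M_NUCLEON ** 2             # MeV^-2
g_axial = 1.2 * g_vector                        # axial/vector ratio assumed to be 1.2
g_pi_nucleon = math.sqrt(4.0 * math.pi * 14.6)  # from g^2 / 4 pi = 14.6

f_theory = goldberger_treiman_f(g_axial, g_pi_nucleon, M_NUCLEON) * M_PION
print(f"F(theoretical) * m_pi = {f_theory:.2e}")
```

With these inputs the result lands within roughly ten per cent of the value obtained from the pion lifetime; the exact figure moves with the assumed axial coupling and pion-nucleon constant.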


We now have to discuss why it is legitimate to replace these bare-coupling
constants in the Goldberger-Treiman relation by their renormalized
experimental values. To do this we shall have to inquire as to how higher
order diagrams with many pions lead to a renormalization of the coupling constants.
There is a clever way to do this, by which one can prove the theorem in a few
lines. However, since the ideas of renormalization are not self-evident to
everyone here, we shall study a more detailed treatment.


We begin by defining some terms. First, the unrenormalized axial
vector beta-decay coupling to nucleons, which is
$G_{A(\text{bare})}\,\bar n\gamma_\mu\gamma_5 p$ (where we have
assumed that $G_{A(\text{bare})} = -G_{V(\text{bare})} = -G_V$), and which is represented by


is modified by pion corrections, for example


so that the resultant effective beta-decay coupling is $G_A\,\bar n\gamma_\mu\gamma_5 p$.
The effective coupling constant is $G_A$ only in those cases where the leptons
effectively carry off zero momentum. When the momentum transfer becomes large, the
strength of the coupling will be modified. We may indicate this by a function
$G_A(q^2)$; $G_A(0) \equiv G_A$. Theoreticians call the ratio of the bare-coupling
constant to the experimental one a renormalization constant $Z_A$. That is,
$G_A/G_{A(\text{bare})} = 1/Z_A$.


In our subsequent considerations we shall encounter the following
nucleon loop


together with all the pion correction diagrams, e.g.


The sum of all these loop diagrams must be a vector. Since the only vector
around is $q_\mu$, the momentum transfer at each of the two vertices, then the sum
must be represented mathematically by $q_\mu K(q^2)$. $K(q^2)$ does not blow up as
$q^2 \to 0$. One would expect that $K(q^2)$ is slowly varying over momentum transfers
comparable to the pion mass, the main variation only occurring for q of order of the
nucleon mass. Although it is true that there are intermediate states of mass as low
as $3m_\pi$, these do not come in strongly in the pseudovector coupling model. For,
if each of the three pion momenta are of order $m_\pi$, the coupling is weak
($f^2 = 0.08$). Strong effective couplings come only for higher values of q, and hence
not close to the pole with three intermediate pions. Finally, if we consider
the sum of all diagrams of the form


the answer is $q^2 K(q^2)$. The blob stands for the sum of all intermediate states
except a single pion.


Let us now go on to consider the renormalization of the pion exchange
diagram. The simplest diagram giving rise to a nuclear force is


At each vertex the pion couples through $a_0\,\not{q}\,\gamma_5$, where q is the momentum
transfer. There will be corrections to this exchange diagram of two types. First, each
vertex will be modified by virtual pions, of which a typical diagram is


Secondly, the propagation of the pion will not be simply that of a single bare-pion
line, but will be a sum of bare-pion lines interrupted by any number of generalized
nucleon loops.


The result of including all these corrections will be the experimental
pion-exchange contribution to the scattering amplitude, which is

$a_{\text{exp}}^2\;\not{q}\gamma_5\,\frac{1}{q^2 - m_\pi^2}\,\not{q}\gamma_5$.


Let us go through this more slowly for the case of pion-nucleon
scattering. The single bare-pion exchange diagram gives rise to an amplitude

$a_0^2\;\not{q}\gamma_5\,\frac{1}{q^2 - m_0^2}\,\not{q}\gamma_5$.

Suppose we consider still a single bare-pion exchange but correct the vertices
to all orders.


The mathematical contribution to the amplitude is

$\frac{a_0^2}{[Z_A(q^2)]^2}\;\not{q}\gamma_5\,\frac{1}{q^2 - m_0^2}\,\not{q}\gamma_5$,

since, in the discussion of the axial vector beta decay we defined $[Z_A(q^2)]^{-1}$
to be the sum of all vertex corrections for axial vector coupling. Next,
if we consider the sum of all diagrams with a single generalized nucleon loop
breaking the pion exchange:


we get

$\frac{a_0^2}{[Z_A(q^2)]^2}\;\not{q}\gamma_5\,\frac{1}{q^2 - m_0^2}\;a_0^2 q^2 K(q^2)\;\frac{1}{q^2 - m_0^2}\,\not{q}\gamma_5$.

If we add another generalized nucleon loop to the pion propagator, we multiply
the preceding amplitude by the factor

$a_0^2 q^2 K(q^2)\,\frac{1}{q^2 - m_0^2}$.


Evidently, if we add up such diagrams with all numbers of nucleon loops in the
pion propagator, then we are just adding up a geometrical series and the final
answer is

$\frac{a_0^2}{[Z_A(q^2)]^2}\;\not{q}\gamma_5\,\frac{1}{q^2 - m_0^2 - a_0^2 q^2 K(q^2)}\,\not{q}\gamma_5$.
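
The summation of loops is an ordinary geometric series, and this can be checked numerically. The sketch below treats K as a constant and uses made-up numbers for $q^2$, $m_0^2$ and $a_0^2$ (all assumptions, chosen only so that the series converges). It compares the partial sums with the closed form $1/(q^2 - m_0^2 - a_0^2 q^2 K)$ and locates the shifted pole.

```python
def summed_propagator(q2, m0_sq, a0_sq, k_value, n_terms):
    """Bare propagator times the sum over 0, 1, 2, ... nucleon-loop insertions."""
    bare = 1.0 / (q2 - m0_sq)
    loop_factor = a0_sq * q2 * k_value * bare  # factor gained per extra loop
    return sum(bare * loop_factor ** n for n in range(n_terms))

def closed_form(q2, m0_sq, a0_sq, k_value):
    """Result of summing the geometric series exactly."""
    return 1.0 / (q2 - m0_sq - a0_sq * q2 * k_value)

def pole_mass_sq(m0_sq, a0_sq, k_value):
    """Position of the pole when K is taken to be constant."""
    return m0_sq / (1.0 - a0_sq * k_value)

# Illustrative numbers only (not physical values).
series = summed_propagator(2.0, 1.0, 0.3, 0.5, 200)
exact = closed_form(2.0, 1.0, 0.3, 0.5)
m_sq = pole_mass_sq(1.0, 0.3, 0.5)
print(series, exact, m_sq)
```

The pole of the summed propagator sits at a mass different from the bare one, which is the sense in which the loops renormalize the pion mass.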
We must now compare this result with the experimental pion-exchange
contribution by noting that the amplitude must have a pole at $q^2 = m_\pi^2$, where
$m_\pi$ is the experimental pion mass. We see that

$m_\pi^2 = m_0^2 + a_0^2 m_\pi^2 K(m_\pi^2)$  or  $\frac{m_\pi^2}{m_0^2} = \frac{1}{1 - a_0^2 K(m_\pi^2)}$.

Also by expanding the corrected propagator near $q^2 = m_\pi^2$, we find that

$[q^2 - m_0^2 - a_0^2 q^2 K(q^2)] = [1 - a_0^2 K(m_\pi^2) - a_0^2 m_\pi^2 K'(m_\pi^2)]\,(q^2 - m_\pi^2)$

so that the experimental pion-nucleon coupling constant is related to the bare
pion-nucleon coupling by

$a_{\text{exp}}^2 = \frac{a_0^2}{[Z_A(m_\pi^2)]^2\,[1 - a_0^2 K(m_\pi^2) - a_0^2 m_\pi^2 K'(m_\pi^2)]}$.

The theoreticians like to call the renormalization factor, 1/[ ], for the pion
$Z_3$.
bi 


So much for the discussion of the axial vector vertex and the pion 
propagator renormalization constants. We now turn to an evaluation of the pion 
decay rate in terms of the nuclear beta-decay axial vector coupling constant. 

To get all the renormalization constants straight, we will consider the pion 
decay under the assumption that the pion is produced off a nucleon. Thus, we 
shall consider the matrix element for a neutron to go into a proton, plus a 
lepton pair at a momentum transfer near the pion pole. Assuming that there is 
a direct pion-lepton weak coupling, the simplest diagram contributing to the 


nuclear beta decay which gives rise to a pole at the pion mass is 


The direct nucleon-lepton term gives no pole so we leave it out. 


Another contribution of the same type is


To these two diagrams we must add those diagrams with two bare-pion propagators


and three, and four and so on. Again the terms can be summed because they form
a geometric series. The result of all diagrams of this type is

$\frac{a_0}{Z_A(q^2)}\,(\bar p\,\not{q}\gamma_5\,n)\;\frac{1}{q^2 - m_0^2 - a_0^2 q^2 K(q^2)}\;\left[-\frac{G_{A(\text{bare})}}{a_0} + a_0 K(q^2)\,G_{A(\text{bare})}\right] q_\mu(\bar e\gamma_\mu a\nu)$.

In the square bracket the coefficient $-G_{A(\text{bare})}/a_0$ is the bare coupling of the
pion to the lepton field, which we found earlier. The term
$a_0 K(q^2)\,G_{A(\text{bare})}$ represents the coupling of a bare pion to the generalized
nucleon loop times the axial vector coupling of the nucleon loop to the lepton current.


Now this sum of diagrams must represent, near the pion pole, the 
dominant part of the nuclear beta-decay amplitude, which we can write down using 
the experimental coupling of pions to nucleons, and the experimental amplitude 


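for a pion to decay into leptons. That is,

$a_{\text{exp}}\,(\bar p\,\not{q}\gamma_5\,n)\,\frac{1}{q^2 - m_\pi^2}\,\sqrt{8\pi}\,F_\pi\,q_\mu(\bar e\gamma_\mu a\nu)$.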


Equating the two expressions near the pion pole, one finds

$F_\pi = -\frac{G_{A(\text{bare})}}{\sqrt{8\pi}\,a_{\text{exp}}\,Z_A(m_\pi^2)}\;\frac{1 - a_0^2 K(m_\pi^2)}{1 - a_0^2[K(m_\pi^2) + m_\pi^2 K'(m_\pi^2)]}$.

Recalling that the experimental nuclear beta-decay constant is
$G_A = G_{A(\text{bare})}/Z_A(0)$, we see that we get the Goldberger-Treiman relation in
terms of experimental coupling strength, except for a factor which is a ratio
of renormalization functions

$\frac{Z_A(0)}{Z_A(m_\pi^2)}\;\frac{1 - a_0^2 K(m_\pi^2)}{1 - a_0^2[K(m_\pi^2) + m_\pi^2 K'(m_\pi^2)]}$.

Since the functions $Z_A(q^2)$ and $K(q^2)$ presumably vary appreciably only for $q^2$
of order $M^2$, the above ratio should be quite close to one. One expects,
therefore, that the Goldberger-Treiman relationship should be quite good.
Experimentally this relationship holds to eight per cent.


I shall now describe a more sophisticated way to get the same result.
Our theory states that, in the limit that the pion mass goes to zero, if the
lepton current is replaced by a pure gradient, the axial vector amplitude
should then vanish. Let us now consider the beta decay of the nucleon. We
would ordinarily write down the amplitude as the sum of two terms. One of
them is the direct axial-vector coupling:

$G_A\,(\bar p\gamma_\mu\gamma_5 n)\,(\bar e\gamma_\mu a\nu)$.

The other is the coupling through the pion:

$a_{\text{exp}}\,(\bar p\,\not{q}\gamma_5\,n)\,\frac{1}{q^2 - m_\pi^2}\,\sqrt{8\pi}\,F_\pi\,q_\mu(\bar e\gamma_\mu a\nu)$.

In the limit $m_\pi \to 0$, the sum of these two must vanish when the lepton current
$(\bar e\gamma_\mu a\nu)$ is replaced by $q_\mu$, i.e.

$G_A + \sqrt{8\pi}\,F_\pi\,a_{\text{exp}} = 0$.

This is the Goldberger-Treiman relation! The reason we apparently do not
have a small correction is that we have skipped over a small point. If the
pion mass were zero then the axial-vector coupling constant $G_A$, and the
pion-nucleon coupling constant $a_{\text{exp}}$, would be different from their true values.
It is this small change which is represented by the ratio of the renormalization
constants that we encountered in the more detailed argument.


We can argue that the coupling strengths would not change materially
in the gradient coupling theory if the pion mass went to zero. This is
because pions with small momentum are effectively decoupled. Pions with
momenta of the order $m_\pi$ are coupled with a strength of only 0.08. Only when
the pion momentum gets as large as a nucleon mass does the coupling tend to
15. When the pions have a high momentum, the fact that they have a mass is
unimportant. Thus, the renormalization factors for the coupling due to pions
will be the same, to an excellent approximation, when the pions have zero mass
as when they have the physical mass. It is for this reason that I believe
that the Goldberger-Treiman relation should be accurate.


In working with the field theory model described in the previous
lecture we may notice a certain property of the axial vector current. The
divergence of this current is proportional to the bare-pion field operator.
In taking matrix elements, the axial vector current has to be renormalized in
order to get $G_A$ and the pion-field operator suffers a renormalization. These
renormalization factors are not too different from unity. When this observation
is analysed it results in the Goldberger-Treiman relation.


Recall that the matrix element of the axial current between two
nucleons consisted of two terms. One was a $\gamma_\mu\gamma_5$ type coupling and another
term, which arose via a virtual pion, was proportional to $q_\mu\gamma_5$. If we now
compute the divergence of the axial vector current with $m_\pi^2 \neq 0$, we find that
the contribution of the second term, the induced pseudoscalar term, goes to
zero as $q^2 \to 0$. Consider now the matrix element of the pion-field operator
between two nucleons. Since the source of the pion field is
$(ig/2M)(\bar n\,\not{q}\gamma_5\,p)$ and the source is equal to
$(\Box^2 - m_\pi^2)\phi$, we see that the matrix element of the
pion-field operator is equal to

$\frac{ig}{2M}\,\frac{(\bar n\,\not{q}\gamma_5\,p)}{q^2 - m_\pi^2}$.

Equating this expression with the matrix element of the divergence of the axial
vector current we get again the Goldberger-Treiman relation.


Gell-Mann abstracted this result by assuming only that the divergence
of the axial vector current was proportional to the pion field,
$\partial_\mu A_\mu = K\phi$, without getting
the constant of proportionality, K, from my specific theory. He then calculated
the divergence of the nuclear-axial vector matrix element in terms of K,
and the decay rate of the pion in terms of K. If one considers the ratio of
these two amplitudes the factors of K drop out and one gets the
Goldberger-Treiman relation.


More recently Gell-Mann has refined the argument by dropping all 


references to field operators, but I shall not give that argument. 


3rd LECTURE


Introduction to SU3


It is time, in our study of the weak interactions, to consider those
decays which violate the conservation of hypercharge. Some understanding of
these has been obtained with the help of SU3 symmetry considerations and we
therefore turn now to a short discussion of SU3.


Let me begin by reviewing isotopic spin or SU2. Suppose we have
two particles with identical dynamical properties as far as a large group of
interactions is concerned, e.g. the strong interactions, but which differ when
another type of interaction, e.g. electromagnetic, is considered. Let them
be a neutral particle $A = A^0$, and a negatively charged particle $B = B^-$, and
let us assign to each of them an eigenvalue for the z component of isotopic
spin: $+\tfrac12$ for A, and $-\tfrac12$ for B. The particles are assumed to have
corresponding antiparticles $\bar A$ and $\bar B$, the quantum numbers of which are the
opposite to those of the corresponding particle. By considering the states of two
objects, one can generate states with isotopic spin 1 and 0, from these isodoublets.
For example, if we consider new basis states A and B which are related to $A'$
and $B'$ by unitary transformations, then the state of an antiparticle and a
particle, which is unchanged by the transformation, is
$(\bar AA + \bar BB)/\sqrt2$. That state is therefore an isosinglet; i.e. it has
I = 0. Under the unitary transformations, the three other orthonormal states
$\bar AB$, $(\bar AA - \bar BB)/\sqrt2$ and $\bar BA$ are
transformed into linear combinations of each other, so that they form an
isotopic triplet, i.e. I = 1. Let us call these three states by a new set of
names so that we recall immediately that they form a triplet; in analogy with
the $\Sigma$ hyperons we use the symbol $\sigma$:

$\sigma^- = -\bar AB$,  $\sigma^0 = -(\bar AA - \bar BB)/\sqrt2$,  $\sigma^+ = \bar BA$

and in analogy with the $\Lambda^0$ hyperon we designate the isotopic singlet state
by $\lambda^0$:

$\lambda^0 = -(\bar AA + \bar BB)/\sqrt2$.
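
The statement that the singlet is unchanged while the triplet states only mix among themselves can be checked numerically. In the sketch below a state is stored as a 2 by 2 table of coefficients of $\bar x_i x_j$ (index 0 for A, 1 for B). The particles are rotated by a unitary matrix U and the antiparticles by its complex conjugate. The parametrization of U and the chosen angles are illustrative assumptions.

```python
import cmath
import math

def su2_matrix(theta, phi, psi):
    """A 2x2 unitary matrix with unit determinant."""
    a = cmath.exp(1j * phi) * math.cos(theta)
    b = cmath.exp(1j * psi) * math.sin(theta)
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def transform(coeff, u):
    """New coefficients c'[l][m] = sum_ij conj(u[i][l]) * c[i][j] * u[j][m]."""
    n = len(u)
    return [[sum(u[i][l].conjugate() * coeff[i][j] * u[j][m]
                 for i in range(n) for j in range(n))
             for m in range(n)] for l in range(n)]

R2 = 1.0 / math.sqrt(2.0)
singlet = [[R2, 0.0], [0.0, R2]]    # (A-bar A + B-bar B) / sqrt 2
sigma0 = [[-R2, 0.0], [0.0, R2]]    # (B-bar B - A-bar A) / sqrt 2

u = su2_matrix(0.7, 0.3, 1.1)
new_singlet = transform(singlet, u)
new_sigma0 = transform(sigma0, u)
# The trace is proportional to the overlap with the singlet.
trace_sigma0 = new_sigma0[0][0] + new_sigma0[1][1]
print(new_singlet, trace_sigma0)
```

The transformed singlet has the same coefficients as before, and the transformed $\sigma^0$ stays traceless, so it acquires no singlet component.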


Two basic objects allow us to construct all states which differ in
their isotopic spin properties. However, in order to construct states which
differ in another quantum number, say hypercharge, we have to introduce a third
basic object. Thus, we add to our set of A and B a third particle $C = C^-$,
which has the same dynamical characteristics for a restricted set of
interactions and which is negatively charged. We define its isotopic spin so that
C forms an isotopic singlet. Under unitary transformations generated by
isotopic-spin operators, C is not mixed with A and B. We define also a
quantum number, e.g. hypercharge, which is -1 for C and 0 for A and B. With
three basic objects it is interesting to see what happens when, in analogy to
the procedure for isotopic transformations, we study the effects of unitary
transformations on states composed of several of the three particles A, B, C,
and the corresponding transformations on their antiparticles $\bar A$, $\bar B$ and
$\bar C$.


If we concentrate on the states formed from a particle and an
antiparticle, of which there are nine, we find that the state

$(\bar AA + \bar BB + \bar CC)/\sqrt3$

is unchanged by such transformations and thus transforms as a singlet, whereas
the other eight orthonormal states are transformed among themselves by a general
change of the three basis states. The other eight, therefore, form an
irreducible octet, which we display below:


Chart 1

    $\xi^- = -(\bar AC)$        $\xi^0 = (\bar BC)$

    $\sigma^- = -(\bar AB)$     $\sigma^0 = \tfrac{1}{\sqrt2}(\bar BB - \bar AA)$     $\sigma^+ = (\bar BA)$

    $\lambda = -\tfrac{1}{\sqrt6}[\bar AA + \bar BB - 2(\bar CC)]$

    $n = (\bar CB)$             $p = (\bar CA)$


The sign factors assigned to these states are chosen by a convention which is
an extension of the Condon and Shortley specification. We have arranged the
states so that a state gets transformed only into states in the same row, when
the unitary transformations mix only A and B, that is, when we restrict
ourselves to isospin transformations. The states are labelled $\xi^-$, n, $\sigma^+$
etc. in analogy to the real particles $\Xi^-$, N, $\Sigma^+$ etc. as they have the same
quantum numbers. If the SU3 theory were perfect, then the particle states (or the
corresponding particle states of other octets such as $K^-$, $K^0$, $\pi^+$, etc.)
could be substituted for these small letters $\xi^-$, n, $\sigma^+$ in any expression
written below without changing its transformation properties. Likewise antibaryons
could be substituted with the corresponding quantum numbers, i.e. $-\bar p$ for
$\xi^-$; $\bar n$ for $\xi^0$, $\bar\lambda$ for $\lambda$, $-\bar\sigma^+$ for $\sigma^-$,
$\bar\sigma^0$ for $\sigma^0$, $-\bar\sigma^-$ for $\sigma^+$, $-\bar\xi^-$ for p,
$\bar\xi^0$ for n.


The above octet chart is written down immediately just by requiring that the
states have the right quantum numbers. The coefficient $-2/\sqrt6$ for $\bar CC$ in
$\lambda$ is determined so that the $\lambda$ state is orthogonal to the singlet.


Sum rules for mass splittings 


We can generalize the scheme slightly at this point and thus obtain
the first order mass sum rule. For this we assign masses to the C and $\bar C$
particles which are different from K/2, where we take K/2 to be the mass of
A, $\bar A$, B and $\bar B$. The generalization consists in not identifying $\bar A$,
$\bar B$ and $\bar C$ with the antiparticles of A, B and C, but just requiring that
$\bar A$, $\bar B$ and $\bar C$ transform in the same way as the antiparticles. Thus, the
mass of $\bar C$ can be different from that of C. We designate the mass of C by
K/2 + a - b, and that of $\bar C$ by K/2 + a + b.


The other virtue of not necessarily identifying $\bar A$, $\bar B$ and $\bar C$ with the
antiparticles of A, B and C is that then the $\xi^-$ is not the antiparticle of p
and so on. We will assume that the mass of the composite states is just equal
to the sum of the masses of the component states. Carrying out this simple
calculation we find the following table of masses:

    p = K + b + a
    $\xi$ = K - b + a
    $\sigma$ = K
    $\lambda$ = K + (2/3) 2a.


Eliminating a and b leads to a sum rule for the masses:

$3M_\Lambda + M_\Sigma = 2M_N + 2M_\Xi$.
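
The elimination of a and b can be verified with exact arithmetic. The sketch below builds the four masses from the additive rule (each composite mass is the sum of its constituents, weighted by the squared coefficients for $\lambda$) and checks that the sum rule holds for any K, a, b. The parameter values and the rounded baryon masses in MeV are illustrative assumptions, not numbers from the lecture.

```python
from fractions import Fraction

def octet_masses(k, a, b):
    """Masses of the composite states from additive constituent masses."""
    m_light = Fraction(k) / 2        # mass of A, A-bar, B, B-bar
    m_c = m_light + a - b            # mass of C
    m_cbar = m_light + a + b         # mass of C-bar
    return {
        "N": m_cbar + m_light,       # p = (C-bar A), n = (C-bar B)
        "Xi": m_light + m_c,         # xi = (A-bar C), (B-bar C)
        "Sigma": m_light + m_light,  # built from A, B and their bars only
        # lambda has weights 1/6, 1/6 and 4/6 on A-bar A, B-bar B, C-bar C
        "Lambda": Fraction(1, 6) * (2 * m_light)
                  + Fraction(1, 6) * (2 * m_light)
                  + Fraction(4, 6) * (m_cbar + m_c),
    }

def sum_rule_gap(m):
    """Left side minus right side of 3 M_Lambda + M_Sigma = 2 M_N + 2 M_Xi."""
    return 3 * m["Lambda"] + m["Sigma"] - 2 * m["N"] - 2 * m["Xi"]

model = octet_masses(Fraction(1190), Fraction(-65), Fraction(-190))
measured = {"N": 939, "Xi": 1318, "Sigma": 1193, "Lambda": 1116}  # rounded, MeV

print(sum_rule_gap(model), sum_rule_gap(measured))
```

The gap vanishes identically in the model, and for the rounded measured masses it is small compared with either side of the rule.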


Note that the b term breaks the mass symmetry for different strangeness, so
that if we were constructing an octet of mesons using this scheme we would
identify $\bar A$, $\bar B$ and $\bar C$ with the antiparticles of A, B and C, and thus
set b = 0.
Incidentally, it is worth noting that because the masses are no
longer equal, the mass operator connects the singlet state with that state of
the octet which has I = 0 and Y = 0. It is thus not diagonal. If we
diagonalize it we find new eigenvalues and new eigenvectors which are useful
in the interpretation of $\omega$ and $\phi$, two of the vector mesons.


Octet operators 


For the analysis of beta decay I would like to find the total
isotopic-spin current in terms of my basic set of objects. In fact, with
three basic objects I will get eight components of a generalized current
instead of just the three that I get by considering isotopic-spin
transformations.


The chart of the eight particles given in Chart 1 also permits us to
discover a set of eight operators which transform like an octet. It is only
necessary to read the A, $\bar C$ etc. as annihilation of A, creation of C (or
creation of $-\bar A$, destruction of $\bar C$) etc. Thus, the p-like operator (the
operator which transforms like p) is $\bar CA$ or annihilate A, create C plus
annihilate $\bar C$ create $-\bar A$. That is, the result of the p-like operator on any
state is found by taking each term and rewriting it with each A replaced by C, and
adding what one gets with each $\bar C$ replaced by $-\bar A$ (terms with neither A nor
$\bar C$ in them are to be dropped). For example, the p-like operator on
$n = (\bar CB)$ converts it to $-\bar AB$ or $\sigma^-$. This we write as
$+(\bar\sigma^- n)$. Again the p-like operator on p itself $(\bar CA)$ converts it to
$(\bar CC - \bar AA)$ which is $\tfrac{1}{\sqrt2}\sigma^0 + \sqrt{\tfrac32}\,\lambda$,
which we write as $\tfrac{1}{\sqrt2}(\bar\sigma^0 p) + \sqrt{\tfrac32}(\bar\lambda p)$.
Proceeding in this way we find the effect of each operator on each member of the
octet, with results given in the following chart:


- 143 -


Chart 2 


P operator = ~(€-0° )~ (0° p)- /3 (E2) -f3 (Xp) - /2 (E% 0") ~ /2 (on) 


1 rae $4 a ane os fas ea) 
operator — Fal? « a )+/2 (o"p)+ (o°n) = (€90°) + /5 (€9r) - 73 Gn) | 


+ 


$ operator = “e[ -tie)- (€-€°) ~J2 (0°0") -/2 (070° | 


! E%¢0 - 67 = bee “+ + a ee 
Oe operaton "ae (€°€°) - (€ € )+ (pp) - (nn) + 2 (oo )- 200 ) 


* operator ~ | Cn) (€°€7) + /2 (0°07) + V2 (oro° | 
zi eg” 2020 - 2 
* operator le (Fe") ~ (EE) + (in) + Go) | 


C gubrater * i aa | ao )~ (0°€°)- fF (o €7)+ J (B o*) + JF (KEP) - 15) 
12 


© Gperator he )+ (po )+ /2 (0°? + fF (Ro) + V3 (KEN )+ v5 Gr) | 


Note the operators Σ⁺, Σ⁰ and Σ⁻ are the isotopic spin operators: −I₋, I_z
and I₊, respectively.
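The replacement rule for the p-like operator described before Chart 2 can be sketched in a few lines. This is only an illustration under stated assumptions: a state is represented as a dictionary from (barred object, plain object) pairs to coefficients, n is taken as (C̄B) and p as (C̄A) as in the text's two examples, and the function name is hypothetical.

```python
from collections import defaultdict


def p_like(state):
    """Apply the p-like operator to a state.

    A state maps (barred_object, plain_object) -> coefficient, so that
    {("C", "A"): 1} stands for the composite written C-bar A in the text.
    """
    result = defaultdict(int)
    for (barred, plain), coeff in state.items():
        if plain == "A":          # annihilate A, create C
            result[(barred, "C")] += coeff
        if barred == "C":         # annihilate C-bar, create -A-bar
            result[("A", plain)] -= coeff
    # Terms containing neither A nor C-bar contribute nothing.
    return {term: c for term, c in result.items() if c != 0}


neutron = {("C", "B"): 1}   # n = (C-bar B)
proton = {("C", "A"): 1}    # p = (C-bar A)

print(p_like(neutron))
print(p_like(proton))
```

The two results reproduce the combinations quoted in the text: −ĀB for the neutron, and (C̄C − ĀA) for the proton. Identifying these combinations with normalized members of the octet requires the particle chart and is not attempted here.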


You will note from these tables that the Σ⁺ operator transforms
like the lowering operator for isotopic spin and is thus the operator we
would use for the non-strangeness changing beta-decay current, at least for
the vector part. Similarly, to describe strangeness changing weak decays,
the corresponding operator that we would use would be the operator
corresponding to p, which, as you will note, automatically implies that
ΔS = ΔQ. Thus, SU3 determines the relative coefficients in the beta decay for
us. It turns out, however, that there is another octet of operators, which we
shall display later, that may be used for the beta-decay currents. The use of
these will introduce another parameter into the analysis of the weak decays,
but will not change the isotopic and hypercharge selection rules that are
apparent from the octet we have just derived.


The use of such currents to describe beta decay is just a guess
and we shall try to check it out in later lectures. This, in fact, is
Cabibbo's theory of weak decays.


We have called this set of eight operators an octet. In fact, 
it is an octet in the sense of SU3 because if we make transformations among 
the three basic elements used to construct our scheme, the set of eight 
operators will transform into each other in exactly the same way that the set 


of eight composite particles transform into each other. 


Reduction of the direct product of two octets 


Combining two particles with octet transformation properties produces
64 possible states which, under the various operators listed above in
Chart 2, are transformed into one another. From these 64, multiplets may
be formed whose states transform only among themselves. These multiplets
contain 1, 8, 8, 10, 1̄0̄ and 27 members. Let the state of two particles be
written as pA, say, meaning the first is p, the second is A} in its trans- 
formation properties. (For example, the first may be an anti =; the 


second, 7°.) The state Apmeans the first is A, the second is p (in our 


example, the first is anti A, the second is ne We shall write, for short 


    (pn) = (pn − np)
    [pn] = (pn + np) .

Since (n̄n) + (p̄p) + (Ξ̄⁻Ξ⁻) + (Ξ̄⁰Ξ⁰) + (Σ̄⁺Σ⁺) + etc. is evidently invariant,
replacing each antiparticle by a particle of the same transformation
properties, we get that Ξ⁰n − Ξ⁻p + nΞ⁰ − pΞ⁻ etc. is invariant. Hence, our
normalized singlet is

SINGLET
    {[Ξ⁻p] − [Ξ⁰n] + [Σ⁺Σ⁻] − Σ⁰Σ⁰ − ΛΛ}/√8 .


The forms for the p-like operator etc. in Chart 2 permit us, in the same way,
to find an octet called the antisymmetric octet.

We denote the states by their quantum numbers {Y, I, I_z} within the
families.

ANTISYMMETRIC OCTET (8_a)


it.%es%e} = {(po°)+/2 (o7n)+ /5 (pr)}/ 12 
{t.%s-%e} = {(o°n)+f/2 (po )+/5 (ndr)}/ f12 
{o,t,1} = {(pé°)+ /2 (ao i/o 

{0,1,0} = {(pé7)+ (n°) +2 (070 )IA/12 
{0,t,-1} = {(né~)+J2 (0a i/o 

{0,0,0} = {(n€°)+ (pé~)}/2 


[-1,2,'2]} 


{(€°0° )+ /2 (oe) 4 JB (2E° YA/12 
{f=t,% hey = [C0 ) + /2 (6207 4 JB (XE DI A/12 


The other octet is composed of states which are symmetric under the exchange
of the two particles.

SYMMETRIC OCTET (8_s)
it,4.%} = (3 [po°] -[pa] - 6 [no] 3//20 
[t.%e-%] {-/3 [no°?] - [mr] +6 [po] }//20 


{o,t,t} 


{-/5 [pé°] + /2 [ro" ]}//10 


fo,1,0} 


{2 [ro°]} -/3 [pé7] - /3 [n€°] 3 //20 
fo,t,-t} = {¥2[x07] -/3 [ne] }//10 


{0,0,0} {Lo°o°} - [xa] = 2 [oo] + [pé7] - [n€°] }//20 


t-1,%,%} = (-[re°]-V3 [0°69] + VO [0° E7] 1 //20 
[-1,%,-"%2} 


{-[ag7] + /3 [0°67] - Yo [0°] }//20 


The remaining twenty states which are antisymmetric under exchange
of the two particles fall into two multiplets of ten states. These decimets
contain an isotopic singlet, doublet, triplet and quartet, and the hypercharge
of the isotopic multiplets differs by one unit in a progressive way. The
two decimets are related by a reflection transformation called R, under which:

    Y → −Y, I → I, I_z → −I_z, and p ↔ Ξ⁻, n ↔ Ξ⁰, Σ⁺ ↔ Σ⁻, Σ⁰ ↔ Σ⁰, Λ ↔ Λ.


DECIMET (10) 


" 


{1,% a (o' p)//2 
{1,%,%} = [/2 (o°p)+ (o nie 


{1,%,-%]} 


iT 


{/2 (o°n)+ (o p)i{/Vo 


{t,% h | (o n)//2


"
{0,11} = {(o70°)+/2 (Ep) +J3 (oA) A/12
{o,1,0} = {(o%0")+ (Ep) + (€°n) +3 (0A) /A/12 
fo,t,-t] = [(o2o )+ V2 (En) + V3 (oA) //T2 


[-1,%4,%} = [(€%0° )+ 2 (0° ED 4 JB (EAN 12 
{-t,%s-e} = [00% 7)+ /2 (C90) 4 f5 (EA) AID 


{-2,0,0} 


a 


(6°E )//2 
DECIMET (1̄0̄)


{2,0,0} = (np)//2 


{1,2} 
{',"% sh} 


{(o°p)+ /2 (no” )+ J5 (prjA/12 


" 


{(no° )+./2 (o p)+/3 (m)j}//12 


{0,1,1} = [(0°o" )+J/2 (pé°)+ JB (oA) //t2 
{0,1,0} = {(o 0" )+ (pé-) + (n€°) + 3 (o°A)I//t2- 
{o,t,-t} = [(o 0° )+/2 (n€~) + /3 (0 A) //12 


[-1,%25,%2} = (o%E°)//2 
[-1,% 5%} = W/2 (0°€°)+ (oe AVE 
I" Zesr'h} = (2 (0°) + (0 ENO 
I-te hl = (OE WR

The remaining twenty-seven states form a multiplet which is symmetric
under the interchange of the particles. It is composed of a triplet with
Y = 2, a doublet and a quartet with Y = 1, a singlet, a triplet and a quintet
with Y = 0, a doublet and a quartet with Y = −1, and a triplet with Y = −2.
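The state counting behind this reduction is easy to verify. The sketch below is an illustration only: it lists the isotopic-spin content of each multiplet as described in the text (the octet content is taken from the eight baryons themselves), counts 2I + 1 states per isotopic multiplet, and checks the totals for the symmetric and antisymmetric parts of the 64 product states.

```python
def count_states(content):
    """content: list of (Y, [isospins]) pairs; returns the number of states."""
    return sum(int(2 * i + 1) for _, isospins in content for i in isospins)


singlet = [(0, [0])]
octet = [(1, [0.5]), (0, [0, 1]), (-1, [0.5])]
decimet = [(1, [1.5]), (0, [1]), (-1, [0.5]), (-2, [0])]
multiplet_27 = [(2, [1]), (1, [0.5, 1.5]), (0, [0, 1, 2]),
                (-1, [0.5, 1.5]), (-2, [1])]

# Symmetric part of 8 x 8: singlet, symmetric octet and the 27.
symmetric = count_states(singlet) + count_states(octet) + count_states(multiplet_27)
# Antisymmetric part: antisymmetric octet and the two decimets
# (the reflected decimet has the same number of states).
antisymmetric = count_states(octet) + 2 * count_states(decimet)

print(symmetric, antisymmetric, symmetric + antisymmetric)
```

The symmetric and antisymmetric totals are 8·9/2 and 8·7/2 respectively, as they must be for two objects with eight states each.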
(27) 

{2,t,t} = pp 

{2,1,0} = [np] //2 

{2,t,-t} = nn 


hs) = po Ie 
{t,%5%} = £/2 [po2]+ [no ]j//O 
[t,%.-%} = /2 [no] + [po ]I//6 
L,%s-h} = tne I//2 


{1,%,%} {[po°] + 3/3 [pa] - /2 [no™] }//66 


t,%,-%} = {-[no?] + 3/3 [nd] +/2 [po] }//60 


{0,2,2} 30 0° 

{0,2,1} = [oo J //2 

{0,2,0} = {Loto ] + [oe }}//e 
{0,2,-1} = [oo ]//2 

{0,2,-2] soo

fo,t,t}


it 


{/2 [pé°] + V3 [o*A]}//t0 


{0,1,0} = {[pé7] + [me] + /3 [o° a] }//t0 
fo,t,-t} = {2 [n€-] + /3 [o a] f//t0 
{0,0,0} = {3 [pé-]-3 [n€°] - [0707] + 0°0° + 9 2A} //120 


[-1,7%2,7%2} 
{-1,%,'2} 
{-1,7% Ye} 
[-,% sh} 


[ore A/a 
{v2 [o°€°} + [oe ]1//o 
{/2 [o°é-] + [o E° {Ae 
[oer 


" 


{-1,%4,%} 
{-1,%.-%} 


[72 [o*e7] - [0°€°} + 3/5 [nE°]} 00 
{a/2 [o°€°} + [0°€7] + 3/3 [AE TI A/60 


{-2,1,t] €°E° 


{-2,1,0} 


[eee IA/2 


{-2,1,-1] ¢ € 


One way to deduce these multiplets is to take an extreme case, say
pp, which has isotopic spin 1 and hypercharge +2, and operate on it with all
the raising and lowering operators, which in fact are just the octet of
operators that we found at first (Chart 2). Doing this, we would get the
multiplet with 27 objects. Next, we can take a state with extreme quantum
numbers which is orthogonal to the member of the 27 with the same quantum
numbers. We then follow the same procedure of generating the complete
multiplet by use of the raising and lowering operators. By iteration of this
procedure we will generate all the multiplets.

Couplings of the baryons and mesons                              4th LECTURE
Last time we described how one could construct states of two
particles, each one of which was a member of an octet, so that the resultant
states would be grouped into multiplets. We found a singlet state, two
'8's, a '10', a '1̄0̄' and a '27'. The two '8's can be distinguished by
means of a certain type of inversion transformation which we define in such
a way that p ↔ Ξ⁻, n ↔ Ξ⁰, Σ⁺ ↔ Σ⁻, Σ⁰ ↔ Σ⁰ and Λ ↔ Λ. Under this
transformation, the first '8' we constructed was antisymmetric and the second
'8' was symmetric.


One of the uses of the charts of the previous lecture is to construct
couplings of the baryons and mesons which will be invariant under
unitary spin transformations. Although this is not directly related to the
weak interactions, I would like to pause a moment and discuss that application
of the charts. From the construction of a unitary singlet from two '8's,
we know that the combination -1 0-1 a + none Ke K p+ K°€°+ Ken+ nh is 
an invariant. We stated before that you can construct certain combinations
of two particles that transform like an octet. For example, if we substitute
for Σ⁺, Σ⁻, Σ⁰ the antisymmetric Σ⁺, Σ⁻, Σ⁰ operators derived in the last
lecture, then we get all the coupling coefficients of the pseudoscalar mesons
to the nucleons. Let me write out part of this. To keep the normalization
the same, the coupling constant will have to be √6 a, where a = g/2M.


Sea at JE vst) + Sh (Fvsh2 + Jee rsh8*) + 


( gece 


+ l/s B-ysGZ° | +K~ i i (PrsdB*) + 
te 
+ Le ysdn)+ (EZ ysda)+ % Cysdp)+ 


th. ae 1 = 7 2) 
+ = (5° 3A j+—= CE sAb°) be hay 7 
ag i 5

However, the coupling is not completely determined because one is also allowed
to use the symmetric set of eight pairs of baryons. Since there are only two
'8's, there are thus only two constants which characterize the meson-baryon
coupling. It is standard to call the coupling constant of the normalized
antisymmetric octet of baryons to the mesons √6 a F, and to call the coupling
constant of the normalized symmetric octet of baryon pairs √(10/3) a D. The F
and D are two parameters which must be discovered by comparison with
experiment. There are rough indications from the study of Λ hyperfragments
that the ratio D/F is positive and is of the order two or three.


SU3 in the weak interactions


Returning to the weak interactions we find that we are faced with a
slightly more difficult problem. We cannot make use of SU3 symmetry to
construct invariant weak couplings, because the weak Lagrangian must destroy
conservation laws. More ingenuity is therefore required to construct the
couplings governing the weak decays. As before, one can only try to guess
the answer and see if experiment agrees with that guess. The most satisfactory
guess that has been made to date is that of Cabibbo, which I shall now
introduce in a way that seems most reasonable to me.


Let us consider first the non-strangeness changing piece of the weak
interaction current, because it is this piece that we think we know best of
all. We expect from the conserved vector-current theorem that this part of the
current should be the isotopic-spin current. The isotopic spin current is
just the Σ⁺_anti, which is the antisymmetric-octet operator. Considering still
the vector current, what would a logical extension of this idea be in order to
cover strangeness changing decays? There is some evidence in leptonic decays
that this part of the current must violate isotopic spin conservation in such
a way that only a violation of a half unit is possible, ΔI = ½. For example,
in the decays K → π + leptons, the prediction of the ΔI = ½ rule is


° F . + . 
Ke Bt TEBLONS ©9571) Sredietion = 2 experiment. 
K + 7+ leptons 6-24 0°9

There is other evidence for this rule but it is not very definitive.
Nevertheless, the selection rule is the standard guess in the construction of
the current. On the other hand, there is much better evidence for the
selection rule ΔS = ΔQ. Assuming that the strangeness changing piece is also a
member of an octet there is only one choice for the current, namely, the
operator that transforms like the proton. From our charts you will note that
the antisymmetric proton operator obeys both the ΔI = ½ and the ΔS = ΔQ
selection rules. We are led, thus, to propose that the vector current is

    Σ⁺_anti + s p_anti .


Experimentally the leptonic decays in which strangeness is violated
are 20 times weaker than the non-strangeness changing decays. Therefore, the
coefficient s, which measures the proportion of the strangeness-changing
current, must be definitely smaller than 1. The K → μ + ν is also somewhat
smaller than might be expected. We will use this rate later to determine quite
accurately the value of s.


Universality and the strangeness changing decays 


How do we reconcile weakness of the strangeness-changing decays with
the concept of universality? In our study of μ capture, nuclear beta-decay
and muon decay, we found that np and the leptons were all coupled with the
same strength within a few per cent. This observation, in fact, led to the
idea that there was a universal coupling characterizing the weak interactions.
However, today we know, for example, that Λp couples much more weakly. Can we
construct a theory of the coupling which retains the idea of universality?
The answer is yes, and the way to do it was shown by Cabibbo. It is simply
that we assume that the weak current is not the component Σ⁺ nor the component
p, but rather a skew component, but still normalized, i.e. we assume that the
vector current is

    cos θ Σ_anti^{+V} + sin θ p_anti^{V} .

I have added the superscript V to designate the vector piece of the
weak-interaction current. The idea, then, is that the skew component is
coupled to the leptons with the universal-coupling constant G.


We turn now to the axial current. If there were no complications
due to the strong interaction, the obvious choice for the axial current would
be to use the same isotopic-coupling coefficients, and merely insert a γ₅.
In other words, if we could deal with bare particles, the axial vector part
of the current would be

    cos θ Σ_anti^{+A} + sin θ p_anti^{A} ,

where the A superscript is to indicate the Dirac matrix γ₅. But, of course,
life is more complicated. If SU3 were perfect, any component of the
antisymmetric octet of operators would be conserved. Thus, the vector current
would not be changed by renormalization. In fact, Gatto has shown that even
to the first order in the mass shifts, the vector current is not renormalized.
The axial vector current, however, is not conserved, even in the limit of
exact SU3 symmetry, so one must expect modifications due to renormalization.
This means that the strength of the axial vector current will be changed and
also that, starting from the antisymmetric octet for the axial vector current,
we can get both the antisymmetric and the symmetric octets in the renormalized
axial current. Thus, the renormalized axial vector current will take the form:

    √6 F [cos θ Σ_anti^{+A} + sin θ p_anti^{A}]
        + √(10/3) D [cos θ Σ_symm^{+A} + sin θ p_symm^{A}] .

In the limit of SU3 symmetry the angle θ is independent of the particles which
are coupled to the weak current, because renormalization cannot change the
axis of the octet that is coupled with the weak decays. The renormalization
coefficients F and D are strongly dependent on the type of octets, i.e.
pseudoscalar mesons, baryons and vector mesons.

Comparison of the Cabibbo theory with experiment


Let us turn now to a discussion of how well this proposal of Cabibbo
agrees with experiment. The most important parameter of the theory is the
angle θ. One of the best ways to determine it is to consider the decay

    K⁺ → π⁰ + e⁺ + ν .

Two pseudoscalar mesons in a J = 1 state have a parity of (−), and therefore
only the vector-weak current is involved in this decay. We have two vectors
p_K and p_π to make the current. The most general form of the matrix element
is

    (p_K + p_π) + ξ (p_K − p_π) .

In the limit of unitary symmetry the ξ term would be absent. Experimentally,
a study of the spectrum shows it to be very small and therefore we will drop
it. Putting in the coefficient 1/√2 from the K⁺π⁰ term in the p-like current
and the coupling strength √2 G, we get for the matrix element:

    G sin θ (p_K + p_π)_μ (ν̄ γ_μ a e) .

Making the calculation of the rate and comparing with my data (which may not
be the latest, most accurate) I find that |sin θ| = 0.23 ± 0.015 (Cabibbo,
using different data in his paper, found |sin θ| = 0.260 ± 0.015).


Another way to determine the angle, which Cabibbo also proposed, was
to look at the ratio of K and π decays into leptons. Both of these decays
proceed only through the axial vector current because the K and π are
pseudoscalar mesons. The geometric structure of the matrix element is, as we have
already discussed, [8a Faia, Av) and J & Fairy av)- 


There is only one octet that can be made from a single set of mesons.
Hence, the Cabibbo theory would state that F_K/F_π = tan θ. Experimentally


8 6 
pF = a22xt0 ong Fe 2eeox to 
TT m K m 
T K 


,

so that tan θ is very nearly equal to m_π/m_K. Numerically |sin θ| from the
experiments is 0.26 ± 0.02, which agrees with the first determination. This
provides the first check on the Cabibbo hypothesis. (Incidentally, in my
later calculations I shall use |sin θ| = 0.245, because then sin² θ = 0.06,
a simple number for computational purposes.)

Another consequence of the Cabibbo theory is that the nuclear beta-decay
vector coupling constant G_V should differ from the muon beta-decay
coupling G_μ. In fact, G_V = G_μ cos θ in the Cabibbo theory. Since cos θ
is about 0.97, we see that this theory very possibly resolves this difficulty
in the theory of weak interactions!
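The numbers quoted for the Cabibbo angle can be checked with a few lines of arithmetic. This is a sketch only: |sin θ| = 0.245 is the value adopted in the text, while the charged pion and kaon masses (in MeV) are assumed present-day values that the lecture does not quote.

```python
import math

sin_theta = 0.245                      # value adopted in the text
sin2 = sin_theta ** 2
cos_theta = math.sqrt(1.0 - sin2)

# Assumed charged-meson masses in MeV (not quoted in the lecture).
m_pi, m_K = 139.57, 493.68
tan_from_masses = m_pi / m_K
sin_from_masses = tan_from_masses / math.sqrt(1.0 + tan_from_masses ** 2)

print(f"sin^2(theta) = {sin2:.4f}")
print(f"cos(theta)   = {cos_theta:.3f}")
print(f"sin(theta) if tan(theta) = m_pi/m_K: {sin_from_masses:.3f}")
```

The first two lines reproduce sin² θ = 0.06 and cos θ of about 0.97. The last shows that setting tan θ equal to the mass ratio gives a value of |sin θ| within the quoted error of 0.26 ± 0.02.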


Determination of the axial vector current F/D ratio 


The next question is how well the Cabibbo theory fits with the 
leptonic decays of the hyperons. To study this let us make a table of the 
leptonic decay matrix elements, and use our charts of the F and D octets to 


construct the proportions of vector and axial vector currents in each decay. 


Non-strangeness changing decays. These are to be multiplied by cos θ.


Reactions Vector Axial 


n> ptetv 1 F+D 


Bo + B+ e+ 0 J/2 -/2 F 
Eo + 2° +et+y J2 /2 F 
S +Ate+v 0 7 D 


ay + Atetv (0) J D 


= + e+ et vy t (F-D)

Strangeness changing decays. These are to be multiplied by sin θ.
Reactions                  Vector      Axial

Σ⁰ → p + e + ν            1/√2        (1/√2)(F − D)
Λ → p + e + ν             √(3/2)      √(3/2)(F + ⅓D)
Ξ⁻ → Σ⁰ + e + ν           1/√2        (1/√2)(F + D)
Ξ⁻ → Λ + e + ν            √(3/2)      √(3/2)(F − ⅓D)
Ξ⁰ → Σ⁺ + e + ν           1           (F + D)
Σ⁻ → n + e + ν            1           (F − D)


The best relation we can use to partially determine F and D is the
fact that for the decay of the neutron the axial vector coupling is
1.20 ± 0.04, relative to the vector coupling. Thus,

    F + D = 1.20 ± 0.04 .

Strangeness-changing decay rates can now be used to determine other
relations between F and D. There is a standard rate for these decays
predicted under the assumption that the weak current is just the matrix
element of γ_μ(1 + iγ₅). This rate is called the universal Fermi interaction
(U.F.I.) rate. Neglecting small relativistic corrections, for any of these
decays the rate should be (V² + 3A²)/4 times the U.F.I. rate. The experimental
branching ratios, together with the U.F.I. predictions, are given in the
following table.
Decay U.F.I. Experimental branching ratios 
A+ pte+y 1.5x 10° 0.8t+0.10x 10 
E> N+e+7 5.8x 107? 1.374 0.34x 107? 
Bo +Ate+y t.oxto- 0.07+ 0.03x to 


2 o>A+setv 0.6x to” 0.074 0.04x10* (4 events).

From the rate of decay of the Λ, we find

    |F + ⅓D| = 0.68 ± 0.07 .

There is evidence from the polarization of this leptonic decay that
F + ⅓D > 0. Using this relation, together with the first, we deduce

    F = 0.40 ± 0.10,  D = 0.78 ± 0.12 .

Cabibbo's original values were F = 0.30, D = 0.95. The small difference
results from slightly different data and from the fact that Cabibbo used 1.26
instead of 1.20 for −G_A/G_V.
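The two relations F + D = 1.20 and F + ⅓D = 0.68 are a pair of linear equations, and their solution is easy to reproduce. The sketch below uses central values only (the errors are not propagated) and hypothetical function names; it also encodes the (V² + 3A²)/4 rate factor quoted above, which equals 1 for V = A = 1.

```python
def solve_f_d(f_plus_d, f_plus_third_d):
    """Solve F + D = f_plus_d and F + D/3 = f_plus_third_d for (F, D)."""
    D = 1.5 * (f_plus_d - f_plus_third_d)
    F = f_plus_d - D
    return F, D


def rate_factor(V, A):
    """Rate relative to the U.F.I. rate, (V^2 + 3 A^2)/4."""
    return (V * V + 3.0 * A * A) / 4.0


F, D = solve_f_d(1.20, 0.68)
print(f"F = {F:.2f}, D = {D:.2f}, F - D = {F - D:.2f}, D/F = {D / F:.2f}")
```

The central values solve to D = 0.78 and F close to the quoted 0.40, well inside the stated errors; the small difference from the quoted F presumably reflects rounding in the inputs.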


With F and D thus determined, do the other leptonic decays agree
with Cabibbo's theory? In many cases only a few events of a given decay
have been observed so that the statistical errors and, even more important,
the systematic errors in the determination of the rates, are very large.
Of the remaining decays, for which experimental information is available,
perhaps the reaction Σ⁻ → n + e + ν is known to the greatest accuracy.
Converting the experimental rate for this reaction, one finds
|F − D| = 0.40 ± 0.16. You will note that this is certainly consistent with
the value F − D = −0.38 ± 0.22 which was predicted above. Another prediction
is that the reaction


= + A+e+ 0 should have a branching ratio of (0.50+0.05)x10. Experimentally, 
the branching ratio is two or three times to, but this is based on only a 
handful of events. The predictions on the Σ → Λ decays are also consistent
with the crude experimental values.


The D to F ratio for the axial vector current is 2.2 ± 0.8. This
is the same ratio as is claimed for the coupling of the pseudoscalar mesons
to the baryons, which is thought to be about 2, 3 or 4. This value came
originally from very crude theoretical analysis of the strong interactions and
some study of the ΛN interaction in hyperfragments. I may remark here that
an extension of the Goldberger-Treiman argument to the strangeness-changing
current will result in the prediction that the D to F ratio for the strong
interactions should be the same as it is for the axial vector weak
baryon-baryon current.

Generalization of the Goldberger-Treiman Relation                5th LECTURE


One can explore the question of a supposed equality between the
F/D ratios in the strong interactions and in the axial vector weak current by
studying a generalization of the Goldberger-Treiman relation. There are
several ways to obtain this generalization. One of them would proceed in
analogy with my first derivation of the G-T relation for pions. To do this
I would extend my model so that it had eight baryons and eight pseudoscalar
mesons coupled to baryon pairs using the pseudovector q̸γ₅ coupling. I would
then couple both the baryon pairs to the lepton current in such a way that if
I neglected the meson masses a divergence added to the lepton current would
have no effect. In complete analogy to the pionic case, I would have in
lowest order the relation


Bare 
F 2 a(n a) 
K(unrenormal ized ) Jen Qo (nAk) 
n 


Renormalization corrections would have to be made of course. It
is clear that the structure of the renormalization in the K case would be
the same as it is in the π case. That is,


os Sa(na)  ZA(mAK) = = ad (a axy Kom ) 
Jen 2(n AK) ZA(nAK) (my) Vad (n ak) Km) * mye Ki (mé | 


2 2 
1 80( nak) K,(my) 


= FeCunrenormalized)* 1, a 
it 2 2 2 
1 20 (A AK) (kK, (me) + my Ky (my »] 


One can then attempt to argue that the ratio of the renormalization factors
is close to one. This will be far less convincing, however, in the case of
the K because m_K² is far greater than m_π², and m_K² is much nearer to the
next threshold for intermediate states than m_π² was. In the case of the pion,
the distance from q² = 0 to the pole is m_π², whereas the next smallest energy
denominator lies at a distance 9 m_π² from q² = 0. On the other hand, for the
K, q² = 0 is (500 MeV)² away from the K pole and only about (770 MeV)² away
from the Kππ intermediate state. Thus, in the π case we had a factor of 9
in our favour, but in the K case this factor is only about 2. On the other
hand, with pseudovector coupling, as we explained before, low momentum pions
(and kaons) are only weakly coupled since f² = 0.08 is small, so these
"nearest pole terms" are not expected to be large. I suspect that the
functions Z(q²) and K(q²) only begin their serious variation near q² equal to
the nucleon mass squared.
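The two "factors in our favour" are simple ratios of squared masses, which the following sketch evaluates. It uses only the round numbers given in the text for the kaon case (500 MeV and 770 MeV); the pion mass value is an assumed round number and drops out of the pion ratio.

```python
m_pi = 140.0   # MeV, assumed round value; cancels in the pion ratio
pion_pole = m_pi ** 2               # distance from q^2 = 0 to the pion pole
pion_next = (3 * m_pi) ** 2         # next denominator, 9 m_pi^2

kaon_pole = 500.0 ** 2              # (500 MeV)^2, as quoted in the text
kaon_next = 770.0 ** 2              # (770 MeV)^2, the K pi pi state in the text

pion_margin = pion_next / pion_pole
kaon_margin = kaon_next / kaon_pole

print(f"pion case: factor {pion_margin:.1f}")
print(f"kaon case: factor {kaon_margin:.1f}")
```

The kaon ratio comes out a little above 2, which is the "only about 2" of the text, against 9 for the pion.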


Granting the validity of these arguments that lead to the
Goldberger-Treiman relation, the following point can be made. In the limit of
SU3 symmetry F_π tan θ should equal F_K, at least if unrenormalized. But the
renormalization factors of F exhibited in the second form of the above
equation should also be nearly the same. This explains why we can get tan θ
from comparison of πμν and Kμν decay. This being the case, the first form of
the equation shows that the experimental meson-baryon couplings should be
proportional to the experimental axial vector beta-decay couplings. Another
way of stating this (since it is assumed that both types of couplings
transform as an octet) is that both octets must have the same character. Thus,
assuming the Goldberger-Treiman relation, one finds that F/D for the weak
axial vector current is the same as F/D for the meson-baryon coupling.
Although there are indications that this may be true, there is as yet no sharp
experimental test available to check this prediction.


Leptic Decays 


Let me summarize what we know about the leptic decays. If there
is no strangeness change, we have essentially a complete theory of all the
interactions with two coupling constants, G_V and G_A. F_π is supplied via the
Goldberger-Treiman relations. When we come to the strangeness-changing decays,
we do not know whether we have a good theory or not until we gather more
experimental data. We do have a theory, the Cabibbo theory, which is
consistent with the data now available, but the sharpness of the tests is not
very great. Cabibbo's theory involves two more parameters: sin θ and the
F/D ratio. If some day we can justify more adequately the Goldberger-Treiman
relation, then the F/D ratio would come out of a strong interaction theory.
The Cabibbo theory would then involve only the parameter θ. All observed
leptic decays then have theoretically predictable properties, with the
exception of K⁺ → π + π + e + ν.


Non-leptic decays 


The non-leptic decays also arise from the assumption of a
current-current interaction. If we take the theoretical view that the current
is the sum of pieces of the form

    J = (μ̄ν) + (ēν) + "Σ⁺" cos θ + "p" sin θ ,

then the quadratic terms in the expansion of J·J

    Σ̄⁺Σ⁺ cos² θ + p̄p sin² θ

give rise to non-strangeness changing, non-leptic, weak interactions. It
is these terms which give rise to a parity non-conserving nuclear force, for
which evidence has been acquired at Cal Tech by Boehm. Other than by
experiments of this type, such terms are difficult to get at experimentally.

The cross-products in the expansion of J·J are [Σ̄⁺p + p̄Σ⁺] sin θ
cos θ. The term Σ̄⁺p accounts for those decays for which ΔS = +1 and the
term p̄Σ⁺, the hermitian conjugate term, accounts for those decays for which
ΔS = −1. Since one is the adjoint of the other, we need consider only one
of them.


We have a much more difficult task in making predictions for the
non-leptic decays because the renormalizations are much more elaborate. In
other words, we have a four-fermion coupling term which is much screwed about
by renormalizations of the strong interactions. In the theory of the leptic
decays much was predicted on the basis of the hypothesis that the weak current
had certain transformation properties. For the non-leptonic decays, on the
other hand, the form J·J does not have well-defined transformation properties,
with regard to unitary spin or even with regard to isotopic spin. Because
the situation is not very clear, let us concentrate at first on the
transformation properties of the currents.


The strangeness changing decays result from the coupling of an
isotopic-spinor current with an isotopic-vector current. With respect to
isospin, therefore, the interaction J·J has a piece which transforms like
isotopic spin 3/2 as well as one of isotopic spin ½. In the ΔS = ±1 decays,
therefore, our current-current interaction leads to the selection rules
ΔI = ½ or ΔI = 3/2. The first thing that we must check is whether there is
any evidence for ΔI = 3/2 in these decays. If one looks at the data to check
this, one is surprised and finds a terribly interesting fact. Almost all the
data seem to be consistent with only ΔI = ½ terms in the amplitude. Very
small ΔI = 3/2 amplitudes are required to fit the data, and no ΔI = 5/2 is
required at all. The fact that ΔI = 3/2 was so small was totally unexpected
and has no explanation theoretically on the basis of the current-current
hypothesis.


Tests of the ΔI = ½ rule


In the absence of being able to give a satisfactory theoretical
justification for this rule, I will discuss the data which substantiate it.
Consider first the disintegration of the Λ. There are two non-leptic channels
open

    Λ → p + π⁻ ,  Λ → n + π⁰ .

On the basis of the ΔI = ½ rule, since the isotopic spin of the Λ is zero,
the charged pion decay amplitude should be √2 times the neutral-pion decay
amplitude. Hence, the rates should be in the ratio 2 to 1. If there is a
decay amplitude into the isospin 3/2 state of the pion-nucleon system,
measured by a coefficient a, then the ratio of the two rates should be
|(√2 + a)/(1 − √2 a)|².

The experimental branching fraction for the decay into n + π⁰ is 0.35 ± 0.02
of all non-leptic decays. The data, therefore, indicate complete agreement
with the ΔI = ½ rule and show that |a| < 0.05. This excellent evidence for
the ΔI = ½ rule is substantiated by the fact that the polarization properties
for the neutral and charged modes are the same. The ratio of the s to p-wave
amplitudes is measured by the anisotropy of the direction of the pion with
respect to the spin of the Λ. The coefficient of this anisotropy term is
called α, and if you check the measured values of α for the two modes, you
will find that they agree within the experimental errors.
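The bound on the isospin-3/2 admixture follows from inverting the rate-ratio formula. The sketch below assumes that the formula reads |(√2 + a)/(1 − √2 a)|² for the ratio of the charged to the neutral rate, that a is real, and that the root near a = 0 is the relevant one; the function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)


def rate_ratio(a):
    """Rate(p pi-) / Rate(n pi0) for an isospin-3/2 admixture a (real)."""
    return abs((SQRT2 + a) / (1.0 - SQRT2 * a)) ** 2


def neutral_fraction(a):
    """Fraction of non-leptic decays going to n + pi0."""
    return 1.0 / (1.0 + rate_ratio(a))


def admixture_from_fraction(fraction):
    """Invert neutral_fraction for real a, taking the root near a = 0."""
    r = math.sqrt(1.0 / fraction - 1.0)      # amplitude ratio
    return (r - SQRT2) / (1.0 + SQRT2 * r)


for fraction in (0.33, 0.35, 0.37):          # central value and one error each way
    a = admixture_from_fraction(fraction)
    print(f"neutral fraction {fraction:.2f} -> a = {a:+.3f}")
```

Over the quoted range 0.35 ± 0.02 the admixture stays below 0.05 in magnitude, consistent with the bound stated in the text.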


The next baryon to consider is the Σ. There are three non-leptic
decays

    1) Σ⁺ → p + π⁰
    2) Σ⁺ → n + π⁺
    3) Σ⁻ → n + π⁻ .

If we label the amplitudes for each process by A₁, A₂ and A₃ respectively, it
turns out that the only prediction of the ΔI = ½ rule is
    √2 A₁ = A₂ + A₃ .


Now each amplitude A_i is a two-component vector, one component for the s wave
and one component for the p wave. The rates are measured by the squares of
the lengths of these vectors and are almost all the same. This means that
the three vector amplitudes must form an isosceles right triangle with √2 A₁
being the base of the triangle. Since the anisotropy coefficients for the
second and third decays are almost zero, both amplitudes lie along the
coordinate axes in an s, p-plane. Therefore, the first amplitude should lie
at 45° to these axes. The anisotropy coefficient α should therefore be 1
for the reaction. Experimentally, α = 0.73. This fact is slightly annoying,
but it is not sufficiently bad that we should consider it as evidence against
the ΔI = ½ rule. This measurement is not a very good test of the ΔI = ½
rule because the experimental errors could be large. The experimental errors
will have to be improved a great deal before one can consider the Σ decays to
be evidence against the ΔI = ½ rule.
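The triangle argument can be made concrete with two-component (s, p) vectors. The sketch below rests on several assumptions that are not fixed by the text: the relation is taken in the form √2 A₁ = A₂ + A₃ (a sign choice), the amplitudes are taken real, the second decay is put on the s axis and the third on the p axis, and the anisotropy coefficient is taken as α = 2sp/(s² + p²).

```python
import math


def asymmetry(s, p):
    """alpha = 2 s p / (s^2 + p^2) for real s- and p-wave amplitudes."""
    return 2.0 * s * p / (s * s + p * p)


def rate(amplitude):
    """Squared length of an (s, p) amplitude vector."""
    s, p = amplitude
    return s * s + p * p


A2 = (1.0, 0.0)   # pure s wave (which decay is s and which is p is assumed)
A3 = (0.0, 1.0)   # pure p wave
A1 = tuple((x + y) / math.sqrt(2.0) for x, y in zip(A2, A3))

print("rates:", rate(A1), rate(A2), rate(A3))
print("alpha for decay 1:", asymmetry(*A1))
```

With equal rates and the second and third amplitudes along the axes, the first amplitude lies at 45° and its anisotropy coefficient is of unit magnitude, which is the prediction compared with the measured 0.73.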


Another test of the ΔI = ½ rule comes from the study of K → π + π.
The decay rate for K⁺ → π⁺ + π⁰ is suppressed relative to K₁⁰ → π + π by a
factor 500. Two pions in a symmetric spatial state must have either I = 0 or
I = 2. The wave function for π⁺ + π⁰, with the spin-parity assignment 0⁺, must
therefore be I = 2, whereas a neutral state of two pions can have both I = 2
and I = 0. Since the K meson has I = ½, the ΔI = ½ rule predicts that the
decay K⁺ → π⁺ + π⁰ is severely suppressed. Also, the ΔI = ½ rule predicts
that the branching ratio

    (K₁⁰ → π⁺ + π⁻) / (K₁⁰ → π⁰ + π⁰) = 2/1.

Experimentally, the branching fraction of the K₁⁰ into π⁰ + π⁰ is either
0.26 ± 0.02 or 0.335 ± 0.014 depending on which experiment you choose to believe.
From the observed rate of K⁺ → π⁺ + π⁰, we know the amplitude for decay of the
K into the I = 2 state. If we take into account the possible interference of
the I = 2 with the I = 0 amplitude in K decays, we find that the ratio
(K₁⁰ → π⁰ + π⁰)/(K₁⁰ → π + π) need not be exactly 1/3 but may be somewhere
between 0.28 and 0.38 (depending on the phase of the interference). Interference
effects are thus quite large.
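The size of the interference band can be illustrated numerically. This is a sketch under stated assumptions: the two-pion amplitudes are decomposed with the standard isospin coefficients √(2/3) and √(1/3), the I = 0 amplitude is normalized to 1, and the relative size `eps` of the I = 2 amplitude is a hypothetical input chosen by hand, not a number taken from the lectures.

```python
import math

def neutral_fraction(eps, phase):
    """Fraction of K1 -> 2 pi decays that go to pi0 pi0.

    eps   : assumed magnitude of the I = 2 amplitude relative to I = 0
    phase : assumed relative phase of the two amplitudes, in radians
    """
    a2 = eps * complex(math.cos(phase), math.sin(phase))
    amp_charged = math.sqrt(2.0 / 3.0) + math.sqrt(1.0 / 3.0) * a2
    amp_neutral = math.sqrt(1.0 / 3.0) - math.sqrt(2.0 / 3.0) * a2
    charged = abs(amp_charged) ** 2
    neutral = abs(amp_neutral) ** 2
    return neutral / (charged + neutral)

pure = neutral_fraction(0.0, 0.0)      # pure Delta I = 1/2: one third

eps = 0.053                            # illustrative value only
low = neutral_fraction(eps, 0.0)       # destructive in the neutral mode
high = neutral_fraction(eps, math.pi)  # constructive in the neutral mode
print(pure, low, high)
```

An I = 2 admixture of only about five percent in amplitude is enough to move the neutral fraction across a band of roughly the quoted width, because the effect enters linearly through the interference term.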


Finally, consider the decays of the K into three pions. Experimentally
it is known that the Dalitz plot is almost flat, so that all three pions
appear to be in S states. Thus, the spatial wave function is totally symmetric.
By use of the generalized Pauli principle, the isotopic spin-wave function
must also be totally symmetric. Therefore only I = 1 and I = 3 are possible.
There is only one totally symmetric isospin wave function compatible with the
ΔI = ½ rule, and that function has I = 1. The isotopic factors in the rates
are given in the following table:

    Decay                        Weight
    K₂⁰ → π⁰ + π⁰ + π⁰            3
    K₂⁰ → π⁺ + π⁻ + π⁰            2
    K⁺ → π⁺ + π⁰ + π⁰             1
    K⁺ → π⁺ + π⁺ + π⁻             4


The ratio of the first two, 3/2, and the last two, ¼, are correct even if
ΔI = 3/2 is admitted (for only ΔI = 5/2 can reach I = 3). But the relation
of the K⁺ rates to the K₂⁰ rates depends explicitly on the ΔI = ½ assumption.
Since the available kinetic energy in these three pion decays is small, it
is important to include also the change in phase space due to the fact that
the charged pion is heavier than the neutral one. The data, then, are in
accord with predictions, but the tests are not yet very stringent. For
example, the ratio of the last two rates, given here as ¼, is changed to
0.325 by phase-space factors. Experimentally, the ratio is 0.30 ± 0.04.
Phase space predicts the ratio of the first two should be 1.89 (instead of
3/2); experimentally it is 1.6 ± 0.6.


Up to the present point there could have been a mixture of ΔI = ½
and ΔI = 3/2. If there were, then the K₂⁰ rates would not be determined
relative to the K⁺ rates. If the ΔI = ½ rule is assumed then all four rates
are related in the proportion given by the weights in the above table.
Another way of putting this is that from the known rate K⁺ → π + π + π, one can
predict the rate for K₂⁰ → π⁺ + π⁻ + π⁰. This turns out to be (3.1 ± 0.2) × 10⁶/sec.
There are two experiments giving (2.9 ± 0.1) × 10⁶/sec and (2.3 ± 0.5) × 10⁶/sec;
both are thus consistent with the ΔI = ½ hypothesis.
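The way the table weights turn a K⁺ rate into a K₂⁰ prediction can be sketched as follows. The K⁺ lifetime and the 5.7% branching fraction into π⁺ + π⁺ + π⁻ are taken from the data summary at the end of these lectures. The variable names are illustrative, and the phase-space correction mentioned in the text is deliberately left out, so the result is the uncorrected value.

```python
from fractions import Fraction

# Isotopic weights under the Delta I = 1/2 rule.
weights = {
    "K2 -> pi0 pi0 pi0": 3,
    "K2 -> pi+ pi- pi0": 2,
    "K+ -> pi+ pi0 pi0": 1,
    "K+ -> pi+ pi+ pi-": 4,
}

ratio_first_two = Fraction(weights["K2 -> pi0 pi0 pi0"], weights["K2 -> pi+ pi- pi0"])
ratio_last_two = Fraction(weights["K+ -> pi+ pi0 pi0"], weights["K+ -> pi+ pi+ pi-"])

# Inputs quoted in the data summary.
k_plus_lifetime = 1.224e-8        # sec
branching_pi_pi_pi = 0.057        # K+ -> pi+ pi+ pi-

rate_k_plus = branching_pi_pi_pi / k_plus_lifetime   # per sec

# Scale by the weights, ignoring the pion mass difference in phase space.
predicted_k2 = rate_k_plus * weights["K2 -> pi+ pi- pi0"] / weights["K+ -> pi+ pi+ pi-"]
print(ratio_first_two, ratio_last_two, predicted_k2)
```

The uncorrected number comes out near 2.3 × 10⁶/sec. The phase-space factors discussed above are what raise it toward the quoted prediction.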


In summary, the evidence for the ΔI = ½ rule is quite good. The
ΔI = 3/2 amplitudes are small, but not necessarily zero. There is no evidence
for the ΔI = 5/2 amplitude. The intriguing questions are:

a) whether the ΔI = ½ rule could be an exact symmetry of the weak
couplings, and

b) if this is not so, how this rule can become so prominent dynamically.


Attempts to deduce the ΔI = ½ rule                              6th LECTURE


In this lecture we shall present some of the attempts that have been
made to deduce the ΔI = ½ rule for the strangeness changing non-leptic decays.
Let me emphasize that this mystery comes just from the isotopic spin properties
of the J·J form assumed for the weak interactions, and is neither generated
nor solved by Cabibbo's choice for the weak interaction current. No answer
is known for this mystery. Therefore, we can only list some of the speculations
that have been raised and criticize them.


The first interesting question is, if we assume that the strangeness
changing non-leptic interaction is of the form

    (o p) cos θ sin θ

and calculate the amplitudes, then, are the ΔI = 3/2 amplitudes dynamically
suppressed or are the ΔI = ½ amplitudes dynamically enhanced, or both?
There is no doubt that in the op form both the ΔI = ½ and ΔI = 3/2 amplitudes
are present with roughly equal weights. However, the strong interactions
through renormalization effects will modify the relative weights of the ΔI = ½
and ΔI = 3/2 contributions. It could be that the mesonic corrections are of
such a form that the effective strength for the ΔI = ½ matrix elements is
enhanced relative to those for which ΔI = 3/2. In addition, we would also
like to distinguish between the two cases ΔI = ½ enhanced, or ΔI = 3/2
suppressed. The problem would be very easily answered if we could calculate
these renormalizations, but no reliable computation can be made. Therefore,
we cannot answer the question in a unique and direct fashion.
Nevertheless, I would like to report on two little calculations
which may or may not be significant for the question. The calculations do
represent an attempt to find the order of magnitude of the effects. You
may not wish to consider this line of flimsy reasoning; we are becoming very
uncertain about this matter; nevertheless, I shall present it. Let us
calculate the decay Λ → p + π⁻, presuming that the graph

    [Graph with a ΛpW vertex and a Wπ vertex]

dominates the amplitude. Both vertices can be obtained from experiment.
The ΛpW vertex comes from the experimental determination of parameters in
the Cabibbo theory of the leptic decays, whereas the Wπ vertex comes from
the known decay rate of the π. If one does this calculation, one finds both
ΔI = ½ and ΔI = 3/2 terms of equal magnitude and one gets an estimate for
the rate of 8 × 10⁷/sec. The reason that one gets both ΔI = ½ and ΔI = 3/2
terms is that there is no corresponding diagram for the decay Λ → n + π⁰.
We now make the assumption that this rate gives a good estimate of the
strength of the ΔI = 3/2 terms in the amplitudes. Now, if we compare this
estimate with the actual rate of 2.4 × 10⁹/sec, we see on the basis of this
argument that the ΔI = ½ amplitude must be considerably enhanced. However,
this cannot be the whole story. The ratio of the two rates is one to
thirty. That means the amplitude for the ΔI = 3/2 is √(1/30) = 0.18, but we
already know from the discussion of the experimental validity of the ΔI = ½
rule that the ΔI = 3/2 amplitude is less than 0.045 in magnitude. (There is
a way out, perhaps, if the ΔI = 3/2 amplitude violates CP. Then there would
be no interference term, so that a ΔI = 3/2 amplitude this large would not
contradict the experimental branching ratio.) The result of this argument
is that the ΔI = ½ term is enhanced and the ΔI = 3/2 term is cut down.
However, since the amplitudes from this calculation are off by a factor of
five, one way or the other, it is difficult to pretend that this calculation
means very much.
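The numbers in this estimate are easy to reproduce. The sketch below uses only the three figures quoted in the text (the estimated rate, the observed rate, and the bound on the ΔI = 3/2 amplitude). The variable names are illustrative.

```python
import math

estimated_rate = 8e7     # per sec, from the single-graph estimate
observed_rate = 2.4e9    # per sec, measured rate for Lambda -> p + pi-
amplitude_bound = 0.045  # experimental bound on the relative 3/2 amplitude

rate_ratio = estimated_rate / observed_rate   # one to thirty
amplitude_ratio = math.sqrt(rate_ratio)       # relative 3/2 amplitude implied
excess = amplitude_ratio / amplitude_bound    # how far above the bound it sits
print(1.0 / rate_ratio, amplitude_ratio, excess)
```

The implied amplitude of about 0.18 overshoots the experimental bound by a factor of about four, which is why the text concludes that the ΔI = 3/2 term must also be cut down.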


Another case in which we can make a similar calculation is the
decay K⁺ → π + π. Consider the following diagram

    [Diagram containing the KπW vertex]

The KπW vertex can be calculated from the K → π + leptons decay rate and,
in a manner similar to the previous case, we find an estimate of 9 × 10⁷/sec,
whereas the experimental rate is 1.55 × 10⁷/sec. Thus, the theoretical
estimate must be cut down in reality. As explained previously, the K⁺
amplitude comes only from the ΔI = 3/2 terms, so that we have another hint
that the ΔI = 3/2 amplitudes are suppressed. Furthermore, the experimental
rate for K₁ → π + π is considerably greater than what one could get from such
an estimate. That seems to indicate that the ΔI = ½ amplitude is considerably
faster than one estimates from such calculations. Thus, it looks
like there is some other type of diagram that must dynamically enhance the
ΔI = ½ amplitudes.


There are two fundamentally different ways in which people have
tried to approach the ΔI = ½ rule. The first proposal is that ΔI = ½
is an exact rule. To make an exact ΔI = ½ rule we have to augment the
charged current-current interaction by adding neutral current terms for the
strongly interacting particles. If we add a term of the form (30n) with
the right amplitude to the J·J form, we then get a perfect ΔI = ½ coupling.
(We do not want to add neutral currents for the leptons because decays like
K → μ + e, K → ν + ν̄, and so on are not observed.) This theory results in
two predictions. First, that the ΔI = ½ rule is exact and secondly, that
some day a neutral weak-vector boson might be discovered. Thus, this theory
predicts no more than the ΔI = ½ rule as far as the weak interactions are
concerned.


Can the ΔI = ½ rule be perfect? Since the K⁺ does in fact disintegrate
into two pions, the ΔI = ½ rule seems to be violated. People
have long speculated that the K⁺ decay amplitude results from electromagnetic
corrections. If the process is electromagnetic, the rate should be down by
a factor (e²)² ~ 1/20000, but the observed rate is suppressed relative to the
K₁⁰ rate by a factor of only 1/500. The question of how to jack up the
electromagnetic correction has occupied the minds of many theorists but no
one has yet published an explanation that is satisfactory. Thus, no clarity
has resulted from the hypothesis that the weak interaction transforms exactly
like ΔI = ½.


The second point of view commonly adopted is that renormalization
effects are decidedly different for the ΔI = ½ and the ΔI = 3/2 amplitudes.
Many people have tried to argue that there are certain diagrams which are
enhanced considerably and which transform only like ΔI = ½. For example,
consider the o p cross-term (p̄Λ)(n̄p). From such a term there is a diagram
of the form

    [Loop diagram in which the Λ goes directly into the neutron]

This type of diagram is not defined in perturbation theory, but it certainly
may well be that such a term should be included in the correct expansion of the
amplitude and it might be that it has a large coefficient. You will note that
because the Λ goes directly into the neutron, this diagram gives only a
ΔI = ½ change. This explains why ΔI = ½ is so big. At the same time,
this argument does not say ΔI = 3/2 decays are impossible, but allows such
decays to proceed at a small rate with perhaps some renormalizations. However,
since we really cannot compute anything, all that such a theory does is
restate the experimental data in a different language. This theory thus
predicts nothing new.


It seems that the same ΔI = ½ loop diagrams would result from
Cabibbo's complete current. As an exercise you may write out all the terms
in the op form that lead to cannibalistic loop diagrams and check to see that
they all give ΔI = ½.


In the octet framework the proposition that ΔI = ½ terms dominate
for strangeness changing non-leptonic interactions has been generalized by
many people to the hypothesis that part of the weak Lagrangian transforms like
an octet in the eightfold way. In particular the strangeness changing piece
transforms like the 'n' member of the octet. Since the J·J form is the
symmetric product of two octets and the only symmetric multiplets contained in
the direct product are '1', '8' and '27', the hypothesis consists of completely
suppressing '27'. (Terms which transform as '1' do not lead to strangeness
changing decays, of course.) Are there any new rules from the assumption of
an octet other than those associated with ΔI = ½? Most unfortunately there
are no relations predicted among the reaction amplitudes that can be observed
experimentally. Some predictions have been made, but these were based, in
addition, on specific assumptions, e.g. there are no gradients in the coupling.
This, however, is not in the same spirit as a prediction based only on
symmetry principles.
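The multiplet content of the symmetric product can be confirmed by a dimension count. This is a small sketch: the symmetric list 1, 8, 27 is the one quoted in the text, while the antisymmetric list 8, 10, 10 is the standard SU(3) result, added here only for comparison.

```python
n = 8  # dimension of the octet

symmetric_dim = n * (n + 1) // 2       # symmetric part of 8 x 8
antisymmetric_dim = n * (n - 1) // 2   # antisymmetric part of 8 x 8

symmetric_multiplets = [1, 8, 27]      # as listed in the text
antisymmetric_multiplets = [8, 10, 10] # standard result, for comparison

print(symmetric_dim, sum(symmetric_multiplets))
print(antisymmetric_dim, sum(antisymmetric_multiplets))
```

Dropping the '27' leaves the singlet, which cannot change strangeness, and a single octet, which is why the hypothesis is called octet dominance.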


Benjamin Lee derived the relation

    −A(Λ → p + π⁻) + 2A(Ξ⁻ → Λ + π⁻) = √3 A(Σ⁺ → p + π⁰).

To derive this relation, however, he assumed R invariance in addition to SU₃.
(R invariance is the symmetry based on the transformation p → Ξ⁻, n → Ξ⁰,
Σ⁺ → Σ⁻ and so on. This is simply reflection through the centre of the octet.)
However, because the Ξ's and nucleons have such different masses, R invariance
is not in agreement with the facts of nature.


Cabibbo also concluded at one time that the K₁⁰ → π + π was forbidden
by the octet scheme, but this conclusion rested on the additional assumption
that there were no gradients in the coupling. Since there is no reason why
the matrix element should not involve gradients, the K₁⁰ is not forbidden to
decay into two pions.


It is rather interesting and disappointing that the assumption of
definite SU₃ transformation properties for the non-leptic weak interactions
has given us nothing. It seems that a qualitatively different idea is needed
to clear up the puzzle of the non-leptic decays. I shall close these lectures
with the hopeful speculation that some clever theorist may be able to tie
together the fact that CP seems to be violated in weak interactions, and the
fact that there are certain unexpected isotopic selection rules in the weak
interactions.


Weak decay data summary


Muon mass, 105.655 ± 0.01 MeV, and lifetime, (2.212 ± 0.001) × 10⁻⁶ sec,
determine the weak decay constant to be

    G = (1.0233 ± 0.0004) × 10⁻⁵ (ħc)³/(Mc²)²

(M = proton mass).
Nuclear β decay O¹⁴ → N¹⁴ (corrected for electromagnetic effects) implies
that the value of G for the vector part of the current is 0.980 ± 0.005 times G.
Other β-decay experiments, including neutron decay asymmetry and lifetime
of 1013 ± 29 seconds, suggest that the axial coupling is −1.26 times the
vector coupling.


π⁺ [Mass 139.59 ± 0.05; lifetime (2.55 ± 0.03) × 10⁻⁸ sec]. The branching ratio
(π → e + ν)/(π → μ + ν) agrees with predictions of theory to ± 2%. The
Goldberger-Treiman formula for amplitude π → μ + ν is too small by 8%.


K⁺ [Mass 493.9 ± 0.2; lifetime (1.224 ± 0.013) × 10⁻⁸ sec]. The branching
is 64% ± 4% into μ + ν; (19 ± 1)% to π⁺ + π⁰; (5.7 ± 0.3)% to π⁺ + π⁺ + π⁻;
(1.7 ± 0.2)% to π⁺ + π⁰ + π⁰; and (7.6 ± 1.1)% to leptons and π⁰. For decay
into π + leptons, K⁺ → π⁰ + μ⁺ + ν and K⁺ → π⁰ + e⁺ + ν, total rate is
(6.2 ± 0.9) × 10⁶ sec⁻¹. If the amplitude is written (p_K + p_π) + ξ(p_K − p_π),
then the spectrum of μ indicates that ξ is not large. Ratio of muon to
electron decay is: theoretical, 0.65 + 0.13ξ + 0.02ξ²; experimental,


0.70 ± 0.06. K⁺ → π⁺ + π⁻ + e⁺ + ν is (2.3 ± 0.7) × 10⁻⁵ of total disintegration.


K₁⁰ [Mass 497.8 ± 0.6, lifetime (1.00 ± 0.04) × 10⁻¹⁰ sec] goes into π⁰ + π⁰ or
π⁺ + π⁻. Branching ratio to π⁰ + π⁰ by two experiments is 26 ± 2% or
33.5 ± 1.4%.
K₂⁰ (Mass more than K₁ by about 0.8 ħ/τ(K₁); rate of decay (18 ± 2) × 10⁶
disintegrations per sec.) For decay into leptons π + μ + ν and π + e + ν
rate is (11.1 ± 1.2) × 10⁶ sec⁻¹. For decay into 3π's (π⁰ + π⁰ + π⁰ and
π⁺ + π⁻ + π⁰) the rate into charged pions is, by one experiment,
(2.9 ± 0.7) × 10⁶ sec⁻¹ and by another is given as 0.171 ± 0.023 times the
total rate into any charged products (therefore pions + leptons). There is
some evidence for decay into two pions π⁺ + π⁻, thereby violating CP
invariance at a branching ratio near 2 × 10⁻³.


HYPERONS

Hyperon non-leptic decays

[Table printed sideways in the original; its entries are not recoverable from
the scan. Legible column headings: particle, mass (MeV), lifetime (sec),
branching ratio. Legible footnote: s, p amplitudes for s and p waves.]


Hyperon leptic decays branching fraction 


UFI Theory Experiment


1°5x10~ 0.81+0.10x 10” 
0.2x 10° ~1x10~ 
2 -3 

ErtAtetryp 2.0x 10 (2 or 3)x10 
D+Nte +v 5.8x 10 1.374 0.34x 10> 
Neyo ty 2.6x 10 0.664 0.14x 10 
1.0x 107 0.07+ 0.03x 10“ 


0.6x 10~ 0.07+ 0.04x 10 


UFI means rate according to universal coupling constant
G times V − A. If actual coupling is G times s(V − tA)
then the rate is approximately s²/4 (1 + 3t²) times as
much.

Asymmetry and spectrum of Λ decay indicate that t is
approximately 1 for this decay.
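The rescaling rule quoted in the note can be written as a two-line helper. This is only a sketch of the stated approximation s²(1 + 3t²)/4. The function names are illustrative, and the ratio of 0.05 in the usage line is a made-up input rather than an entry from the table.

```python
import math

def rate_factor(s, t):
    """Rate relative to the UFI rate for a coupling G * s * (V - t*A)."""
    return s * s * (1.0 + 3.0 * t * t) / 4.0

def implied_s(ratio, t):
    """Invert rate_factor: the s needed to give a measured/UFI ratio at fixed t."""
    return math.sqrt(4.0 * ratio / (1.0 + 3.0 * t * t))

universal = rate_factor(1.0, 1.0)    # s = t = 1 recovers the UFI rate
s_example = implied_s(0.05, 1.0)     # hypothetical ratio, with t = 1
print(universal, s_example)
```

With t fixed near 1, as the note suggests for the Λ, a measured-to-UFI ratio translates directly into the strength factor s.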
V.B Partons – Quarks and Gluons


After the parity revolution, the great achievement in the field of elementary particle interac- 
tions was the discovery of the composite nature of hadrons. Perhaps the earliest suggestion 
was the Fermi—-Yang model of a pion composed of a nucleon—antinucleon pair, later elabo- 
rated and extended to strange particles by Sakata and his school by inclusion of the lambda 
hyperon.¹ A sophisticated approach to compositeness, which for a time was dominant in
theoretical physics, was the S-matrix bootstrap hypothesis, or “nuclear democracy,” i.e. the 
doctrine that all hadrons (baryons and mesons) are made of each other. Feynman was fol- 
lowing the development of S-matrix theory through dispersion relations, Regge poles, etc., 
and he learned the lessons they taught about compositeness, but he did not himself partici- 
pate in this approach. On the other hand, he played an important role in the growth of the 
strong interaction sector of the Standard Model, according to which hadrons are composed 
of fermionic quarks interacting through the exchange of bosonic gluons, the quanta of the 
color gauge field.²

Some of the theoretical methods pioneered by Feynman in his studies of QED, meson
theory, and quantum gravity, were important, if not essential, for proving the consistency 
of the gauge theories which are central to both the strong and electroweak sectors of the 
Standard Model — methods such as Feynman graphs, path integrals, ghost propagators, 
regularization, and renormalization.³ While Feynman was not otherwise involved in developing
the theory itself, his intuition and wisdom were very valuable to those who were, as
were his original insights into gauge theory. We have included a pedagogical account which 
illustrates this point in the form of lectures that he delivered at a summer school — paper 
[102]. 

Feynman’s other major contribution to the development and the acceptance of the Stan- 
dard Model was to analyze the relevant experiments, together with his students and asso- 
ciates at Caltech, and to compare them to theoretical predictions, which were extraordinarily 
difficult to calculate with accuracy. 

Being interested in particle physics at Caltech, Feynman could hardly fail to interact 
strongly with Murray Gell-Mann and George Zweig, who proposed the existence of the frac- 
tionally charged hadronic constituents which they, respectively, called quarks or aces. In 
terms of these hypothetical spin 1/2 objects they formulated the hadronic currents, intro- 
duced by Gell-Mann, through which the weak and electromagnetic interactions of hadrons 
are expressed. In paper [61], Feynman joined his colleagues in extending Gell-Mann’s algebra
of quark currents to the group U(6) ⊗ U(6). It should be noted that the study of
these symmetry groups was the major alternative approach to that of the bootstrap school; 
Gell-Mann himself was the leading figure in the symmetry school, although he had also been 
a pioneer in the study of dispersion relations and bootstrapping. 


¹E. Fermi and C.N. Yang, “Are mesons elementary particles?”, Phys. Rev. 76 (1949): 1739-1743; S. Sakata,
“On a composite model for the new particles,” Prog. Theor. Phys. 16 (1956): 686-688. Other composite 
models were proposed by M.A. Markov and by M. Goldhaber. 

²See, e.g., The Rise of the Standard Model, edited by L. Hoddeson, L.M. Brown, M. Riordan, and M. Dresden
(Cambridge: 1997). Referred to later as RSM. 

³See the theoretical articles in RSM, especially those of M. Veltman and G. ’t Hooft.
By 1968 a number of physicists had become believers in quarks and were making quark
models.⁴ Yoichiro Nambu had proposed the color gluon gauge field.⁵ Experimentalists working
at the Stanford Linear Accelerator Center (SLAC) were on the threshold of discovering
quarks.⁶ Nevertheless, the theory of quarks was sufficiently new and problematic (for
example, the nonobservation of fractionally charged particles and the “paradox” of quark
statistics, in the absence of color) that many physicists (including Gell-Mann, but not Zweig)
doubted their reality.⁷

A SLAC experimentalist, Michael Riordan, author of a popular history of the quark 
discovery, wrote that in June of 1968 Feynman “began thinking of each hadron as a collection 
of smaller parts or entities, which, for lack of any better name, he dubbed simply ‘partons’... 
In this picture the chance of two hadrons colliding was just the sum of the chances of any two 
of their partons colliding.”⁸ Riordan reported that when Feynman visited SLAC in August
of that year and saw the experimental results on deep inelastic scattering, “Feynman had an 
epiphany” when he realized that the observations measured “in some way the momentum 
distribution of his partons!”⁹ James Bjorken, a theorist at Stanford who advised the SLAC
experimenters, had deduced that the unexpectedly large deep inelastic electron cross sections 
(more like the collision with pointlike charges in the proton than the expected diffuse cloud 
of mesons) were related to a phenomenon called “scaling,” which meant that the results 
depended on a particular combination of two scattering parameters (energy and momentum 
transfer), rather than on each independently. However, the theoretical reasoning to arrive at 
this conclusion was difficult for experimentalists to grasp. Thus the explanation of scaling 
in terms of Feynman’s partons soon became the accepted one.¹⁰

Papers [75] and [76] deal with very high energy collisions of baryons — say, two protons — 
analyzed in their center-of-momentum system, with [75] being a much condensed version of 
[76]. In these papers, Feynman defined two types of measurements which can be made on the 
results of these collisions: “exclusive,” in which all the outgoing momenta are observed, and 
“inclusive,” in which one or a few measurements are made. Because of the high multiplicities 
which generally occur in these collisions, most of the observations are inclusive and Feynman 
discussed what can be learned on the basis of such partial observations. His analysis was 
based on his parton picture, which is a kind of abstraction from quantum field theory. He 
described his approach as follows:¹¹


It is not that I believe that the observed high-energy phenomena are necessarily a
consequence of field theory. Even less do I know what specific field theory could yield
them. But, rather, I believe that they share some of the properties of field theory,
so they might share others... What field theory shall I choose? What shall be the
fundamental bare particles that the theory begins with? I do not know and perhaps I
do not care. I shall try at first to get results which are more general and characteristic
of a wide class of theories and which can be stated in a way independent of the field
theory which served as a logical crutch for their discovery.


⁴Harry Lipkin, “Quark models and quark phenomenology,” in RSM, pp. 542-560.

⁵Y. Nambu, “A systematics of hadrons in subnuclear physics,” in Preludes in Theoretical Physics, edited by
A. de Shalit, H. Feshbach, and L. van Hove (Amsterdam: 1966).

⁶For the discovery of quarks, see the 1990 Nobel Lectures of Richard E. Taylor, Henry W. Kendall, and
Jerome I. Friedman (general title: Deep Inelastic Scattering), Rev. Mod. Phys. 63 (1991): 573-595, 596-614,
615-627, respectively.

⁷For the rejection by many theorists of the real existence of quarks, see especially pp. 598-600 of Kendall’s
lecture (note 6).

⁸Michael Riordan, The Hunting of the Quark (New York: 1987), p. 149.

⁹Riordan (note 8), p. 150.

¹⁰However, Feynman did not publish this explanation until 1972, in item [89]. Bjorken has discussed scaling
in “Deep-inelastic scattering: from current algebra to partons,” RSM, pp. 589-599. This article has two
sections headed respectively “Before Feynman” and “After Feynman,” referring to the effect of the visit to
SLAC mentioned above.

¹¹Paper [76], p. 240.


Papers [83], [90], and [95] are explicitly about the parton theory and its applications 
to deep probes of hadronic matter and to hadronic collisions. The same is true of most 
of Feynman’s 1972 book Photon-Hadron Collisions ([89], not included in the present volume).
In paper [90] he extends his analysis of deep inelastic scattering using neutrinos to
probe the nucleon, rather than electrons. (Of course, in the former case the probes are 
virtual massive vector bosons, and in the latter they are virtual photons.) Each time that 
Feynman lectured he was questioned about whether “parton” was just another name for the 
quark. Feynman always answered that it was for experiment to decide whether the funda- 
mental constituents carried fractional charge or not. Actually, he liked to point out, the 
number of fermionic partons was infinite because of the “sea” of virtual pairs which were 
effective at high excitation energies. In addition, the experiments showed that about half 
of the longitudinal momentum (in the center-of-momentum system) carried by the proton
was in the form of neutral partons, which could be the postulated gluons. The possible 
relation of quarks and gluons to partons is discussed in some detail in the closing chapters of 
[89]. 

When we come to papers [103] and [104], the language of partons has been replaced by 
that of quarks and gluons. The theory that is used is quantum chromodynamics (QCD), 
and the fundamental collision processes include not only quark—quark, but also quark-gluon, 
and even gluon-gluon, scattering. Paper [104], with Rick D. Field and (the later-submitted) 
paper [103], with Field and Geoffrey C. Fox, are theoretical studies of the production of 
quark and gluon jets, the latter made before the accumulation of relevant data.¹² These jets
arise from cascade processes in which each quark and gluon produced in a collision turns into 
a rapid stream (i.e. a jet) of hadrons. The analyses of Feynman and his associates exploit the
important property of QCD called asymptotic freedom (the vanishing of interaction between 
quarks having high relative momentum). 


Selected Papers 

[61] With M. Gell-Mann and G. Zweig. Group U(6) x U(6) generated by current components. 
Phys. Rev. Lett. 13 (1964): 678-680. 

[75] Very high-energy collisions of hadrons. Phys. Rev. Lett. 23 (1969): 1415-1417. 

[76] The behavior of hadron collisions at extreme energies. In High Energy Collisions. 
London, Gordon and Breach (1969), pp. 237-256. 

[83] Partons. In The Past Decade in Particle Theory. London, Gordon and Breach (1971), 
pp. 773-813. 

[95] Partons. In Proc. of the 5th Hawaii Topical Conference in Particle Physics. Honolulu, 
Univ. Press of Hawaii (1974), pp. 1-97. 

[102] Gauge theories. In Weak and Electromagnetic Interactions at High Energy.
Amsterdam, North-Holland (1977), pp. 121-204.

[103] With R.D. Field and Geoffrey Fox. A quantum-chromodynamic approach for the
large-transverse-momentum production of particles and jets. Phys. Rev. D18 (1978): 3320-3343.

[104] With R.D. Field. A parameterization of the properties of quark jets. Nucl. Phys.
B136 (1978): 1-76.


¹²For gluon jets, see San Lan Wu, “Hadron jets and the discovery of the gluon,” in RSM, pp. 600-621.
GROUP U(6)⊗U(6) GENERATED BY CURRENT COMPONENTS*


R. P. Feynman, M. Gell-Mann, and G. Zweig†
California Institute of Technology, Pasadena, California
(Received 2 November 1964)

It has been suggested!~° that the equal-time
commutation rules of the time components of
the vector and axial-vector current octets
($\mathcal{F}_{i\alpha}$ and $\mathcal{F}_{i\alpha}^5$, respectively) are the same as
if these currents had the simple form $\mathcal{G}_{i\alpha}$ and
$\mathcal{G}_{i\alpha}^5$ defined as follows:

$$\mathcal{G}_{i\alpha} = \tfrac{1}{2}\, i\, \bar q \lambda_i \gamma_\alpha q, \qquad
\mathcal{G}_{i\alpha}^5 = \tfrac{1}{2}\, i\, \bar q \lambda_i \gamma_\alpha \gamma_5 q, \tag{1}$$

where q is an SU(3) triplet with spin ½, for example,
the quarks? or aces.* Here the matrices
$\lambda_i$ (i = 1, ···, 8) are the SU(3) analogs of the Pauli
matrices, as defined in reference 1. The operators

$$F_i(t) = -i \int \mathcal{F}_{i4}\, d^3x, \qquad
F_i^5(t) = -i \int \mathcal{F}_{i4}^5\, d^3x \tag{2}$$

then generate at equal times the algebra of
SU(3)⊗SU(3), which may be a very approximate
symmetry of the strong interactions, '»? while
the $F_i$ generate a subalgebra corresponding
to SU(3), which is a fairly good symmetry of
the strong interactions.

We now propose to extend these considerations
to the space components of the currents
as well. First we define’~° a ninth λ matrix
$\lambda_0 = (2/3)^{1/2}\,\mathbf{1}$ and a corresponding ninth pair of
currents $\mathcal{F}_{0\alpha}$ and $\mathcal{F}_{0\alpha}^5$ (where $\mathcal{F}_{0\alpha}$ would be
√6 times the baryon current in a true quark
or ace theory). We then assume that the equal-time
commutation relations of all the 72 components
of the $\mathcal{F}_{i\alpha}$ and $\mathcal{F}_{i\alpha}^5$ (i = 0, ···, 8) are the
same as those of the $\mathcal{G}_{i\alpha}$ and $\mathcal{G}_{i\alpha}^5$, at least as
far as terms proportional to the spatial δ function
are concerned. (There are also, in general,
terms? involving gradients of the δ function,
which vanish on space integration and
which we ignore here.) The system of $\mathcal{G}_{i\alpha}$
and $\mathcal{G}_{i\alpha}^5$ is closed under equal-time commutation,
and the space integrals $\int \mathcal{G}_{i\alpha}\, d^3x$ and
$\int \mathcal{G}_{i\alpha}^5\, d^3x$ generate the algebra of U(6)⊗U(6).
Our assumption thus implies that $\int \mathcal{F}_{i\alpha}\, d^3x$ and
$\int \mathcal{F}_{i\alpha}^5\, d^3x$ also generate the algebra of U(6)⊗U(6).
We assume further that this algebra
is a very approximate symmetry of the strong
interactions.

We now exhibit some of the structure of the
algebra by looking at the $\mathcal{G}_{i\alpha}$ and $\mathcal{G}_{i\alpha}^5$. We
note that the space integrals of the densities

$$-2i\, \mathcal{G}_{i4} = q^\dagger \lambda_i q \qquad (i = 0, \cdots, 8)$$

and

$$-2\, \mathcal{G}_{in}^5 = q^\dagger \lambda_i \sigma_n q \qquad (n = 1, 2, 3)$$

generate the subalgebra corresponding to U(6);
the same is then true of the corresponding
components of the $\mathcal{F}$'s. We may refer to the
algebra of the space integrals of these $\mathcal{F}$ components
as the A spin, with generators $A_r$
(r = 0, 1, ···, 35). Now the space integrals of
the densities $q^\dagger \lambda_i \tfrac{1}{2}(1+\gamma_5) q$ and $q^\dagger \lambda_i \sigma_n \tfrac{1}{2}(1+\gamma_5) q$
also generate a group U(6), and so do the corresponding
terms with $\tfrac{1}{2}(1-\gamma_5)$. The corresponding
integrals of $\mathcal{F}$ components thus give
a left-handed A spin $A_r^+$ and a right-handed
A spin $A_r^-$, respectively, with

$$A_r = A_r^+ + A_r^- \qquad (r = 0, 1, \cdots, 35). \tag{3}$$

Those 36 components of $\mathcal{F}_{i\alpha}$ and $\mathcal{F}_{i\alpha}^5$ (out of
a total of 72) that are the densities of the $A_r$
do not go just into themselves under Lorentz
transformations, but yield instead the complete
system of 72 components of the $\mathcal{F}_{i\alpha}$ and
$\mathcal{F}_{i\alpha}^5$, which form the densities of $A_r^+$ and $A_r^-$.
We have assumed above that the $A^+$ and $A^-$
spins are separately very approximate symmetries
of the strong interactions. We may now
add the further assumption that the total A
spin is a good symmetry, nearly as good as
the subset that constitutes the F spin. This
approximate conservation of A spin is then
our way of describing the success achieved
by the SU(6) symmetry of Gürsey and Radicati,®
Sakita,’ and Zweig,® treated further in
a series of recent Letters.°~'* In reference
10, our interpretation of the symmetry is
hinted at, but otherwise it is described in different
language, which does not make clear
the physical identification of the symmetry
operators with integrals of components of the
vector and axial-vector currents occurring
in the weak and electromagnetic interactions.
Also, the Lorentz-complete system, obeying
the commutation rules of U(6)⊗U(6), is not
given.

In a relativistic situation, where a state like $\rho$ exists part of the time as $2\pi$, part of the time as $N + \bar{N}$, part of the time as $\Lambda + \bar{\Lambda}$, etc., with a different set of channel spins in each case, it is evidently not sufficiently specific to talk of "spin independence" of strong interactions. In contrast, our statement in terms of the approximate conservation of the Gamow-Teller operator $\int \mathcal{F}_{in}^5\,d^3x$ ($n = 1, 2, 3$) does have a definite meaning.

One set of consequences of our approach is that the Gamow-Teller matrix elements within an SU(6) supermultiplet can be exactly computed in the limit of SU(6) symmetry. We adopt the assignments of the $J^P = \tfrac{1}{2}^+$ baryon octet and $J^P = \tfrac{3}{2}^+$ baryon decimet to the SU(6) representation 56, and the assignment of the vector-meson octet and singlet and the pseudoscalar octet to the representation 35; these assignments have explained at least six well-known facts.¹⁵ The axial-vector strength, within the baryon octet, comes out to be $1(D) + \tfrac{2}{3}(F)$; for the nucleon, this gives $(-G_A/G_V) = 5/3$, as indicated in reference 10, to be compared with an observed value more like 1.2. The agreement is fair, as is the agreement of the D/F ratio with the results on leptonic hyperon decays. The matrix elements of the Gamow-Teller operator between octet and decimet are also exactly specified in the limit of SU(6) symmetry and can be checked by neutrino experiments.

Let us now go on to discuss the badly broken symmetry U(6)⊗U(6), which bears about the same relation to U(6) symmetry as the U(3)⊗U(3) symmetry generated by the time components of vector and axial-vector currents bears to the eightfold way. On the way from the full U(6)⊗U(6) down to U(3), we could pass through U(6) or through U(3)⊗U(3) symmetry as an intermediate stage; these are alternatives in somewhat the same way as are L-S and j-j coupling in atomic physics. It seems that the operators of U(6), all of which have nonrelativistic limits, form a much better symmetry system than those of U(3)⊗U(3); hence, the useful procedure is to go from U(6)⊗U(6) to U(6), and then to U(3) and U(2). [Actually U(6) is not much worse than U(3).]

The baryons are presumed to have zero mass in the limit of U(6)⊗U(6) symmetry, as in the limit of U(3)⊗U(3) symmetry. The perturbation that reduces the symmetry to U(6) is assumed to transform like (6, 6*) and (6*, 6) under ($A^+$, $A^-$), and like 1 under A. Thus it transforms like a common quark mass term $\bar{q}q$, which takes a left-handed q going like (6, 1) into a right-handed q going like (1, 6), and vice versa.¹⁶ The $J^P = \tfrac{1}{2}^+$ octet and $J^P = \tfrac{3}{2}^+$ decimet belonging to 56 can be placed either in (1, 56) and (56, 1), or in (6, 21) and (21, 6), if we restrict ourselves to representations that transform like 3q. The latter is very attractive, because it splits into a 56 and a 70, where the masses to first order in the perturbation are in the ratio 1:−2; as in reference 3, we must interpret negative mass as positive mass with negative parity, and so we are led to a 56 with unit mass and a 70 with opposite parity and roughly twice the mass. The 70 contains a $\tfrac{1}{2}^-$ octet, a $\tfrac{1}{2}^-$ singlet, a $\tfrac{3}{2}^-$ octet, and a $\tfrac{1}{2}^-$ decimet. Thus the prediction of reference 3 that the $\tfrac{1}{2}^+$ octet is accompanied by a $\tfrac{1}{2}^-$ singlet of roughly twice the mass is contained in our present result. The $\tfrac{3}{2}^-$ octet has probably been seen [including N(1512)], but the $\tfrac{1}{2}^-$ octet and decimet have not so far been identified.

In the limit of U(6)⊗U(6) symmetry, the vector and pseudoscalar mesons of the 35 can be put into either of two pairs of representations that transform like $\bar{q} + q$. The mesons could go like (35, 1) and (1, 35), or else like (6, 6*) and (6*, 6). If they belong to the adjoint representation pair (1, 35) and (35, 1), as the current components do, then the usual 35 is accompanied by another 35, consisting of a normal axial-vector octet and singlet and an abnormal scalar octet. [Here, "normal" means that the Y = 0, I = 0 member of an axial vector, scalar, or pseudoscalar SU(3) multiplet is even under charge conjugation; "abnormal" means it is odd.] If the mesons belong to (6, 6*) and (6*, 6), then the usual 35 is accompanied by a 1 (a normal pseudoscalar singlet), another 1 (a normal scalar singlet), and a 35 consisting of an abnormal axial-vector octet and singlet and a normal scalar octet. In either case, the perturbation that reduces U(6)⊗U(6) to U(6) does not split the mesons into U(6) multiplets in first order; in second order, they are split. The assignment to (6, 6*) and (6*, 6) is appealing because the pseudoscalar singlet could be identified with (960), the scalar octet may include $\kappa$(725), and the abnormal axial octet may include the meson at about 1220 MeV with J = 1 that decays into $\pi + \omega$.


*Work supported in part by the U. S. Atomic Energy Commission. Prepared under Contract No. AT(11-1)-68 for the San Francisco Operations Office, U. S. Atomic Energy Commission.

†Work partially supported by the U. S. Air Force
¹⁵The ratio of p and n magnetic moments, the relation of octet and decimet spacing, the initial degeneracy of $\phi$ and $\omega$, the amount of mixing of $\phi$ and $\omega$, the equality $m_{K^*}^2 - m_\rho^2 = m_K^2 - m_\pi^2$, and the absence of appreciable mixing between $\eta$ and (960).

¹⁶At the end of reference 3, it is suggested that perhaps the perturbation in the energy density that reduces U(3)⊗U(3) symmetry to U(3) symmetry generates, together with the algebra of U(3)⊗U(3), a small algebra, which could be that of U(6). Such a use of U(6) is not the same as the use we are discussing in this Letter. However, we might consider the analogous possibility that the perturbation in the energy density that reduces U(6)⊗U(6) to U(6) generates, together with the algebra of U(6)⊗U(6), the algebra of U(12), corresponding to all unitary transformations on the four Dirac components and the three unitary-spin components of a quark field. Even if this is true, of course, U(12) need not be a useful symmetry of strong interactions.
VERY HIGH-ENERGY COLLISIONS OF HADRONS


Richard P. Feynman 
California Institute of Technology, Pasadena, California 
(Received 20 October 1969) 


Proposals are made predicting the character of longitudinal-momentum distributions in hadron collisions of extreme energies.


Of the total cross section for very high-energy 
hadron collisions, perhaps 4 is elastic and 10% 
of this is easily interpreted as diffraction disso- 
ciation. The rest is inelastic. Collisions involv- 
ing only a few outgoing particles have been care- 
fully studied, but except for the aforementioned 
elastic and diffractive phenomena they all fall off 
(probably as a power of the energy at high ener- 
gy). The constant part of the total inelastic cross 
section cannot come from them. And we know 
that at such energies, the majority of collisions 
lead to a relatively large number of secondaries 
(perhaps the multiplicity increases logarithmi- 
cally with energy). These collisions have not 
been studied extensively because, with the large 
number of particles, so many quantities or com- 
binations of quantities can be evaluated that one 
does not know how to organize the material for 
analysis and presentation. 

It is the purpose of this paper to make suggestions as to how these cross sections might behave so that significant quantities can be extracted from data taken at different energies. These suggestions arose in theoretical studies from several directions and do not represent the result of consideration of any one model. They are an extraction of those features which relativity and quantum mechanics and some empirical facts¹ imply almost independently of a model. I have difficulty in writing this note because it is not in the nature of a deductive paper, but is the result of an induction. I am more sure of the conclusions than of any single argument which suggested them to me, for they have an internal consistency which surprises me and exceeds the consistency of my deductive arguments which hinted at their existence.

Only the barest indications of the logical bases of these suggestions will be given here. Perhaps in a future publication I can be more detailed.²

Supposing that transverse momenta are limited in a way independent of the large z-component momentum of each of the two oncoming particles in the center-of-mass system (so $s = 4W^2$), an analysis of field theory in the limit of very large W suggests the appropriate variables to use for the various outgoing particles in comparing experiments at various values of W in the c.m. system. They are the longitudinal momentum $P_z$ in ratio to the total available W, i.e., $x = P_z/W$, and the transverse momenta Q in absolute units. Differential cross sections for $dx\,d^2Q$ of the various outgoing particles will then have simple properties as a function of W. Negative x means particles with $P_z$ negative.

First we must distinguish exclusive and inclusive experiments. In exclusive experiments, we ask that certain particles, with given x and Q, be formed, and no others. An example is a two-body charge-exchange reaction. A typical exclusive reaction is

    $A + B \to \sum_{i=1}^{n} C_i + \sum_{j=1}^{n'} D_j,$

where A is to the right, B to the left, and $C_1, C_2, \ldots, C_n$ are definite particles with definite Q's and x's all moving to the right (x > 0), whereas the $D_1, D_2, \ldots, D_{n'}$ are moving to the left (x < 0). The cross section should then vary, for sufficiently high W, as $s^{2\alpha(t)-2}$ or $(W^2)^{2\alpha(t)-2}$, where $\alpha(t)$ is the $\alpha$ of the highest Regge trajectory capable of converting the quantum numbers of A into those of the sum of the C's, and t is the transverse momentum difference of A and the sum of the C's.

This is an evident expectation from Regge theory and should be as approximately valid as that theory is for two-body reactions. The additional point here is the clarification of using the variable x to compare experiments at various energies. If no unitary quantum number is required to be exchanged, so the C group has the same quantum numbers as A, this C group can arise from A via diffraction dissociation and the cross section should approach a constant ratio to the elastic cross section at this value of t (i.e., it should be constant if the elastic cross section is).

Next, an inclusive experiment is one in which we look for special particles with x, Q in the final state but we allow anything else to be produced also. An example is a measurement of the mean number of $K^+$'s produced with given Q, x in a pp reaction. Such cross sections should approach a constant as $W \to \infty$.

How can these be reconciled? Why does the cross section fall if, for example, in a two-body reaction we must exchange 3-component of isotopic spin? Because under such circumstances, the current of 3-component isospin must suddenly reverse from right moving to left moving. Thus if any fields are connected to such currents as sources, they would be expected to radiate (in a manner analogous to bremsstrahlung). To be an exclusive experiment (say, pure two-body), we require that no such radiation occur, a condition becoming more and more difficult to satisfy as the energy rises and the current reversal is sharper.

This leads us to expect, in the majority of collisions, many particles over a wide range of x, but their characteristics for the smaller values of x are easy to envision. By Lorentz transformation, the fields to be radiated are becoming narrower and narrower in the z direction as W rises. The energy in this field is therefore distributed in a $\delta$ function in z. Fourier analyzed, this means that the field energy is uniform in momentum, $dP_z$. Since each particle of mass $\mu$ carries energy $E = (\mu^2 + P_z^2 + Q^2)^{1/2}$, if we suppose that the field energy is distributed among the various kinds of particles in fixed ratios (independent of energy W), we conclude that the mean number of particles of any kind and of fixed Q is distributed as $dP_z/E$ for not too large x. That is, the probability of finding, among all the emitted particles, a particle of kind i, transverse momentum Q, and mass $\mu_i$, is of the form $f_i(Q, P_z/W)\,dP_z\,d^2Q/(\mu_i^2 + Q^2 + P_z^2)^{1/2}$, where $f_i(Q, x)$ is ultimately independent of W and has a limit $F_i(Q)$ for small x. As $W \to \infty$, for any finite x, $dP_z/E$ becomes $dx/x$, of course.

Because of this dx/x behavior, the mean total number (or "multiplicity") of any kind of particle rises logarithmically with W. We need not decide what are "primarily emitted units" and what are secondaries arising from their decay, for the results so far stated are in a form that does not depend on that. If we imagine some primary independently emitted units, however, their number $\bar{n}$ would also rise logarithmically with energy, and the probability that none of them would be emitted might be $e^{-\bar{n}}$ (as suggested by a Poisson distribution), which would then fall as a power of the energy, accounting for the Regge expressions which we are supposing are valid for such exclusive collisions.
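The logarithmic rise and the resulting power-law fall can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names, the flat normalization `c`, and the transverse-mass value of 0.35 GeV are hypothetical choices made here, not quantities fixed by the text. It integrates $dP_z/E$ over $-W \le P_z \le W$ in closed form and evaluates a Poisson no-emission probability.

```python
import math

def mean_multiplicity(W, m_t=0.35, c=1.0):
    """Mean number of particles for a flat distribution c * dPz / E.

    E = sqrt(m_t**2 + Pz**2); the integral over -W <= Pz <= W has the
    closed form 2 * asinh(W / m_t), which grows like 2 * ln(2 W / m_t).
    m_t (transverse mass, GeV) and c are illustrative assumptions.
    """
    return c * 2.0 * math.asinh(W / m_t)

def prob_no_emission(W, m_t=0.35, c=1.0):
    """Poisson probability that none of the units is emitted: exp(-n)."""
    return math.exp(-mean_multiplicity(W, m_t, c))

if __name__ == "__main__":
    for W in (10.0, 100.0, 1000.0):
        print(W, mean_multiplicity(W), prob_no_emission(W))
```

With these assumptions each factor of 10 in W adds about 2c ln 10 to the multiplicity, so the no-emission probability falls roughly as $W^{-2c}$, a power of the energy as the paragraph above argues.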

We can extend this idea to other amplitudes which involve a similar $\bar{n}$. In particular, we find that the probability that $A + B \to C$ + anything should vary as $(1 - x_C)^{1-2\alpha(t)}\,dx_C$, where C is moving to the right with almost all the momentum of A (that is, for $1 - x_C$ small). Here $\alpha(t)$ is the highest trajectory (excluding the Pomeranchukon) which could carry off the quantum numbers (and squared momentum transfer t) needed to change A to C.

Thus the Regge trajectory function $\alpha(t)$ appears not only in an interaction (as $s^{2\alpha-2}$) but also in an emission process, reminiscent of the close relationship of virtual interaction and real emission that Yukawa emphasized.

VOLUME 23, NUMBER 24

Finally, for those special reactions which are partially exclusive, in which anything can be emitted except that a hadron must be transferred from the right-moving to the left-moving system (carrying its fermionic and half-integral spin character), the cross section should vary as 1/s. Of this last conclusion, I am less sure than of the others.


¹I have taken the approximate existence of Regge poles, the constant total cross sections, and the constant transverse-momentum distributions as empirical facts.

²For a somewhat more detailed description, see R. P. Feynman, in Proceedings of the Third Topical Conference on High Energy Collisions of Hadrons, Stony Brook, N. Y., 1969 (to be published).
high energy hadron scattering. Firstly, most theoretical inventions are based on analysis of simple collisions, in which only a small number of particles come out. But it is at once realized that questions of unitarity, the asymptotic behavior for high energy in dispersion integrals, etc., require some ansatz be made for the higher energy collisions, in order to close the infinite hierarchy of equations which result. Secondly, experiments at high energies usually yield many particles, and only by selecting the rare collision can we find those about which the theorist has been speaking. For the highly multiple inelastic collisions (to which the major part of the inelastic cross section is due) so many variables are involved that it is not known how to organize or present this data. Any theoretical suggestion (even if it proves to be not quite right) suggests a way that this vast amount of data may be analyzed. For this reason I shall present here some preliminary speculations on how these collisions might behave even though I have not yet analyzed them as fully as I would like.

Hadron phenomena possess a number of remarkably simple properties. Besides the well-known agreements with relativity, analyticity, unitarity, etc., there are of course the conservation rules of isospin and strangeness and an approximate agreement with $SU_3$ symmetry. In addition to these, however, we have some special regularities which appear to be true empirically, which apply to very high-energy behavior, and which may point the way to an ultimate dynamical theory. A partial list of these regularities at very high energy is:


a constant fraction of this. 

(2) Exchange reaction cross sections fall with a power of the energy, $s^{2\alpha-2}$.

(3) This power, $\alpha$, seems to be t-dependent, and for those $+t = m^2$ where $\alpha(t)$ is an appropriate integer (or half-integer) there are resonant particles of mass m (t is the square of the four-vector momentum transfer in the collision).

(4) $\alpha(t)$ varies with t as a straight line $\alpha(t) = \alpha_0 + \gamma t$. Although $\alpha_0$ varies with the quantum numbers which must be exchanged, the value of $\gamma$ is the same (0.95 per (GeV)²) for all.

(5) Cross sections fall very rapidly with transverse momentum transfer. 
The average transverse momenta of the particles in inelastic collisions 
are limited (to about 0.35 GeV). 
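Items (2) to (4) of this list can be put side by side in a few lines of Python. This is a minimal sketch, not part of the talk: the function names are hypothetical, and any intercept passed as `alpha0` (for instance 0.5) is an assumed illustrative value; only the slope 0.95 per (GeV)² is taken from item (4).

```python
import math

GAMMA = 0.95  # trajectory slope in GeV^-2, the value quoted in item (4)

def alpha(t, alpha0):
    """Linear trajectory alpha(t) = alpha0 + GAMMA * t (t in GeV^2)."""
    return alpha0 + GAMMA * t

def resonance_mass(spin, alpha0):
    """Mass m (GeV) at which alpha(m**2) equals the given spin, item (3)."""
    m_squared = (spin - alpha0) / GAMMA
    if m_squared < 0:
        raise ValueError("no resonance: alpha0 exceeds the requested spin")
    return math.sqrt(m_squared)

def cross_section_ratio(s1, s2, t, alpha0):
    """Ratio sigma(s2) / sigma(s1) for the power law s**(2*alpha(t) - 2), item (2)."""
    return (s2 / s1) ** (2.0 * alpha(t, alpha0) - 2.0)
```

For an assumed intercept of 0.5 at t = 0 the exponent is −1, so raising s by a factor of 100 lowers such an exchange cross section by a factor of 100; an intercept of 1 would give a constant cross section.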

There are, of course, very many rough approximations in this brief summary. For example, we do not know if the total x-section might not vary very slowly (for example, like ln s); all the slopes of Regge trajectories are not exactly equal; whether the pion nonet is on a trajectory of such a slope is unknown; there are corrections to the simple "Regge expectations" (2), (3), presumably for absorption; some of the inelastic cross sections can be associated with diffraction dissociation of the elastic part; etc. Nevertheless, the list above contains a number of main phenomena whose general behavior must ultimately be understood. It will be noticed that in discussing the power laws associated with Regge behavior, I have explicitly separated the behavior of the total x-section, which is often described as the Pomeranchuk trajectory, for it is not certain that this is a typical trajectory. In trying to understand the meaning of these regularities of high energy behavior, I have been led to suggest certain further regularities accompanying them. I should like to present these guesses here to see if they are possibly true, or, if some of them are obviously in disagreement with experiment, to learn where I may have already gone off the track in my thinking.

I shall, for completeness, first explain how I am trying to go about analyzing these things; second, describe some special considerations dealing with Regge exchange; and finally, present my suggestions for the limiting behavior of cross sections at high energies.


I. FIELD THEORY AT HIGH ENERGY 

In order to think about these questions I wish to use concepts which will immediately insure that the most fundamental properties of relativistic invariance, quantum superposition, unitarity, etc., will automatically be satisfied. The only theoretical structure I know which has a chance of doing this is a quantum field theory (I say, a chance, because we are not sure if unrenormalized field theories can give finite answers, or if renormalized theories are still unitary). It is not that I believe that the observed high-energy phenomena are necessarily a consequence of field theory. Even less do I know what specific field theory could yield them. But, rather, I believe that they share some of the properties of field theory, so they might share others. Therefore, I wish to study the behavior expected from field theory for collisions of very high energy. What field theory shall I choose? What shall be the fundamental bare particles that the theory begins with? I do not know and perhaps I do not care. I shall try at first to get results which are more general and characteristic of a wide class of theories and which can be stated in a way independent of the field theory which served as a logical crutch for their discovery. Only later, possibly, might it be worth trying to see if certain special experimental details imply something about a special theory which underlies all these phenomena.

In the meantime I call the fundamental bare particles of my underlying field theory "partons" (which may be of several kinds, of course). For example, in quantum electrodynamics the partons are bare electrons and bare photons. Imagine this theory to have a Hamiltonian H which may be separated into two pieces, $H = H_{\rm free} + H_{\rm int}$, one to represent free partons and the other the interaction between them. This H is, at first, expressed in terms of creation and annihilation operators ($a^*$, $a$) of these partons (or, if you prefer, local field operators of the parton fields in space). Next, for special application to collisions of extremely high energy, W, incoming in the z-direction in the center-of-mass system, only the operators of finite transverse momenta (i.e., x, y components) are kept as $W \to \infty$. The ones with positive z-component of momentum of order W (i.e., $P_z = xW$, where x is a finite quantity as we take a limit as $W \to \infty$) are separated from those of negative z-component. The first are called right movers ($a_R$) and the second left movers ($a_L$). If this is done, and the Hamiltonian reexpressed, we get $H = H_R + H_L + H_I$, where $H_R$ is the Hamiltonian involving right movers ($a_R^*$, $a_R$) only (containing, of course, interaction terms among these right movers coming from $H_{\rm int}$), $H_L$ involves left movers ($a_L^*$, $a_L$), and $H_I$ contains both $a_L$ and $a_R$, and represents an interaction between the objects moving to the left and those moving to the right. But $H_I$, as $W \to \infty$, becomes a very simple expression (depending on the theory of course). For example, a collision of a right moving proton with a left moving neutron yielding right and left moving particles can then be analyzed in a simple way. The proton is an eigenfunction of $H_R$, the neutron of $H_L$. Neglecting $H_I$, the system of states before and after collision are complete eigenfunctions (not simple partons) of $H_R + H_L$. The operator $H_I$ makes the transition between them (not necessarily in first order, of course).

I leave to a later publication a more detailed description of how this might be carried out, but here I need only make some remarks about the variables on which things appear to depend in the limit as $W \to \infty$.

To describe a proton of momentum $\mathbf{P}_0$, energy $E_0$, ordinary field theory gives a state function or wave function giving the amplitude that a number of partons of 3-momenta $\mathbf{P}_1, \mathbf{P}_2, \ldots$, etc., are to be found in it. The total momentum of these partons $\sum_i \mathbf{P}_i$ equals $\mathbf{P}_0$, the momentum of the proton, but their total energy $\sum_i E_i$ (where each energy $E_i$ is calculated from the mass $\mu_i$ of the parton i as $E_i = \sqrt{\mu_i^2 + \mathbf{P}_i \cdot \mathbf{P}_i}$) is not equal to that of the final proton $E_0$. In fact, the amplitude to find this state contains, among many other factors, one which is inversely proportional to this energy difference

    $A \sim \left(E_0 - \sum_i E_i\right)^{-1}.$    (1)


Knowing this wave function completely for some $\mathbf{P}_0$, say at rest, how can we find it at some other momentum? It is very difficult to do, and in fact it requires knowledge of the entire Hamiltonian operator H, for the wave function is not a relativistic invariant. This is emphasized by the point that the momentum is the sum of the momenta of the parts but the energy is not. The wave functions that should be useful for us are those in which $\mathbf{P}_0$ is very large in the z-direction and finite in the directions perpendicular to that. If we take $P_{0z} = x_0 W$ and measure the parton's momentum in the z-direction in the same scale, $P_{iz} = x_i W$, then the wave function has a definite limiting form as $W \to \infty$ for $x_0$, $x_i$ finite. ($x_0$, of course, is arbitrary; it may be taken to be unity, for example.) We have

    $x_0 = \sum_i x_i.$    (2)

Let the components of momentum perpendicular to z be called Q, a two-dimensional vector; then

    $\mathbf{Q}_0 = \sum_i \mathbf{Q}_i.$    (3)


Finally we can see how A varies (insofar as its denominator behaves) by writing it as

    $A \sim \left[(E_0 - P_{0z}) - \sum_i (E_i - P_{iz})\right]^{-1},$

which equals (1) since the total z-momentum is the sum of its parts. However, for large z-momentum,

    $E - P_z = (m^2 + Q^2 + P_z^2)^{1/2} - P_z \approx \frac{m^2 + Q^2}{2Wx},$

where m is the mass of the particle. Therefore, the amplitude becomes

    $A \sim 2W\left[\frac{m_0^2 + Q_0^2}{x_0} - \sum_i \frac{m_i^2 + Q_i^2}{x_i}\right]^{-1}.$    (4)
When the numerator as well as the denominator are expressed in these variables x, Q, only a simple power of W appears in A, which can be removed by a renormalization of the scale of P in phase space integrals. The conclusion that I wish to remark for our present purposes is: when expressed in terms of Q, the transverse momentum in absolute scale, and x, the longitudinal momentum in relative scale, the wave functions have definite limits as W, the energy scale of the longitudinal momentum of the state, goes to $\infty$.* In this limit the values of x are positive only. They represent the fractions of the momentum $x_0$ which each parton has (thus, if $x_0 = 1$, all $x_i$ run from 0 to 1). The reason $x_i$ must be positive is that if a parton has a negative momentum $P_z = Wx$ with x negative, or $P_z = -|x|W$, the energy E is approximately $+|x|W$, so $E - P_z$ is $2|x|W$, which is of order $W^2$ larger than the $E - P_z$ of positively moving partons. Appearing in the denominator of amplitudes, such amplitudes are of order $1/W^2$ smaller and vanish in the limit W infinite.
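The two limiting forms of $E - P_z$ used in this argument can be compared with the exact expression. The sketch below is illustrative only: the function names and the default mass of 1 GeV are assumptions made for the example, not values prescribed by the text.

```python
import math

def energy_minus_pz(x, W, m=1.0, Q=0.0):
    """Exact E - Pz for a parton with Pz = x * W, mass m, transverse momentum Q."""
    pz = x * W
    m_t2 = m * m + Q * Q
    energy = math.sqrt(m_t2 + pz * pz)
    if pz > 0:
        # Algebraically equal to energy - pz, but avoids cancellation at large pz.
        return m_t2 / (energy + pz)
    return energy - pz

def right_mover_approx(x, W, m=1.0, Q=0.0):
    """Large-W form (m**2 + Q**2) / (2 x W), for x > 0 and not wee."""
    return (m * m + Q * Q) / (2.0 * x * W)

def left_mover_approx(x, W):
    """Large-W form 2 |x| W for a parton with x < 0."""
    return 2.0 * abs(x) * W
```

Comparing `energy_minus_pz(-0.1, W)` with `energy_minus_pz(0.1, W)` for growing W shows the ratio increasing like $W^2$, which is why backward-moving partons drop out of the limiting wave function.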

It is seen that this rule (that x be purely positive and also that the amplitude depends only on x) is not valid in an asymptotic way (for $W \to \infty$) when |x| is as small as order c/W. The masses and transverse momenta seem to be of the order of 1 GeV, so we shall call an |x| of order (1 GeV)/W a "wee" x. For example, for $P_0 = W = 100$ GeV in the center-of-mass system (taking $x_0 = 1$), x = 1/10 is a small x, but it is not wee; a wee x would be x = 0.005, say. The behavior of the amplitude for wee x is not without interest, as we shall see, but for the present we shall discuss only x which is not wee. Then the amplitude as $W \to \infty$ appears to be a function of the Q's and x's of the parts.

*The statement is not precisely correct. What is meant is the density matrix has definite limits.

In a high-energy collision, the initial state consists of one of these groups of partons moving to the right interacting with another similar group moving to the left. What is the interaction? It will not be enough to just naively say that one parton has a cross section for colliding with another, for, in field theory, interaction is represented by mediation of a field; in fact, by the exchange of just another parton coupled by the piece $H_I$ of the Hamiltonian. But this $H_I$, in its form, is not entirely independent of the form of A, for they both come from operations on the original Hamiltonian H. Thus the amplitude to interact via the exchange of a parton is closely related to the amplitude that there is some parton in the right moving system in the initial state that can be "mistakenly" considered as really being a parton belonging to the left moving system. (Just as two electrons interact in first order by exchange of a photon, so it is also true that a right moving physical electron of $x_0 = 1$ has a first order amplitude to be an ideal electron of longitudinal momentum 1 − x (times W) and a photon of momentum x. Interaction results if we consider that the left moving electron (insofar as it is bare) has some amplitude to be a left moving outgoing physical electron if we make it up of a left bare electron and the aforementioned photon.) However, a parton of momentum xW to the right would be moving backwards in the left system and would have practically no amplitude (as $W \to \infty$) to be "mistakenly" considered as belonging to this left system. This is true, of course, only if x is not wee (of order 1 GeV/W). If x is wee, a right moving parton and a left moving parton are very similar in appearance. Thus interaction occurs only through exchange of partons or systems of partons of wee longitudinal momentum.
The energy dependence of reactions thus depends upon the probability of finding wee partons of a certain nature. A great deal of information on wee partons can be gotten from a knowledge of the partons where x is not wee, but only small. For the small x and wee x behavior must join in a continuous fashion. For example, suppose, for small x the amplitude to find a parton system with x small varies as $x^{-\alpha}\,dx$, where $\alpha$ is some constant ($\alpha < 1$). Then the amplitude to find a wee parton of momentum ~1/W in dx would, to fit on, have to vary as $(1/W)^{-\alpha}$, but the range of x that such wee partons occupy is of order 1/W, so that the amplitude to find a wee parton must vary as $W^{\alpha-1}$ if W is the momentum of the right moving object. If $E_R$ is the energy of the right mover, this amplitude is $(E_R)^{\alpha-1}$. If this is to exchange with a similar system moving to the left with energy $E_L$, the amplitude that this parton system is acceptable to the other system is $(E_L)^{\alpha-1}$. The amplitude for exchange therefore is $(E_R E_L)^{\alpha-1}$, or varies as $s^{\alpha-1}$ since $s \sim E_R E_L$. The cross section (there is always a problem of convention of the normalization of amplitudes) varies as the square of this, or $s^{2\alpha-2}$. A constant cross section means $\alpha = 1$, or the amplitude to find partons of small x must vary as dx/x. The amplitude to find a wee parton is not $\int dx/x$ because this dx/x law fails below 1/W. The amplitude to find a wee parton is just constant, independent of energy, since the curve 1/x cuts off for x below 1/W at a value of order W and the integral below x = 1/W is finite in this event. Since cross sections (such as the total x section) are constant, we see that this must actually happen. It is, therefore, not strictly true that as $W \to \infty$, if we keep all x and Q constant, there is a definite limiting wave function (as we said earlier), for there is always a finite amplitude for wee partons. However, the finite x part of the probability distribution of partons has a definite limit; in this limit there are probabilities of finding partons varying as dx/x. The apparently diverging character of this distribution for small x is cut off at wee x (that is, at x of order 1/W). (A complete theory would have to describe this cutoff region in detail. We shall say more about it later.)
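The power counting in this argument can be followed step by step in a short Python sketch. It is an illustration under assumptions: the function names and the unit cutoff scale `scale = 1.0` (standing for 1 GeV) are hypothetical, and normalization constants are dropped throughout.

```python
import math

def wee_parton_amplitude(W, alpha, scale=1.0):
    """Amplitude to find a wee parton when the small-x law is x**(-alpha) dx.

    The density is evaluated at x ~ scale / W and multiplied by the width
    scale / W of the wee region, giving W**(alpha - 1) for scale = 1.
    """
    x_wee = scale / W
    return x_wee ** (-alpha) * x_wee

def exchange_cross_section(E_R, E_L, alpha):
    """Square of the exchange amplitude (E_R * E_L)**(alpha - 1)."""
    amplitude = wee_parton_amplitude(E_R, alpha) * wee_parton_amplitude(E_L, alpha)
    return amplitude ** 2

def partons_above_cutoff(W, scale=1.0):
    """Integral of dx/x from the wee cutoff scale / W up to x = 1."""
    return math.log(W / scale)
```

With `alpha = 1` the exchange cross section stays constant as the energies grow, while `partons_above_cutoff` grows only logarithmically, matching the dx/x distribution cut off at x of order 1/W.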


The equations for x not wee simplify if one concentrates on the small x part. It is then seen that there is an approximate scaling law for small x (the approximation improving as x decreases), so that solutions with special distributions of partons with a power law scale dependence ($x^{-\alpha}$) are eigenfunctions natural to field theory.

It may help to give a few, nearly trivial examples. First, according
to first order perturbation theory in the expression (4) for the
amplitude, the numerator does not depend on x, Q for scalar partons
(couplings involve no momenta). If one of the partons has an especially
low x, the term $(\mu^2 + Q^2)/x$ belonging to it dominates and we get an
amplitude proportional to x (times the scale $dP_z/E$, or dx/x, of
relativistic phase space). This corresponds to $\alpha = 0$ for the scalar
meson. Likewise, it can be shown that the amplitude for (longitudinally
polarized) vector partons varies as constant (times dx/x). In this case a
factor 1/x comes from the numerator couplings. For spin 1/2 particles
coupled in the simplest way, the amplitude varies as $x^{1/2}$ (times
dx/x). In general, $\alpha$ equals the spin of the particle. In perturbation
theory, these agree with well-known results for the energy dependence of
cross sections, in particular that vector meson exchange as in
electrodynamics leads to constant cross sections in perturbation theory.

The deep inelastic behavior of the electron-proton scattering has
been looked at from the point of view described here. It can be argued
that the curve of $\nu W_2$ vs. $Q^2/2M\nu$ (in the variables of Bjorken*)
is the distribution in x of charged partons (each weighted by the square
of its charge). A behavior like dx/x for the low momentum partons is
indicated by experiment.
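
The perturbation-theory examples above tie the small-x power law to the spin of the exchanged parton: $\alpha$ equals the spin, the exchange amplitude varies as $s^{\alpha-1}$, and the cross section as $s^{2\alpha-2}$. The following is only a bookkeeping sketch of those exponents; the function names are illustrative and do not come from the text.

```python
from fractions import Fraction


def amplitude_exponent(alpha):
    """Power of s in the exchange amplitude, s**(alpha - 1)."""
    return alpha - 1


def cross_section_exponent(alpha):
    """Power of s in the cross section (square of the amplitude)."""
    return 2 * alpha - 2


# In lowest-order perturbation theory the text sets alpha equal to the spin.
cases = [("scalar", Fraction(0)), ("spin 1/2", Fraction(1, 2)), ("vector", Fraction(1))]
for name, spin in cases:
    print(name, amplitude_exponent(spin), cross_section_exponent(spin))
```

Only the vector case gives exponent zero, which is the constant cross section quoted for electrodynamics.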


II. REGGE BEHAVIOR 

By "Regge behavior" is meant the second item of our list of
regularities, that the cross section for exchange reactions varies as an
(inverse) power of the energy, which power depends on the momentum
transfer.

The original expectation of Regge behavior was the result of a
brilliant induction from Regge's non-relativistic studies by Gell-Mann
and Frautschi. Now, however, I should like to consider it as an
established empirical fact and to try (in this section) to understand
physically why it should be so.


Let us consider a typical exchange reaction, for example, a charge
exchange reaction like $p + \pi^- \to n + \pi^0$, in which the $\pi^-$, $\pi^0$
are right moving, and the p, n left moving. The easiest view of this is
that a negative charge has been exchanged from the $\pi$ system to the
nucleon system. The cross section falls about as $s^{-1}$ or, according to
the best estimates, as $s^{2\alpha_\rho - 2}$ with $\alpha_\rho = 0.43$, or
$s^{-1.14}$. It might at first be thought that an exchange via a vector
meson such as a $\rho$ would lead (as it does in perturbation theory) to a
constant cross section. However, it is to be noted that an important
current density (the 3 component of isotopic spin) has been suddenly
reversed. Initially ($\pi^-$, p) this current density had fast moving
components of -1 to the right, +1/2 to the left. Afterward ($\pi^0$, n)
it has 0 to the right, -1/2 to the left. Although the total 3 component of
isospin is not changed, a motion of a part of it (-1 unit) is suddenly
changed from right to left motion. In electrodynamics we are aware that a
sudden reversal of electric current density induces a copious
Bremsstrahlung -- a sudden reversal of the direction of an electron
carrying a photon field to the right leaves the field coasting on to the
right in the form of photons (and, of course, the new motion of the charge
to the left generates left moving photon Bremsstrahlung).

*SLAC publication No. 571.

The hadrons may act similarly. These currents are of considerable
importance in our present theories and in fact we believe there are
particles ($\rho$ mesons in fact) strongly coupled to them. Thus the strong
current reversal in a charge exchange will tend to shed $\rho$ mesons.
Perhaps we can guess some of the behavior of the high-energy inelastic
collisions by working by analogy to Bremsstrahlung.

In studying the pure reaction $p + \pi^- \to n + \pi^0$ at high energy, we
insist, first, that a current be suddenly reversed and, second, that no
Bremsstrahlung actually occur. This latter is because we insist that the
reaction have only the two particles n, $\pi^0$ in the final state. We can
interpret the fact that in such an exchange the cross section falls
(relative to the main behavior of the majority of inelastic scatterings
-- for which a constant cross section is empirically more appropriate)
as the energy rises, by the observation that as the energy rises it
becomes increasingly less likely that the current reversal can be
accomplished without Bremsstrahlung.

The theory of Bremsstrahlung with strong coupling and with the
"photons" of the field carrying the very type of currents which are
sources of further Bremsstrahlung has not been worked out in detail.
Nevertheless, we may boldly try to guess that certain analogies to
electromagnetic weak interaction Bremsstrahlung exist. Some hope for sense
here comes from noting that many features can be seen from a classical
view which takes $\hbar \to 0$ so $e^2/\hbar c$ is large. Therefore some
properties which are understandable both for $e^2/\hbar c$ large and for
$e^2/\hbar c$ small may have more general validity. This is especially
likely if we understand the reasons for them clearly.

First, the spectrum of the particles in longitudinal momentum is
$dP_z/E$ (or dx/x for x's which are not wee). This is because the Lorentz
contracted field is so sharp in z that the energy in it is distributed
uniformly in $P_z$ (the Fourier transform of a pulse being a constant). The
energy distribution of the radiated particle is therefore $dP_z$, or if the
individual particles have energies E their longitudinal momenta are
distributed as $dP_z/E$. Also, such a distribution is stable under further
disintegration of the particles, or of interaction between the particles.
If we write $\tanh w = P_z/E$ for the z component of the velocity of a
particle, the relativistic rule for the addition of velocities becomes, as
is well known, simply the addition of rapidities. Suppose a particle in its
rest system can disintegrate or yield a new particle with rapidity u. Then
if the old particle has rapidity w, instead, the new particle appears with
rapidity w' = w + u. Therefore, if the old particles are distributed
uniformly in w (as dw) the new particles appear with a distribution also
uniform in rapidity because dw' = dw. Thus this feature (a spectrum
$dw = dP_z/E$) is to be expected independently of whether we are seeing what
was originally radiated (say, $\rho$ mesons) or are observing some other
secondary particles that these may have changed into (e.g., if $\rho$'s go
into pions). Hence I believe we should expect it for our inelastic
distribution of particles of small (and wee) x in our strongly interacting
systems also. (We cannot expect this to be valid for large x, say x = 1/2,
also, because if that much energy is taken from the primary particle by
radiation of one emitted particle, subsequent emissions are severely
affected. The $dP_z/E$ spectrum for photons in electrodynamics is precisely
valid only for smaller values of x.)
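
The two facts used in the rapidity argument, that collinear boosts add rapidities and that $dw = dP_z/E$ at fixed transverse mass, can be checked numerically. This is a minimal sketch in units with c = 1; the numerical values and function names are illustrative assumptions, not taken from the text.

```python
import math


def rapidity(p_z, energy):
    """Longitudinal rapidity w defined by tanh(w) = P_z / E."""
    return math.atanh(p_z / energy)


def add_velocities(v, u):
    """Relativistic addition of two collinear velocities (c = 1)."""
    return (v + u) / (1.0 + v * u)


def dw_dpz(p_z, m_t, h=1e-6):
    """Central finite-difference estimate of dw/dP_z at fixed transverse mass m_t."""
    def energy(p):
        return math.sqrt(p * p + m_t * m_t)
    upper = rapidity(p_z + h, energy(p_z + h))
    lower = rapidity(p_z - h, energy(p_z - h))
    return (upper - lower) / (2.0 * h)


# Rapidities add: w' = w + u.
w, u = 1.3, 0.7
w_combined = math.atanh(add_velocities(math.tanh(w), math.tanh(u)))
print(w_combined, w + u)

# dw/dP_z equals 1/E, so a spectrum uniform in w is a spectrum dP_z/E.
p_z, m_t = 2.0, 0.5
print(dw_dpz(p_z, m_t), 1.0 / math.sqrt(p_z ** 2 + m_t ** 2))
```

Because a boost only shifts w by a constant, a distribution uniform in w stays uniform, which is the stability property the paragraph relies on.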

Next, the energy in the field thus radiated is some fraction of the
energy of the particle which radiates. Thus the particle may be found
after the radiation to have lost on the average some fixed fraction of
its energy. This is found experimentally, in some cases. For example,
in pp collisions which yield a forward proton, its average momentum is
about 0.60 of the incident momentum* (in individual collisions, its value
fluctuates widely).

*February 1968, Turkot, p. 316.

For weak coupling electrodynamics, the vector field particles are
emitted independently into a Poisson distribution with mean number
$\bar{n}$ emitted. The probability that none are emitted is $e^{-\bar{n}}$.
The sum of the chance of emitting none, one, two, etc. (that is, the total
cross section) is much like it would be without coupling to the photons.
Here we know the total cross section is constant, and so can try to
interpret the energy fall-off $s^{2\alpha_\rho - 2}$ of the pure two-body
charge exchange reaction as the factor $e^{-\bar{n}}$, the probability of no
emission, where $\bar{n}$ is the expected mean number of primary particles
emitted. This multiplicity $\bar{n}$ must rise logarithmically with energy,
then, as $\bar{n} = (2 - 2\alpha_\rho) \ln s$. The particles we observe are
not, of course, the primary field particles emitted, but rather the
observed particles are secondary disintegration products of these unknown
primaries. But if each primary produces on the average a fixed number of
secondaries, we see that the expectation is that the multiplicity grows
logarithmically with E.
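
The identification of the two-body fall-off with a Poisson probability of no emission can be checked directly: with mean number $(2 - 2\alpha_\rho)\ln s$, the zero-emission probability reproduces the power $s^{2\alpha_\rho-2}$. In this sketch $\alpha_\rho = 0.43$ is the value quoted above, while measuring s in units of 1 GeV² and the sample values of s are assumptions made only for illustration.

```python
import math


def mean_primaries(s, alpha):
    """Mean number of primaries, n = (2 - 2 alpha) ln s, with s in units of 1 GeV^2 (assumed)."""
    return (2.0 - 2.0 * alpha) * math.log(s)


def prob_no_emission(s, alpha):
    """Poisson probability of emitting nothing, exp(-n)."""
    return math.exp(-mean_primaries(s, alpha))


alpha_rho = 0.43  # value quoted in the text for the rho trajectory
for s in (10.0, 100.0, 1000.0):
    # The last two columns agree: exp(-n) is the same as s**(2 alpha - 2).
    print(s, mean_primaries(s, alpha_rho), prob_no_emission(s, alpha_rho),
          s ** (2.0 * alpha_rho - 2.0))
```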

This is necessary if our various ideas are to fit together, because
we have already suggested that the mean number of any kind of particle
emitted is to vary as c dx/x for small x and, for a given x, not to vary
otherwise with the energy W of the collision (so that c is a constant).
The total mean number emitted, then, is $c \int dx/x$. The upper limit of x
is of finite order (for the formula fails as $x \to 1$ and x cannot exceed 1)
but the lower limit is of order of wee x (i.e., order 1/W) where the
dx/x fails. Thus the mean number emitted to the right must vary as
c(ln W + const). (Actually we can do the integral all the way to zero, for
we expect the integrand to be $c\,dP_z/\sqrt{\mu^2 + Q^2 + P_z^2}$ where
$\mu$ is the mass and Q is the transverse momentum of a typical particle.
Putting $P_z = xW$, this is

$$\int_0^{x_0} \frac{c\,dx}{\sqrt{x^2 + (\mu^2 + Q^2)/W^2}} \approx c \ln \frac{2 x_0 W}{\sqrt{\mu^2 + Q^2}}$$

for finite $x_0$. To go further, we should have to know the transverse
momentum distribution.)
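
The logarithmic growth of the mean number can be seen by evaluating the integral of $c\,dx/\sqrt{x^2 + (\mu^2+Q^2)/W^2}$ from 0 to a finite upper limit. The sketch below compares the exact closed form (an inverse hyperbolic sine), a midpoint-rule check, and the large-W logarithm. The values of c, the upper limit, and the transverse mass $\sqrt{\mu^2+Q^2}$ are illustrative assumptions.

```python
import math


def mean_number_closed_form(c, x0, W, m_t):
    """c * integral_0^x0 dx / sqrt(x^2 + (m_t/W)^2), evaluated exactly."""
    return c * math.asinh(x0 * W / m_t)


def mean_number_midpoint(c, x0, W, m_t, steps=200_000):
    """The same integral by the midpoint rule, as an independent check."""
    a2 = (m_t / W) ** 2
    dx = x0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += dx / math.sqrt(x * x + a2)
    return c * total


def mean_number_log(c, x0, W, m_t):
    """Large-W form c * ln(2 x0 W / m_t)."""
    return c * math.log(2.0 * x0 * W / m_t)


c, x0, m_t = 1.14, 1.0, 0.4  # illustrative; m_t = sqrt(mu^2 + Q^2) in GeV
for W in (10.0, 100.0, 1000.0):
    print(W, mean_number_closed_form(c, x0, W, m_t), mean_number_log(c, x0, W, m_t))
```

Each factor of ten in W adds c ln 10 to the mean number, which is the c(ln W + const) behavior stated in the text.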


The one respect in which the electrodynamic analogy leads us astray
is in the transverse momentum distribution of the photons. These large
transverse momenta fall off slowly (like $\frac{dP_z}{E}\,\frac{d^2Q}{Q^2}$
for sudden current reversals) but in the strong collisions this is
empirically not true. Some characteristic of the distribution of charge
across the face of the interacting particle is involved here.

I have not yet studied the regularities involving the transverse
momenta (items (4) and (5) in our Introduction) from the viewpoint being
developed here. In the meantime, we can take these as empirical facts to
be included in any expectations. In the same way we leave for further
research the strangeness and isospin character of these effects. We should
notice, however, that, although we discussed a charge exchange arising
from an exchange of a particle of the quantum numbers of the $\rho$, the
exchange of any current of the usual octet would have analogous effects
on the possible Bremsstrahlung of particles coupled to other
(non-commuting) currents.

In a pure two-body exchange reaction, since the probability of no
Bremsstrahlung depends on exactly what currents are reversed, the value
of $\alpha$ will, from this point of view, depend on the quantum numbers of
the particle exchanged (which, of course, it does). The $\alpha$ here
referred to is evidently only the largest for a given set of quantum
numbers exchanged. The quantum numbers exchanged may involve not only
currents of unitary symmetry, for baryon number (and possibly spin) may be
exchanged, and we do not know if there are special couplings to baryon
currents (or spin currents) which are also involved in determining
$\alpha$. However, I should like to hazard the guess that baryon number
cannot be exchanged without the transfer of a fundamental part of spin
1/2. Such an ideal part already has $\alpha = 1/2$, which would imply a
$1/\sqrt{s}$ behavior of amplitudes before corrections to Bremsstrahlung.
Therefore, if we do an experiment which freely allows the emission of wee
mesons, except that the quantum numbers are controlled so that a baryon
must be exchanged between right and left systems, this cross section
probably approaches a 1/s behavior, instead of the constant expected for
similar experiments in which no baryons need be exchanged.

A final question is that of the distribution of correlations among
the various emitted particles in the average inclusive collision. In
the perturbation theory these are emitted independently and at random in
a Poisson distribution, so the probability that there will be k of them of
momenta $x_1, x_2, \ldots, x_k$ is just
$(c\,dx_1/x_1)(c\,dx_2/x_2)\cdots(c\,dx_k/x_k)\,e^{-\bar{n}}/k!$
where $\bar{n}$ is the mean number emitted. I have not yet found a good
reason whether this would be true in our non-perturbative case or not, but
if one insists on comparing the experimental distributions and
correlations to some theory, perhaps this is the first thing to try: that
the pion distribution results from an original Poisson distribution of
$\rho$'s, each distributed for small and wee x as $c\,dP_z/E$, with c near
1.1 or 1.2 (c is $2 - 2\alpha$ for the $\rho$ trajectory). In fact, if we
suppose two pions for every $\rho$, the multiplicity of $\rho$'s would be
c ln s, and the multiplicity of the pions twice this, or about
2.3 ln $E_{lab}$ (GeV). Surprisingly, this fits observations very well (see
Table 1). As an additional coincidence, this value c is what one gets from
perturbation theory if one uses the coupling constant determined for
$\rho$ nucleon coupling. It must be admitted that in this paragraph we
have gone much further than we should -- for our precise numerical result
depends upon a choice of which particles are fundamental and which they
disintegrate into. All our other predictions were of those features
which were independent of such specific choices.

The probability that the total momentum of all the emitted right
moving $\rho$'s is less than y is proportional to $y^c$, so that (aside from
diffraction dissociation) the momentum distribution of the ongoing
particle, when it takes a fraction of momentum x close to 1, should vary as
$(1-x)^{c-1}$ where 1-x is small.

In Section I we discussed the longitudinal momentum distribution of
partons expected in a hadron wave function, but we have not seen how this
might lead us to expectations for the distributions of momenta of real
hadrons in a collision, for partons are not real hadrons. Nevertheless, we
shall suppose that when a hadron is disturbed via interaction, so that its
distribution of partons is no longer exactly that of a single real hadron
state, it must be compounded of a series of real hadron states, but the
distribution of longitudinal momentum of these real hadrons is
qualitatively like those of the partons which we described before (in
Section I). I have no way to see why this must be true, but the features of
the distributions discussed in that section seem, firstly, to rely mainly
only on qualities of relativistic transformation; secondly, the principle
is right in perturbation theory; and thirdly, the results of assuming this
fit very well with the qualitative predictions of the Bremsstrahlung
analogy. Finally, for one reason or another -- empirical or theoretical,
good or bad -- I suspect the high energy collisions to have a number of
features which we summarize in the next section.

III. EXPECTED BEHAVIOR AT HIGH ENERGY COLLISIONS

In a collision of very high energy between two hadrons each of
momentum W, in the center-of-mass system, the outgoing particles of the
collision should be described by two variables: Q, the transverse momenta
in absolute units, and x, the longitudinal momentum as a proportion of W
(so that $P_z = xW$). We intend to describe cross sections for various
processes as W increases without limit, keeping the x, Q's of the
particles constant. If for large W an x is of order 1/W (so that its
momentum in the C.M. system is in the BeV range or less) we say the
particle has a wee momentum. Small x simply means a value of x much below
one, but higher than order 1/W for extremely large W. Finally we should
like to characterize experiments as being of two classes -- exclusive and
inclusive.

An exclusive experiment is one in which it is asked that only
certain particular particles of fixed Q, x, and character be found in the
final state, and no others. In particular, it excludes the emission of
particles with wee momenta in the limit. Examples are: two-body reactions
$AB \to CD$; an experiment which uses missing mass to try to select events
which are virtually two-body reactions, etc. For such experiments the
cross sections should fall off inversely as a power of s, the power being
$2\alpha(t) - 2$, where $\alpha(t)$ is the $\alpha$ appropriate to the
highest Regge trajectory capable of carrying the necessary exchange of
quantum numbers between the right moving and left moving system, and where
t is the negative square of the transverse momentum which must be
exchanged. (With these x, Q variables fixed, the longitudinal momentum
transfer and the energy transfer go to zero inversely as W as
$W \to \infty$.) In the special case that no quantum numbers need be
exchanged, the cross section is constant (empirically, at least, if
t = 0); it does not fall as $W \to \infty$. This phenomenon is called
diffraction dissociation, and is sometimes expressed as the exchange of
the "Pomeranchuk trajectory".

An inclusive experiment is one in which certain particles are looked 
for at given Q, x, but one also permits any number of additional particles. 
More precisely, it does not in any way exclude the emission of arbitrary 
numbers and kinds of particles with wee momenta. Examples are the total 
inelastic cross section, the mean number of A mesons emitted with momentum 
x in range dx, the probability that no single particle is moving right nor 
left (in the center-of-mass system) with more than 1/2 the original
momentum W, etc. Such cross sections should approach constant finite
values as $W \to \infty$.

The mean number of mesons of a given kind formed with small x in a
high-energy inclusive experiment should vary as dx/x. This should
extrapolate right through the wee x region in the form $dP_z/E$ where E is
the energy $\sqrt{\mu^2 + Q^2 + P_z^2}$ of the meson of momentum $P_z$, at
fixed Q. (This suggests that more appropriate variables for the small x
region would be w, Q where w is the z-component rapidity
$w = \tanh^{-1}(P_z/E)$. The distribution should be uniform in dw (for each
Q) and ultimately independent of W.) As a consequence, the multiplicity of
a given kind of hadron should rise logarithmically with W.

It is this dx/x distribution with its logarithmically divergent 
character for small x which makes it possible that the probability of 
finding any specific set of particles with given x, Q values (except the 
elastic or diffraction dissociation ones) falls with energy as a power, 
and yet the total cross section can be constant. 

In an inclusive experiment of A colliding from the right, with B
from the left, the probability that some particle C comes out moving to
the right with an x in proximity to unity should vary as
$(1-x)^{1-2\alpha(t)}\,dx$ as long as (1-x) is not wee. Here $\alpha(t)$ is
the value appropriate to the trajectory of highest $\alpha$ (excluding the
Pomeranchon) which could, upon emission, carry away the quantum numbers and
transverse momentum needed to turn A to C.

In a special kind of partially exclusive process in which a baryon
must be exchanged to get the reactants of finite x, but no baryon
appears among the particles of wee momentum, then I believe the cross
section will vary as $1/W^2$, but this is not on as firm a basis as the
other suggestions.


TABLE 1

Multiplicity of Pions in High Energy Collisions

[Table body not recovered; the one surviving column heading appears to read 2.3 ln E.]


Richard P. Feynman

Department of Physics
California Institute of Technology
Pasadena, California 91109

Many of us are working on the same problem, which is to 
understand the behavior of hadrons, the strongly interacting 
particles. Among the various attempts to understand it in 
the past decade, there is a recent one that goes under the 
name of "partons". It sounds very mysterious, but I will try
to show you that you knew it all the time, you just called it 
something else. Work on partons has been done by myself, 
Paschos, Drell, Bjorken, and others. 

Almost everybody supposes that the strongly interacting
particles obey a number of principles like the quantum-mechanical
superposition of amplitudes, relativistic invariance, unitarity,
and analyticity in some form or other. The problem is to
make a theory which is consistent with all of these principles
at the same time. It would be very convenient if we
could write our equations in a mathematical form such that
we do not have to keep checking them for unitarity, relativistic
invariance, quantum mechanics, or something else. We
would then not get new equations if we add some of these
features, for the special mathematical form would automatically
have all the equations in it. The thing to do is to
find a model which satisfies all of these conditions simultaneously.
The only model that we have is quantum field
theory. The idea is to take quantum field theory, despite
its divergences and difficulties, and try to see what kinds
of mathematical formulations it would have in it. We will
start with something which satisfies all the conditions and
later on perhaps we can modify some parts.

*An invited talk presented at the Symposium "The Past Decade
in Particle Theory" at the University of Texas in Austin on
April 14-17, 1970.

What is implied by quantum field theory? Of course you
would have to make for field theory a specific model, yet in
every quantum field theory there are fundamental fields and
operators which create and annihilate some kind of particles
that underlie the theory. We need a name for the objects
which the fundamental operators a* and a create and annihilate,
objects that are not the final states of the system,
such as a complete proton, which may have parts inside. We
need a name for what one would call the "elementary particles"
of the theory. We do not know if there are any such particles
in the end, but we will start by supposing that there are because
otherwise we would have no field theory at all. I will
call these things "partons". In the case of electrodynamics,
for example, the partons are the ideal, what one sometimes
calls "bare", electron and the ideal photon. The creation
and annihilation operators create and annihilate these. But
a physical electron is an ideal electron with some photon
field around it, and is therefore a combination of various
partons.

What does a field theory look like? In quantum field
theory, instead of representing things by scattering amplitudes
and Regge poles, we use a wave function, and it gives
the amplitude for finding a parton moving this way or that,
or two partons going in a certain direction, etc. So we
shall see if we can get anywhere by discussing what the character
of the wave functions of the protons, the neutrons,
the pions might be, and shall try to represent the experimental
results in terms of properties of the wave functions. Ultimately,
maybe we will see some special properties of the
wave functions and perhaps get some idea of what the partons
are, whether they have spin one-half, or they are quarks, if
in fact they are anything. Also, maybe by using the wave
function, one might be led, psychologically, to suggest a
certain principle or find a general proposition that can be
deduced outside the realm of abstract field theory, without
talking about the objects inside. For example, one might be
able to deduce it by looking at the commutation law of currents,
or something like that, which does not say anything
about the underlying machinery. In such a case one would
establish something that is more satisfactory than if it
were model dependent. But it does not bother me too much
if, at first, it is model dependent.

The characteristic of a wave function is that if you
have an amplitude for finding a lot of particles or pieces
around, the amplitude is given by

$$\mathrm{Amp} = \frac{N}{E_0 - \sum_i E_i}$$

where $E_0$ is the state energy, say of a proton, and $E_i$ are
the constituent energies of the partons inside. The numerator
is some kind of matrix element. Now an interesting
thing about a wave function is that it does depend on the
coordinate system in which one looks at it. It is an unrelativistic
idea, really, because it cuts space-time at a given
moment and asks, "What partons do you see?" If, say from a
Lorentz transformation, it cuts space-time at a different
angle, one would see other combinations. If one has never
tried these things, the fact that the wave function is not
a relativistic invariant may be bothersome for a moment. But
you know that the sum of the momenta of the partons inside
the wave function is always the same as the total momentum of
the state, and that the momentum is conserved, but that this
is not so for the energy. It is the lack of energy conservation
which gives the strength of a certain amplitude, so it
is not relativistically invariant.

It is not easy to transform a wave function, because one
has to know the Hamiltonian. Therefore, there may be one
system or another in which it is most convenient to look at
the wave function and the usual suggestion is to look at the
wave function in the rest system of the entire state. I
prefer, instead, to look at the wave function in a system in
which the particles are moving very fast, all extremely
relativistic, in fact, in the limit. Thus, I would like to look
at the wave function of a proton when it is moving extremely
fast in the z-direction.

Now if such an object moves very fast, then in order to
understand what happens it is convenient to rewrite the same
expression as

$$\mathrm{Amp} = \frac{N}{(E_0 - P_{0z}) - \sum_i (E_i - P_{iz})}$$

since $\sum_i P_{iz} = P_{0z}$, where $P_{0z}$ is the center-of-mass
momentum in the z direction. When the thing is moving at extreme
velocities in the z-direction, let me suppose that

$$P_{iz} = x_i W.$$

For example, W may be the energy in the center-of-mass of the
original state, in which case $x_0$ would be one. Or W could
just be a scale, and we could leave $x_0$ open. Then the energy
of a particle (denoting the transverse component of a particular
particle by $Q_i$) is

$$E_i = \sqrt{P_{iz}^2 + Q_i^2 + M_i^2} \approx x_i W + \frac{Q_i^2 + M_i^2}{2 x_i W}.$$
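
The large-W expansion of the parton energy can be checked numerically: the exact square root differs from $x_i W + (Q_i^2 + M_i^2)/(2 x_i W)$ by terms that vanish as W grows, and $2W(E_i - P_{iz})$ tends to $(Q_i^2 + M_i^2)/x_i$. This is a sketch with illustrative values of x, Q, and M (in GeV); the function names are assumptions, not the author's.

```python
import math


def exact_energy(x, W, Q, M):
    """E = sqrt(P_z^2 + Q^2 + M^2) with P_z = x * W."""
    return math.sqrt((x * W) ** 2 + Q ** 2 + M ** 2)


def expanded_energy(x, W, Q, M):
    """Leading terms for large W: x W + (Q^2 + M^2) / (2 x W)."""
    return x * W + (Q ** 2 + M ** 2) / (2.0 * x * W)


def scaled_energy_defect(x, W, Q, M):
    """2 W (E - P_z), written as 2 W (Q^2 + M^2) / (E + P_z) to avoid cancellation."""
    return 2.0 * W * (Q ** 2 + M ** 2) / (exact_energy(x, W, Q, M) + x * W)


x, Q, M = 0.3, 0.34, 0.3  # illustrative values only
for W in (10.0, 100.0, 1000.0):
    print(W,
          exact_energy(x, W, Q, M) - expanded_energy(x, W, Q, M),
          scaled_energy_defect(x, W, Q, M),
          (Q ** 2 + M ** 2) / x)
```

The last two printed columns converge as W increases, which is why the rescaled denominator of the amplitude becomes independent of W.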
It has appeared experimentally in collision after collision
of hadrons with each other, that the side-ways momenta are
all limited -- they seem to be of the order of 340 MeV. As
the energy of the collision rises to the higher and higher
GeV's, there is no increase in the transverse momenta of the
outgoing particles in the collision and they are finite. I
am therefore going to guess that inside the wave function,
it is also true of the partons. Therefore, as W goes to infinity,
Q stays the same. Let us try it. Using

$$E_i - P_{iz} = \frac{M_i^2 + Q_i^2}{2 x_i W}$$

we get

$$\mathrm{Amp} = \frac{2WN}{\dfrac{M_0^2 + Q_0^2}{x_0} - \sum_i \dfrac{M_i^2 + Q_i^2}{x_i}}$$

with $\sum_i x_i = x_0$. Now it turns out that as you take the limit
for certain matrix elements, 2WN falls and the amplitude of
those states is very low. But for various states 2WN approaches
a constant, at most, and therefore, interestingly
enough, the wave function has a definite limit as long as
one stays away from x's near zero. That is, if you go to infinity
with W, everything will be all right if the x's remain
finite. There is a small technical point to straighten out
as to what happens as x approaches zero.


However, we have already discovered something interesting.
It is that if we represent transverse momenta in absolute
units, and longitudinal momenta in the scale of the
energy of the collision, the wave functions that come in are
invariant as W goes to infinity. This leads to a suggestion
about the energy behavior of highly inelastic collisions in
general, namely, that expressed in terms of the variables
Q's and x's, the laboratory experiments might approach invariance
at extremely high energies.

A little more can be said about the wave function. I
want to represent this fast moving proton by a number of particles
moving at different speeds, sharing the total $x_0$, the
total longitudinal momentum in scale, and let us say $x_0 = 1$.
All these particles share the total momentum of the system.
How about some of them becoming negative? No, they cannot!
If x were negative, when I take the square root for E, which
has to be positive, it would be $-xW$, and so for particles
going backwards, $E - P_z$ is very large, not very small. It is
the $E - P_z$ in the denominator that makes the amplitude of that
term negligible. So the wave function of a fast moving hadron
consists of a lot of little particles moving forward,
sharing the longitudinal momentum and having a finite,
confusing, transverse momentum.
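
The suppression of backward-moving partons follows from the size of $E - P_z$: it is of order $(Q^2 + M^2)/(2xW)$ for positive x but of order $2|x|W$ for negative x. A minimal numerical sketch, with illustrative values of W, Q, and M (in GeV) that are not taken from the text:

```python
import math


def energy_minus_pz(x, W, Q, M):
    """E - P_z for a parton with P_z = x * W and transverse mass^2 = Q^2 + M^2."""
    m_t2 = Q ** 2 + M ** 2
    p_z = x * W
    energy = math.sqrt(p_z ** 2 + m_t2)
    if p_z >= 0.0:
        # Algebraically equal to energy - p_z, but free of cancellation error.
        return m_t2 / (energy + p_z)
    return energy - p_z


W, Q, M = 1000.0, 0.34, 0.3  # illustrative values only
forward = energy_minus_pz(0.3, W, Q, M)
backward = energy_minus_pz(-0.3, W, Q, M)
print(forward, backward, backward / forward)
```

With these numbers the backward value exceeds the forward one by about six orders of magnitude, so a term containing a backward parton carries a correspondingly large energy denominator.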
I have tried to use this concept of the wave function to
understand a number of phenomena. I would ultimately like to
be able to understand, as much as possible, all the characteristics
of the things that we observe from such a point of
view, such as where the Regge expressions come from, etc. So
far, I have had but a limited success. However, even the
crudest situation leads you to certain ideas. Consider the
elastic form factor of a proton. That means that we hit the
bunch of forward-moving partons transversely with a tremendous
momentum by a photon, and ask with what amplitude do we
get a proton back again. It would be represented by a number
of partons moving in the direction of the final proton.
Now the electromagnetic field, I am going to suppose, interacts
with only one parton at a time, because most field
theories have a propagator for the parton obtained by the
substitution $p \to p - eA$. So what we are really asking is this:
Suppose we have a number of these partons and we kick one of
them sideways, what is the probability that the rest of them
look like they were moving in the final direction? Now, for
each of these initial partons there is a certain amplitude
that it was not moving exactly in the forward direction, that
it had a little transverse momentum, and there is some amplitude
for the final proton state that its corresponding partons
look the same way, so there is an overlap integral between the
two. But the higher the momentum transfer is, the harder it
is to get the overlap, and, therefore, the form factor should
fall off with Q depending on how likely it is that the
non-kicked partons can be seen to be going in the final direction.
I would guess that there is a kind of universal function for
such problems of high momentum transfer. There is, of course,
a matrix element for the photon on one parton that happens
not to be Q dependent, but the main Q-dependence, the rapid
fall-off, must come from trying to get the fast moving particle
going in one direction to look like a very quiet simple
thing, a proton.

Since we know experimentally that this form factor falls 
off toward zero as Q goes toward infinity, perhaps as o°; 
we learn immediately that there is no amplitude to find only 
one proton-like parton inside of a proton. Because if there 
were, there would be a certain finite amplitude, A let us 
say, to find a proton consisting of a proton-like parton all 
alone. Then when you kick it, it would be a deflected proton- 
like parton, and the amplitude of that for the proton is 
also A, so A^2 would be the ultimate limit of the form factor 
as Q goes to infinity. So we have to look for more complicated 
wave functions. This amplitude could be zero for 
several reasons, one of which would be that there is no par- 
ton in the fundamental field theory which has the same quan- 
tum numbers of charge and spin as the proton. Or it may be 
for reasons somewhat like the fact that the electron is never 
found with absolutely no photons around it. There is always 
a field around it. But I do not think that this case is 
realistic here because in QED it depends on the zero mass of 
the photon. 

The next application which I want to describe is called 
"inelastic electron-proton scattering." In this experiment, 
we take a proton and hit it with an electron, and there is 
an exchange of a photon. There is also a small correction 
for the exchange of two photons, but that is removed by the 
theoretical people who analyze these experiments in order to 
have an interaction of first order in e^2 between the electron 
and proton. We can control the variable q^2 = -Q^2 and the 
variable P·q = Mν. The momentum transfer, q, is space-like, 
and ν is the energy loss of the electron in the proton 
rest system. M is the mass of the proton. In principle, we 
can measure the probability amplitude to have different final 
states as a function of these two variables. In fact, however, 
at the present time, without looking at the final 
states of the proton we observe the total cross section by 
looking only at the final electron. So we have to discuss 
the total probability that a proton, being hit by a photon, 
absorbs a certain energy and a certain momentum. 

Figure 1. Inelastic Electron-Proton Scattering (diagram labels: electron, proton, "stuff") 

First, we will describe the experiment. It turns out 
that the differential cross-section for inelastic electron- 
proton scattering can be expressed in terms of the angles in 
the laboratory and two functions, W_1 and W_2, which are 
functions of Q^2 and ν: 

    \frac{d^2\sigma}{d\Omega\,dE'} = \frac{4 e^4 E'^2}{Q^4}
        \left[ W_2 \cos^2(\theta/2) + 2 W_1 \sin^2(\theta/2) \right] .

Here E' is the final electron's energy. The amplitude for 
the scattering is 

    \mathrm{Amp} = 4\pi e^2 \, \frac{(\bar{u}_2 \gamma_\mu u_1)}{q^2} \,
        \langle f | J_\mu(q) | P \rangle .

With the known factors all factored out, we can express 
everything in terms of the quantity 

    K_{\mu\nu} = \sum_f \langle P | J_\nu(-q) | f \rangle
                 \langle f | J_\mu(q) | P \rangle , \qquad \nu > 0 .

From relativistic invariance, we have 

    K_{\mu\nu}(q) = P_\mu P_\nu W_2(Q^2, \nu) - \delta_{\mu\nu} W_1(Q^2, \nu) ,

where both W_1 and W_2 are positive, and 

    \left( 1 + \frac{\nu^2}{Q^2} \right) W_2 \ge W_1 \ge 0 .

Furthermore, since they are both positive, there is another 
relationship. 

Because the angles used were small, the results are 
sensitive only to W_2, and we will plot the complete result 
as though we were plotting W_2, although this is a little bit 
erroneous. Let us look at the complete scattering, first at 
low values of momentum transfer and energy. 

Figure 2. Inelastic e-p scattering cross 
section versus missing mass. 


The missing mass is simply another way of expressing the 
variable ν, 

    (\text{missing mass})^2 = (P+q)^2 = M^2 + 2M\nu - Q^2 .



But it measures the total four-momentum squared of all the 
products added together, such that if the object which is 
made in the collision is not a lot of pieces, but happens by 
accident to be one little ball of slightly excited "goop", 
there will be a resonance if that "goop" has a definite mass. 
In Figure 2 there are the famous resonances like the 1238 
and there is a 1535 and something higher. The resonance at 
1410 does not appear but it is very interesting that these 
resonances appear so beautifully. 
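
The relation between missing mass and ν can be made concrete with a small numerical sketch. It assumes GeV units and a proton mass of 0.938 GeV; neither that value nor the function names appear in the talk, and the chosen Q^2 is arbitrary.

```python
# Sketch only: kinematics of inelastic e-p scattering, in GeV units.
# PROTON_MASS and the function names are illustrative assumptions.
PROTON_MASS = 0.938  # GeV (assumed value, not quoted in the text)


def missing_mass_squared(q2, nu, m=PROTON_MASS):
    """(P + q)^2 = M^2 + 2*M*nu - Q^2 for space-like transfer, Q^2 = -q^2."""
    return m * m + 2.0 * m * nu - q2


def energy_loss_for_mass(w, q2, m=PROTON_MASS):
    """Energy loss nu at which the hadronic final state has invariant mass w."""
    return (w * w - m * m + q2) / (2.0 * m)


if __name__ == "__main__":
    q2 = 1.0  # GeV^2, an arbitrary example value
    # The proton itself and the two resonances named in the text.
    for w in (PROTON_MASS, 1.238, 1.535):
        nu = energy_loss_for_mass(w, q2)
        print(f"missing mass {w:.3f} GeV  ->  nu = {nu:.3f} GeV")
```

At fixed Q^2 each definite mass corresponds to one value of ν, which is why a resonance shows up as a bump at a fixed place on the missing-mass axis.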

Now if we hit the proton a little harder, it will be a 
little bit harder to get all those pieces to come back to- 
gether again to form a resonance. We should still see the 
resonances but without so much glory. 

In Figure 3, as expected, the resonances are much weaker 
and there is a rather large smear in the back which is 
sometimes called the "deep inelastic scattering." It is obvious 
what that is: that is when you hit the proton and the pieces 
do not stay together, they just fly into a lot of pieces, but 
the chance that they are resonances becomes less and less. 

Figure 3. Inelastic e-p scattering cross 
section versus missing mass. 

Incidentally, it is very interesting that the way these 
resonances fall as Q^2 increases is almost exactly the same as 
the way the elastic scattering form factor goes out. There 
is a certain amplitude, that is not on these curves, which 
should be represented by a δ-function at exactly the mass 
squared of the proton, but that has been taken off so as to 
make this curve look good, otherwise this curve's elastic 
term is very big, and the size of it is the elastic scatter- 
ing, which is the ordinary form factor. The funny thing is 
that all these fall off in proportion to the elastic form 
factor. In other words, the probability that the thing holds 
together to form a proton and the probability that it holds 
together, as you give it a higher momentum transfer, to form 
some excited state is a ratio roughly independent of Q^2. As 
Q^2 gets harder and harder, it is difficult to line things up, 
but if they are more or less lined up, they might as well be 
an N* as a proton. 

Now let us consider deep inelastic scattering. Plotted 
together on Figure 4 are the data for several different 
momenta and energies as a function of ν/Q^2. What is plotted 
vertically is not W_2, but νW_2, and strictly speaking it is a 
little bit mixed up with W_1. The way that you plot it depends 
upon how much you assume W_1 is. If W_1 = 0, you get the 
upper curve, in which the dots do not fit on top of each 
other, while if W_1 is at its maximum value, then you get the 
curve marked R = 0 which is more or less a constant. What 
is interesting is that νW_2 plotted against ν/Q^2 seems to be 
approaching a universal curve so we should like to explain 
why that is a natural thing to expect. 

Figure 5. Parton Scattering Out of Incoming Proton 

As before, in Figure 5 we represent the proton coming in 
by a lot of partons. Now one of the partons is hit by an 
enormous momentum. The proton is moving along fast, but the 
sideways momentum from the photon is also very great and it 
knocks the parton off in a crazy direction, and leaves the 
other partons undisturbed. It might appear that the partons 
are all interacting with each other but they are not. When 
a thing is moving slowly, there is a lot of interaction be- 
tween the parts. When the thing is moving very fast, the 
Lorentz time shift makes all the interaction go like a slow 
clock, and therefore the momentum increases so fast that it 
does not have much time to find out what happened. That is 
to say, we do not have to worry about the interactions ahead 
of time, nor do we have to worry about them very much after- 
wards, because when you sum over all the final states of the 
system there is a principle of completeness which says that 
the sum over all the final states is the same, whether they 
interact or not. (This is not absolutely true because there 
may be a shift in the energy. And when we insist experimentally 
that there be a given loss ν, it may not be exactly the 
same as the loss of a single parton because its energy may be 
somewhat changed by the interaction. At any rate, we can 
deal with these objects as free particles.) 

The probability for this process involving the parton is 

    \mathrm{Prob} = \frac{f(x)\,dx}{2x} \; 2p_\mu \, 2p_\nu \;
        \delta\!\left( (p+q)^2 - p^2 \right) ,

where f(x) is the momentum distribution of the partons inside, 
each weighted by the square of the charge that it carries 
(in units of the electron charge), and x is the fraction of 
the z-momentum of the incoming proton that the parton has. 
The scattering probability for a photon on a parton is the 
"square" of the current, 2p_\mu 2p_\nu, if the partons had spin 
zero; and for the other cases, you get more complicated 
things. But if you just look for W_2, you only need this 
factor. The delta function is over the square of the four 
momentum before and afterwards. Let us suppose that afterwards 
it is equal to approximately p^2. So expanding, 

    \mathrm{Prob} = \frac{2 f(x)\,dx}{x} \, p_\mu p_\nu \,
        \delta\!\left( q^2 + 2xP\cdot q + \text{transverse momenta, etc.} \right) ,

where we are taking p_\mu = xP_\mu for the whole four-vector, so 
our notation for the p's is a little inconsistent. Inside 
the delta function we have a term q^2 + 2xP·q, where P is in 
the z direction, plus some transverse momenta squared. 
There is an uncertainty as to the exact form for these extra 
terms and there may be little energies of interaction of the 
final object with the original object, so there are finite 
errors here. However, we are going to take the limit as Q^2 
goes to infinity and as 2Mν goes to infinity. As the energies 
and Q's get higher, we get deeper inelastic scattering, and 
the contribution of these finite terms to the delta function 
becomes less, and we have 

    \mathrm{Prob} = 2 P_\mu P_\nu \int x \, \delta(-Q^2 + 2M\nu x) \, f(x)\,dx
                  = \frac{P_\mu P_\nu}{M\nu} \, x f(x) ,

thereby getting 

    \nu W_2 = x f(x) , \qquad x = \frac{Q^2}{2M\nu} ,

which shows that ultimately, as Q^2 and 2Mν go to infinity, 
νW_2 is an invariant function independent of any variable 
except Q^2/2Mν. 
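
The delta-function step can be checked numerically. The sketch below is illustrative only: the proton mass value, the function names, and above all the toy distribution `toy_f` are assumptions made here, not anything given in the talk. It evaluates the coefficient of P_μP_ν after the x integration and shows that, once multiplied by Mν, it depends on Q^2 and ν only through x = Q^2/2Mν.

```python
import math

PROTON_MASS = 0.938  # GeV (assumed value)


def scaling_variable(q2, nu, m=PROTON_MASS):
    """x = Q^2 / (2 M nu), the momentum fraction picked out by the delta function."""
    return q2 / (2.0 * m * nu)


def toy_f(x):
    """Hypothetical parton distribution, chosen only so that x*f(x) is finite at x = 0."""
    return 0.3 * (1.0 - x) ** 3 / x


def pp_coefficient(q2, nu, f=toy_f, m=PROTON_MASS):
    """Coefficient of P_mu P_nu from integrating 2*x*f(x)*delta(-Q^2 + 2*M*nu*x) over x.

    The delta function fixes x = Q^2/(2 M nu) and contributes a factor 1/(2 M nu).
    """
    x = scaling_variable(q2, nu, m)
    return 2.0 * x * f(x) / (2.0 * m * nu)


if __name__ == "__main__":
    # Two kinematic points with the same ratio Q^2/nu, hence the same x.
    for q2, nu in ((2.0, 4.0), (4.0, 8.0)):
        x = scaling_variable(q2, nu)
        print(x, PROTON_MASS * nu * pp_coefficient(q2, nu), x * toy_f(x))
```

Doubling both Q^2 and ν leaves the printed values unchanged, which is the scaling statement in numerical form.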

I can discuss the degree to which this should be cor- 
rect. It turns out that it should be much more correct than 
we would guess. The errors are the order of something over 
ν, and then are only important if xf(x) varies rapidly near 
the origin. But xf(x) is constant at the origin, so if you 
misplace the value of x that you are looking at, it does not 
make much difference to νW_2, so I can say that this is a 
pretty good approximation. Actually, the energy that the 
things are measured at is not so terribly high, so it is rather 
nice that the universality of νW_2 is already showing up. 
Now that I have explained the properties of the experi- 
ment in terms of the parton model, I would like to discuss 
the results of the experiment to find out what we know about 
the wave function for the proton. In the first place, we 
find that νW_2 is universal. What can we conclude from that? 
We can conclude that the charged partons are either spin 0 or 
spin 1/2. The coupling to a spin 1 particle increases with 
energy so rapidly that if there were spin 1 partons in there, 
the universality would be lost. We can even go a little bit 
further. If the spin were 0 for the partons, W_1 is zero. If 
the spin is 1/2, then W_1 reaches its maximum as expressed by 
the inequality we gave earlier. The question is, "What is 
W_1?" Now, here we go into a circle depending on how energetic 
a theorist we are. If we are sure that νW_2 is universal, then 
we are also sure that the case corresponds to spin 1/2, because 
that is the case in which the curve is more universal 
when you plot it. But if, for some other reason, the curve 
is not universal, you would be hard pressed to argue which 
one is the right curve. There are going to be some more 
experiments at other angles and higher energies, that will make 
this a little bit clearer. At the present time, I would say 
that the experiment does favor the spin 1/2 and that, in 
principle, it could decide the question as to whether the 
charged objects which contribute to the current, the partons, 
are spin 1/2 or spin 0. I will conclude myself from what I 
have seen, that they are spin 1/2, and not spin 0. 

Now it is interesting to turn these statements into a 
universal language that does not depend on our model. This 
is always a good thing to try to do. The model is just a 
scaffolding to discover something, but it is not the house. 
The first question is: What does it mean that νW_2 is universal? 
The K_{\mu\nu} which we are measuring can be expressed directly 
in terms of things that do not involve the field theory: 

    \int K_{\mu\nu} \exp(iq\cdot x) \, d^4q = \langle P | [J_\mu(x), J_\nu(0)] | P \rangle .

Therefore, in measuring K_{\mu\nu} we are measuring the property of 
the commutator of two currents' expectation for a proton 
(minus expectation for a vacuum). Everybody knows that when 
two points in space-time are separated space-like the commutator 
shall be zero, and we also know that when the two 
points are time-like related, then the commutator is not zero, 
except in a trivial case. So what type of singularity is it 
as we sweep across from one region to another? What happens 
as we cross the light cone? According to the simplest theor- 
ies, there is a delta function type singularity on the light 
cone for the above expression. It turns out that for a single 
particle in perturbation theory, particles of spin 0 and 1/2 
have an ordinary delta function across the light cone, where- 
as particles of spin 1 have a gradient of a delta function. 
The proposition that νW_2 is a universal function is equivalent 
to the statement that the singularity on the light 
cone for the expectation value of the commutator is a simple 
delta function across the light cone and involves no gradient 
of the delta function. That is, if <P|[J_\mu(x), J_\nu(0)]|P> 
varies as 

    P_\mu P_\nu \, g(t) \, \frac{1}{r} \left[ \delta(r-t) - \delta(r+t) \right]

on the light cone, then 

    \nu W_2(x) = \int g(t) \, e^{iMxt} \, dt .


We can therefore forget about partons if we want to, and 
state everything entirely in terms of commutators. 


What does it mean that the parton has spin 1/2? Gell-Mann, 
in writing his current commutation relationships, assumed 
a kind of minimal electromagnetic coupling which is 
equivalent to the statement that the singularity across the 
light cone is the least possible. He also proposed that the 
commutation rules of the currents involve axial currents 
and that the axial currents and the regular currents commute, 
just as they would if they were coupled to a spin 1/2 object. 
So I think that with a little bit of luck and a little bit of 
fiddling around, I could conclude that the idea that the W_1 
and W_2 are related in this limiting case for the spin 1/2, 
is related to the statement that the currents are part of a 
"thing" x "thing"; it does have to be SU(3), but could also 
be say SU(3) x SU(3). That means a coupling with γ_μ exists 
and that the opposite parity coupling, or the axial current, 
exists and has a coupling very similar to the normal coupling. 
Let us go on to some other properties of this distribution 
f(x). We have seen it plotted in the reciprocal variable, 
but I would like to plot x f(x) in the regular variable 

    x = \frac{Q^2}{2M\nu} .

Figure 6, when suitably normalized, yields 

    \int_0^1 x f(x)\,dx = 0.16 \quad \text{or} \quad f(x) \approx \frac{1}{x}\, 0.16 .

The first thing it shows is that there is an infinite 
number of partons because the mean number of charged partons 
goes as dx/x, and so the total number of them in an infinite- 
ly fast-moving proton would be logarithmically infinite at 
the low end. That is perfectly all right. However, it is 
very interesting to take a simple perturbation theory and 
make a model, say of proton going along that disintegrates 
into a parton of spin 1/2 and another parton of spin 0. If 
we then ask what is the distribution of this parton of spin 
0, we will find that for spin 0, the distribution must always 
be x dx; for spin 1/2, dx; and for spin 1, dx/x. This is another 
representation of the angular momentum group and the 
angular momentum properties. For a single particle with 
nothing else around, perturbation theory gives the distribution 
x^(1-2α) dx for small x where α is the particle's spin, 
and a higher angular momentum would be more singular. Now, 
here we have a little peculiarity because dx/x is what you 
get from a vector particle, and dx/x is what we found; but 
a vector particle is the very one that we could not allow to 
couple. However, our arguments were based on the erroneous 
idea that a single particle could be in the small x region 
and that it is all right to do perturbation theory. There 
is an infinite number of particles for small x, and one cannot 
proceed by assuming a single particle and perturbation 
theory for the strong interactions. 

However, very much like the particles in electrodynamics, 
when an electron is moving fast the field is condensed into 
a narrow splat by the Lorentz transformation, and if we ana- 
lyze that in momentum space we get a uniform distribution in 
momentum space of the energy; the energy of each photon times 
the number of photons goes as dk/ω. So the reason for the 
dx/x is this dk/ω -- the field is squashed. (Incidentally, 
dk/k looks like dk/ω, and if the photon had a finite mass as k 
goes to 0, it would have a finite limit. So does dx/x. It 
doesn't look like a finite limit, because the x is on a momentum 
scale which has been increased by the factor W to an enormous 
amount.) 

Now the next thing of interest is the integral, 
∫ x f(x) dx = 0.16. What is the normalization and what does 
0.16 mean? I can express it this way: Suppose for a moment 
that all the partons were charged. If they all had unit 
charge, the square of their charges is also 1.0, and then 
this integral would be simply the sum of the momenta of all 
the particles. It is so normalized that it would be the total 
momentum of the original system and would be 1.0. It is remarkable 
that this comes out to be only 0.16, which is very 
small. For example, it seems to say that the number of neu- 
trals in the system is five times the number with charges 
which is a little hard to accept. It could mean that the par- 
tons have a non-integral charge. There has been a lot of 
talk about quarks, and the proton may be made of two quarks 
of charge 2/3 and one of charge 1/3, in which case that num- 
ber would come out 0.33, if all there was in the proton were 
three quarks. But, we just discovered that there must be an 
infinite number of partons in there. As Paschos has said, 
"Yes! Those are millions of pairs of quarks and antiquarks." 
So what we should do is take the statistical average of the 
2/3 squared, the 1/3 squared, and the 1/3 squared, every kind 
of quark being equally likely. That comes out 0.22. But 
Paschos has disregarded the fact that the quarks will pro- 
bably interact if there is going to be an infinite number of 
them. They are not going to be free, and there must be some- 
thing which carries momentum back and forth between the quarks 
through interactions. This would be equivalent, I believe, 
to having a neutral quark around. So the interactors would 
lower it still further, and maybe it is quarks that are 
interacting. On the other hand, as long as the interactions 
are lowering it you can say that there is ordinary charge 
and an awful lot of interaction, such that there are indeed 
five neutrals for a single charged one. However, it is very 
interesting that these points of view are leading us into 
"semi-difficulties", or if you like, indications that the 
quarks and the partons are related. 
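
The numbers quoted in this argument follow from simple averages of squared charges, which the sketch below reproduces with exact fractions. It assumes, for illustration only, that every constituent carries an equal share of the momentum and that "every kind of quark" means three kinds with charge magnitudes 2/3, 1/3 and 1/3; the variable and function names are introduced here.

```python
from fractions import Fraction


def mean_square_charge(charges):
    """Average of charge^2 over a list of equally weighted constituents."""
    return sum(Fraction(c) ** 2 for c in charges) / len(charges)


# Three quarks only: two of charge 2/3 and one of charge 1/3 (magnitudes as in the text).
three_quarks = [Fraction(2, 3), Fraction(2, 3), Fraction(1, 3)]

# Every kind of quark equally likely: assumed here to mean three kinds
# with charge magnitudes 2/3, 1/3, 1/3.
equal_kinds = [Fraction(2, 3), Fraction(1, 3), Fraction(1, 3)]

# If every charged parton carried unit charge, a measured 0.16 would be the
# momentum fraction on charged partons, leaving 0.84 on neutral ones.
measured = Fraction(16, 100)
neutral_to_charged = (1 - measured) / measured

if __name__ == "__main__":
    print(float(mean_square_charge(three_quarks)))  # exact value 1/3
    print(float(mean_square_charge(equal_kinds)))   # exact value 2/9
    print(float(neutral_to_charged))                # exact value 21/4
```

The three results correspond to the 0.33, the 0.22, and the roughly five-to-one ratio of neutrals to charged partons mentioned above.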

One might ask: What is meant by the statement that the 
charge is an integer? It turns out that it can only be ex- 
pressed as a property of four currents. Two current opera- 
tors can make a commutator, but four currents can make com- 
mutators of commutators and things like that. The statement 
that the charge must be an integer on all the partons is a 
statement about a commutator with four currents in it. Ex- 
perimentally, one can get a four current commutator by mea- 
suring the inelastic Compton effect from a proton at very 
high energy if the parton view is right. This is because in 
the Compton effect the amplitude goes as the charge squared; 
and when one works this model, one will be weighting charge to 
the fourth power. So by having another number on the charge, 
we can compare the charges. For example, if all are unit 
charges, then the same integral, where the weight is the 
charge to the fourth power, would give the same number. But, 
if the charges were less than a unit, then the integral for 
the Compton effect would give a smaller number. Of course, 
it is technically not quite feasible to do the Compton scat- 
tering. So I know of no direct way of testing it. 

An attempt has been made to apply these ideas to had- 
ronic scatterings, and I have made a number of speculations 
in print. For example, I can talk about the distribution of 
partons inside the hadrons and also about the distribution of 
hadrons which come out of a collision, and they are likely to 
be related but it is not obvious exactly how. One interesting 
feature, though, is this: Suppose for a moment that a proton 
moving to the right has a wave function to be partons and 
suppose that the exact wave function has virtually zero, or 
very low, amplitude to find any partons of very small x. That 
is, suppose all the partons are moving to the right and there 
is none standing still. Suppose also that another guy is 
moving the other way with exactly the same property. His par- 
tons are all moving to the left and nobody is nearly standing 
still. I am now talking about the wave function, not for the 
infinitely high speed, but for a very high speed where the 
limit has not quite been taken. Then each wave function is a 
solution for the whole Hamiltonian. The right moving solution 
involves only the right moving piece of the Hamiltonian, 
but it is a solution of the complete Hamiltonian. However, 
all the states in the right moving object are different from 
those in the left moving object, that is all the parton ex- 
citations are different. So if you will multiply these two 
wave functions together, it will still be a solution of the 
Hamiltonian equation. In short, there is no interaction! 
There will be an interaction if one particle in the 
right moving proton can be thought of as being in the left 
moving proton. When we say that the distribution is dx/x, 
where xW is equal to the momentum of the parton in the z 
direction and W is the total momentum, it is the same as saying 
dp_z/p_z as long as p_z is large. But if x is of the order 
1/W, then p_z is no longer large, and I could not have taken 
out the square root. It is true that it is dx/x as long as 
x is bigger than a small quantity 1/W; however, if x is of 
the order of 1/W, then the argument that the partons cannot 
go backwards and that for the dx/x distribution begin to fail. 
So in this coordinate system, there is a certain amplitude 
for finding partons at rest and that amplitude is approximately 
the integral of dx/x from 0 to 1/W. The integral exactly 
is not of dx/x, but rather is of something like dk/ω. 
It is a cutoff, it is a function that looks like dx/x for 
large x, and then it cuts off somewhere of the order 1/W. So 
the area under this cutoff is approximately the cutoff height 
W times the width 1/W. In other words, the integral below 
1/W is finite and independent of W. So a dx/x distribution 
of partons permits an interaction which leads to a cross section 
independent of the energy in the higher hadronic collisions. 
This may not be exactly right, there may be some errors, 
maybe a slight variation logarithmically or something. 
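
The claim that the integral below 1/W is finite and independent of W can be illustrated numerically. The sketch assumes a specific cutoff form, dx / sqrt(x^2 + (m/W)^2), which is dp_z/E for a parton of mass m with p_z = xW; that form, the unit mass scale, and the function name are assumptions made here, since the text says only "something like dk/ω".

```python
import math


def slow_parton_weight(w, mass=1.0, p_cut=1.0, steps=20000):
    """Midpoint-rule estimate of the integral of dx / sqrt(x^2 + (mass/w)^2)
    from x = 0 to x = p_cut / w.

    With p_z = x * w this is dp_z / sqrt(p_z^2 + mass^2), which behaves like
    dx/x for x >> mass/w and levels off at height w/mass below that.
    """
    upper = p_cut / w
    h = upper / steps
    scale = mass / w
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += h / math.sqrt(x * x + scale * scale)
    return total


if __name__ == "__main__":
    # The closed form of this integral is asinh(p_cut / mass), with no W in it.
    for w in (10.0, 1.0e3, 1.0e6):
        print(w, slow_parton_weight(w), math.asinh(1.0))
```

The height of the integrand grows like W while the width shrinks like 1/W, so the area stays fixed as W increases.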
If I suppose that the dx/x distribution extends also to 
the real particles which are produced in the collision, then 
the probability to have an exchange collision is the probability 
that there are no emitted particles. If a proton 
comes in, say, and exchanges a charge to become a neutron, it 
is the probability that the neutron is quiet. If I suppose 
that the probability that there is an emitted particle is 
also distributed as dx/x up to order 1/E, then 

    \mathrm{Prob} = \int_{1/E} c \, \frac{dx}{x} = c \ln E , \qquad c = \text{constant} ,

and the mean number of emitted particles is \bar{n} = c \ln E, that 
is the mean number of pions and junk to be expected in an inelastic 
collision. But for charge exchange where no pions 
come out, the probability that none of them does come out is 
e^{-\bar{n}} = E^{-c}. This is the reason why when you have nice 
exchange reactions, where you ask for a specific exchange, 
things fall off with a power of the energy. They fall be- 
cause the probability that they do not shatter the product 
into a lot of tiny partons, each with a mean number logar- 
ithmic will fall. I also expect the multiplicity in high 
energy collisions to go logarithmically, which apparently it 
does, as the number slowly increases in cosmic ray experi- 
ments. 
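
The step from a logarithmic mean multiplicity to a power-law fall-off can be written out in a few lines. The sketch assumes the emissions are independent, so that the chance of emitting nothing is exp(-n) for mean number n; the constant c, the energy units, and the function names are hypothetical.

```python
import math


def mean_multiplicity(energy, c):
    """Mean number of emitted particles, n = c * ln(E), for a dx/x spectrum cut off at 1/E."""
    return c * math.log(energy)


def prob_no_emission(energy, c):
    """Probability of emitting nothing, exp(-n), assuming independent (Poisson) emissions."""
    return math.exp(-mean_multiplicity(energy, c))


if __name__ == "__main__":
    c = 1.5  # hypothetical value of the constant
    for energy in (10.0, 100.0, 1000.0):
        # The last two columns agree: exp(-c ln E) is the power law E^(-c).
        print(energy, mean_multiplicity(energy, c),
              prob_no_emission(energy, c), energy ** (-c))
```

The multiplicity grows only logarithmically, while the probability of a clean exchange with nothing emitted drops as a power of the energy.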

In closing, the theoretical concept that I would like to 
emphasize is not so much the "partons", but that it is the 
wave functions of fast moving particles which might be useful 
for analyzing strong interactions, especially at high ener- 
gies. By the way, there is no approximation made here and 
that is the complete wave function, so you could deal with 
any energy. For a different reason I would like, on the 
other hand, to emphasize for further study the interesting 
results of the experiments on the deep inelastic cross- 
sections. There are quite a few numbers associated with them, 
as well as simple relations, from which we ought to be able 
to learn something more, whether we do it by partons or by 
any other method. 




DISCUSSION OF PROFESSOR FEYNMAN'S TALK 


JULIAN ROSENMAN (University of Texas): 


Can individual, free partons ever be seen? 

FEYNMAN: No, I don't believe so and, at any rate, I'm sup- 
posing not. Just like the proton cannot have any amplitude 
to be a pure parton, the pure parton would itself disinte- 
grate into real particles. This would be true, for example, 
for the parton kicked out by the photon in the deep inelastic 
scattering. There also might be some interaction with the 
remaining partons on the way out but I don't think that's as 
important as its disintegration. It could be that the par- 
ton's energy is higher than any of the fractions into which 
it can go. 

ROSENMAN: Doesn't this mean that partons must have an inte- 
gral charge? 

FEYNMAN: Yes. So one worries if partons are quark-anti- 
quark pairs, as Paschos suggests. What can we do, because 
certainly such a parton cannot be coming out? One of the 
ways that we can fix it is to assume that the interaction between 
them is, say, a harmonic force to produce the linear 
masses, so when you pull a quark away, it snaps back by the 
harmonic force. Likewise, the kicked parton would be pulled 
back and all mixed up with the other ones, and the net result 
would still be pions and protons, not quarks. But it seemed 
to me at first that this large force would obviate the argu- 
ment about the small energies. It doesn't. What matters is 
not the strength of the forces, but rather the spacing be- 
tween the energy levels. The large force only means it takes 
longer to come around, and I've already summed over all the 
states and that's independent of things because of completeness. 
In fact, using the harmonic oscillator potential, I 
analyzed this and found that this theory should be right as 

    f\!\left( \frac{Q^2 + (300)^2}{2M\nu} \right)

for a spacing between the levels of 300 MeV, as 
for the proton excitation spectrum. 

EDWARD M. MACKHOUSE (University of Texas): 

If the partons are attracted with a quadratic force, do you 
postulate some other force to keep the self-energies from 
blowing up, to avoid the divergences? 

FEYNMAN: What I postulate is that I can go as long as pos- 
sible so long as I evade the question! It's true that any 
field theory has its divergences and it is also true that the 
proposition that the transverse momenta are limited is not 
quite right for a field theory. If you do it right 
field-theoretically, when you integrate over the transverse 
momenta, you get divergences, and so I think that there is 
something limiting the transverse momenta in a way that I do not 
understand, which makes convergence a lot easier. In other 
words, you leave your formulas with integrations over trans- 
verse momenta and just interpret them as numbers without 
actually carrying out the integration. 

So I don’t know how to cure the field theory and, there- 
fore, I do not really propose that at the present time we 
should start with some specific field theory. I'm trying to 
meet both ways across the middle in order to have unitarity, 
and on the other hand to avoid divergences. 

ROBERT J. YAES (University of Texas): 

Can particles interact by the exchange of a single parton 
which would give an appropriate peaking in the crossed chan- 
nel? 

FEYNMAN: No! The two things interact due to the overlap between 
the partons that have zero momentum, but that's not 
one parton; it's a characteristic of the system that the 
mean number of partons in there is infinite, not one. 
Perturbation theory is wrong. Here, diagrams of one, two, ... 
interacting partons, all going across, are to be added together 
to produce a resonance "gluck-glock" also known as a 
Regge particle. 

ROSENMAN: Doesn't the fact that partons decay imply that 
they have structure? 

FEYNMAN: No! Such a decay only has to do with the energies. 
It is possible theoretically to have, say, a structure of 
three much lighter than just one, so you can't produce only 
one, for it would disintegrate into two three's or into some 
other combination of complicated objects. But these under- 
lying objects must not be like quarks with a carrier quantum 
number of one-third, because then nothing can disintegrate. 
The quark picture is different, there you have to invent some- 
thing to hold them together if you don't see them. 

E. C. G. SUDARSHAN (University of Texas): 

Could it be that the partons are like the ether for light 
waves -- that the vibrations in the partons are like vibra- 
tions in the ether and may never be found? 

FEYNMAN: Yes, and in the case of the ether never being 
found, it was ultimately realized that there wasn't any 
ether at all, the ether was one of these scaffoldings to 
create a theory. It was later realized that the ether was an 
irrelevant complication and it may be that the partons are 
also nonexistent. The partons were only put in there to have 
a field theory but the world, in fact, is not field theory, 
rather it's something else. However, the world has unitarity, 
analyticity and relativistic invariance, ..., and so we use 
the field theory with partons. Thus, "this" and "that" is 
right but the real view is wrong, just like with the case of 
the ether. 


LECTURE 1 


PARTONS, SCALING, AND REGGE BEHAVIOR 


The general outline of these lectures will be some- 
thing like this: first, I will describe the general 
ideas of the parton concept. Then I will point out 
how the experiments on deep-inelastic scattering of 
electrons and neutrinos by nucleons give us infor- 
mation about the nature and distribution of the 
partons. This will be followed by a discussion of 
the applications of parton ideas to hadronic colli- 
sions. I shall unfortunately not have time to 
discuss predictions of the products (the particles 
that come out) in deep-inelastic scattering. I will 
also discuss a number of theoretical ideas that have 
never been worked through to the end. 

It should be pointed out that the parton theory 
is a conglomeration of rather imprecise ideas. In 
particular, I want to emphasize that in hadronic 
collisions parton ideas have not been very effective. 
They did strongly suggest that there would be scaling
and a plateau in the rapidity plot, and these suggestions
appear to be correct, but we haven't gotten
anywhere with the detailed questions--proportions of
$\pi$'s and K's to be expected, and that sort of thing--
and that has been a disappointment to me. The idea
of partons was originally conceived in an attempt to
understand high-energy hadronic collisions, and I
worked intensively on this in the summer of 1968. At 
that time I made a visit to SLAC and learned about 
the results on deep inelastic scattering of electrons. 
I saw that the experiments were tailor-made for 
investigating partons, and were easy to interpret in 
terms of the pictures I had already developed for the
strong interactions. The deep-inelastic scattering 
experiments can identify the kinds of partons and how 
they are distributed, whereas, as yet, I see no clear 
way to do this from a study of hadronic collisions. 
The idea of partons is an old one--it is a field 
theory idea. In a field theory the physical state of 
a particle, say a proton, is described by a wave- 
function which gives the amplitudes for finding 
various configurations inside the proton. Configu- 
rations of what? If we consider quantum electrody- 
namics, the wavefunction for positronium contains 
some amplitude for finding an ideal Dirac electron- 
positron pair, an amplitude for finding those plus a 
virtual photon, etc. There is a certain amplitude-- 
small because the coupling is small--for finding two 
electrons and two positrons. Now the constituents 
whose distributions are given by these amplitudes are 
the objects created by the field operator $\psi$, which
are not physical electrons. They are what we sometimes
call "bare" electrons. The physical electron
we find in the laboratory has some amplitude to be a
bare electron, but also some small amplitude to be a
bare electron plus virtual photons, additional pairs,
etc. It is unfortunate for our language of description
(although very fortunate for mathematical simplicity)
that the bare and physical electrons are
approximately the same. That is, the physical electron
has around 99% probability ($\sqrt{0.99}$ in the amplitude)
for being a bare electron and only a small
probability for being more complicated. In strong
interactions, the latter probability will not be
small, and the physical particles and bare objects of
the underlying field theory need not be so closely
connected. So we introduce the word "parton" to
refer to these bare constituents. The assumption is
that we have an underlying field theory which contains 
operators of various types, which carry spin, isospin, 
strangeness, and so on. The partons are the objects 
created by these basic field operators. They are the 
quanta of the fields of the underlying field theory. 
In this picture, the wavefunction of a proton
contains the amplitudes to find configurations of
partons with various momenta. In principle, the
wavefunction contains an amplitude $C_0$ to find nothing
at all (for a proton, $C_0 = 0$ since the vacuum does
not have the quantum numbers of a proton), an amplitude
$C_{1,i}(\vec p)$ to find one parton of type i (which index
specifies such things as spin, isospin, etc.) with
momentum $\vec p$, an amplitude $C_{2,ij}(\vec p_1, \vec p_2)$ to find two
partons of types i and j with momenta $\vec p_1$ and $\vec p_2$, and
so on. Mathematically, we could write

$$
|\psi\rangle = C_0|0\rangle + \sum_i \int d^3p\, C_{1,i}(\vec p)\, a_i^*(\vec p)\,|0\rangle
+ \sum_{i,j} \int d^3p_1\, d^3p_2\, C_{2,ij}(\vec p_1,\vec p_2)\, a_i^*(\vec p_1)\, a_j^*(\vec p_2)\,|0\rangle + \cdots
$$

where $|0\rangle$ represents the vacuum, and $a_i^*(\vec p)$ is the
creation operator for a parton of type i and momentum
$\vec p$. We will ignore all complications of field
theory which may make the rigorous basis of such an
expression suspect.

If such an expression does make some sort of 
sense, there will be a wavefunction of this sort for 
a proton in any circumstance: at rest, moving along 
the z-axis with a certain momentum, and so on. For 
collisions at high energy, we need the wavefunction
for a particle with a large momentum. This is not
easily obtained from the wavefunction for the particle
at rest, however. The wavefunction is not
relativistically invariant, and the transformation
which connects one Lorentz frame with another involves
the Hamiltonian, which we do not know--it is our hope
that we can understand many features of high-energy
collisions without a detailed knowledge of the Hamiltonian.
Thus we are led to try directly to guess or
understand the wavefunction of a rapidly moving
particle.

One of the first features we come across is the 
scaling behavior of the wavefunction. Let us consi- 
der a proton, say, of large total momentum P, and 
consider the amplitude that this proton consists of 
just two partons with momenta $\vec p_1$ and $\vec p_2$. We write
the longitudinal momentum of one parton as $x_1 P$ and of
the other as $x_2 P$. Then we have, of course, $x_1 + x_2 = 1$.
The partons will have transverse momenta $Q_1$ and
$Q_2$. In that part of the wavefunction where $Q_1$ and
$Q_2$ are finite (and remain finite as $P \to \infty$), the
amplitude for finding $x_1$, $x_2$, $Q_1$, and $Q_2$ is independent
of P in the high P limit. If, in fact, there is
no substantial amplitude for finding partons with 
large transverse momenta, then the wavefunction itself 
may be regarded as scaling in this way. That this 
may be true was suggested to me by the observed 
behavior in high-energy collisions, where we find the 
products limited to about 300 MeV/c in transverse 
momentum. This makes it reasonable to suppose that 
the wavefunction does not contain partons with large 
transverse momentum. 

The simplest elementary argument to support the 
general notion of scaling of the wavefunction if the 
transverse momenta are limited is the following. 


Suppose the wavefunction refers to a particle with
total energy E. In perturbation theory, the amplitude
for finding this as two subsystems of energies
$E_1$ and $E_2$ will be proportional to an energy denominator
factor

$$ \frac{1}{E - E_1 - E_2} . $$

Let P be the longitudinal momentum of the particle,
and Q its transverse momentum. The partons are
assigned longitudinal momenta $x_1 P$ and $x_2 P$ and transverse
momenta $Q_1$ and $Q_2$. We are interested in the
limit of very large P, with the Q's fixed. Since
conservation of momentum requires $x_1 + x_2 = 1$, we can
rewrite the energy denominator in the form

$$ (E - P) - (E_1 - x_1 P) - (E_2 - x_2 P) . $$
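Letting m be the particle mass, and $m_1$ and $m_2$ the
parton masses, we have

$$ E = \sqrt{m^2 + P^2 + Q^2} , $$

$$ E_1 = \sqrt{m_1^2 + x_1^2 P^2 + Q_1^2} , $$

and

$$ E_2 = \sqrt{m_2^2 + x_2^2 P^2 + Q_2^2} . $$

When P is very large, we can approximate these
expressions by

$$ E \approx P + \frac{m^2 + Q^2}{2P} , $$

$$ E_1 \approx x_1 P + \frac{m_1^2 + Q_1^2}{2 x_1 P} , $$

and

$$ E_2 \approx x_2 P + \frac{m_2^2 + Q_2^2}{2 x_2 P} . $$

Inserting these expressions in the energy
denominator, the amplitude becomes

$$ \frac{2P}{(m^2 + Q^2) - \dfrac{m_1^2 + Q_1^2}{x_1} - \dfrac{m_2^2 + Q_2^2}{x_2}} . $$

There will also be normalization factors like $1/\sqrt{2\omega}$,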
etc. In general, as $P \to \infty$, these factors multiplied
by the P in the numerator will either go to zero or
approach a non-zero constant. The limiting amplitude
thus depends only on the x's and the transverse
momenta. Note that in the approximations used for
the energy we have assumed the x's are not too small;
clearly, the argument breaks down for x's less than
order 1/P. This is what we will call the "wee"
region, and we will find that in the wee region of x
there are a large number of partons, which complicate
the interpretation of the wavefunction. We
also note that there is negligible amplitude to find
partons with negative (and non-wee) values of x. For
if $x_1$ is negative, $E_1$ is $|x_1|P$ and $E_1 - x_1 P$ is
$2|x_1|P$, not of order 1/P as before but of order P.
So the large denominator as $P \to \infty$ makes the amplitude
negligible for finite negative x.
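
The limiting argument above can be checked numerically. The following is only a sketch under stated assumptions: the masses and transverse momenta are arbitrary illustrative numbers (not values from the lecture), the transverse momenta are treated as simple magnitudes, and the function names are hypothetical. It evaluates 2P(E - E_1 - E_2) with the exact square-root energies, compares it with the P-independent limit, and shows that a negative x_1 gives a denominator of order P rather than 1/P.

```python
import math


def energy_minus_pz(mass, q_t, p_z):
    """Return E - p_z for E = sqrt(mass^2 + q_t^2 + p_z^2).

    For p_z > 0 the difference is computed as (mass^2 + q_t^2) / (E + p_z)
    to avoid subtracting two nearly equal large numbers.
    """
    mt2 = mass ** 2 + q_t ** 2
    energy = math.sqrt(mt2 + p_z ** 2)
    if p_z > 0:
        return mt2 / (energy + p_z)
    return energy - p_z


def scaled_denominator(P, x1, m=0.94, m1=0.3, m2=0.3, Q=0.0, Q1=0.3, Q2=0.3):
    """2P * (E - E1 - E2) using the exact energies (all defaults are assumed values)."""
    x2 = 1.0 - x1
    d = (energy_minus_pz(m, Q, P)
         - energy_minus_pz(m1, Q1, x1 * P)
         - energy_minus_pz(m2, Q2, x2 * P))
    return 2.0 * P * d


def limiting_denominator(x1, m=0.94, m1=0.3, m2=0.3, Q=0.0, Q1=0.3, Q2=0.3):
    """Large-P limit (m^2+Q^2) - (m1^2+Q1^2)/x1 - (m2^2+Q2^2)/x2, valid for 0 < x1 < 1."""
    x2 = 1.0 - x1
    return (m ** 2 + Q ** 2) - (m1 ** 2 + Q1 ** 2) / x1 - (m2 ** 2 + Q2 ** 2) / x2


if __name__ == "__main__":
    for P in (1e1, 1e2, 1e4):
        print(P, scaled_denominator(P, 0.3), limiting_denominator(0.3))
    # Negative x1: the scaled denominator grows like P^2 instead of settling down.
    print(scaled_denominator(1e4, -0.2))
```

For positive x the scaled denominator settles to a constant as P grows, which is the scaling statement; for negative x it keeps growing, which is why such configurations are suppressed.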

I am going to interrupt this discussion at this 
point to discuss one of the ideas I mentioned earlier 
that only partially worked. This concerns the ideas 
about high-energy collisions that come from the Regge 
pole approach. We find that various cross sections
fall as inverse powers of s, the square of the total
center-of-mass energy, in such a manner that the
powers depend on the quantum numbers exchanged but
not on the precise process involved. For example, in
a collision that involves $\rho$ exchange, the cross
section as a function of s at fixed momentum transfer
t determines a power of s, $\alpha(t)$. This power $\alpha(t)$ is
(supposedly) the same as found in any other collision
involving $\rho$ exchange. This is analogous to the case
of resonances: the position of the resonance is 
independent of the process in which it is formed. 

Now suppose I gave you the strong interaction Hamil- 
tonian and asked you to compute the various powers 
that we find in Regge theory. You could simply use
the Hamiltonian to work out the details of a particular
process, say $\pi^- + p \to \pi^0 + n$, and extract the
powers from the high-energy behavior of the cross
section. But that would be something like getting
the mass of the $\rho^-$ from the same process by computing
all the details of $\pi^- + p \to p + \pi^- + \pi^0$ and
looking for the peak in the $\pi\pi$ mass distribution.
That isn't the way we calculate energy levels. We
expect, instead, to be able to solve an eigenvalue
equation, $H\psi = E\psi$. In other words, we use the
Hamiltonian directly to find an eigenvalue which does 
not depend on a specific process. The question is
whether or not the Regge parameters can also be found
in some such deeper way.

Suppose we have a scattering process $A + B \to C + D$,
which we represent schematically by a diagram:

[Diagram: A goes to C and B goes to D, with an exchange between the two lines.]

This is supposed to indicate that something (a "Reggeon?")
is exchanged in the process. From our point
of view, some "pieces" of the fast moving particle A
fall off and are absorbed by B. Now in the center-of-mass
system, as the momentum P of A and B goes to
infinity, both the longitudinal momentum and the
energy that are transferred by the exchange will be
of order 1/P, and so we neglect them in high-energy
collisions. The pieces of A that are emitted are
those with momentum not too far from zero. Let us
take F(E) as the amplitude for A emitting this
"stuff" when A has energy E in the center-of-mass.
We let $\beta$ be the amplitude for propagating it to the
other particle, and G(E) be the amplitude for B's
absorbing it. The overall amplitude is

$$ F(E)\,\beta\,G(E) . \qquad (1) $$

Now we suppose that the amplitude F(E) that A emits
this stuff does not depend on what particle B absorbs
it and how it is done. That is, the amplitude above
factors: F(E) does not depend on B.
Now this time suppose particle A has energy $E_1$
and it emits the "thing" with amplitude $F(E_1)$, but
now particle B has a different energy, say $E_2$
(moving, of course, in the opposite direction to A),
so the proper absorption amplitude is $G(E_2)$, and our
amplitude is

$$ F(E_1)\,\beta\,G(E_2) . \qquad (2) $$

But relativity tells us that this must not depend on
the coordinate system, and hence can only depend on
the invariant $E_1 E_2$ ($s = 4E_1E_2$). For example, we may
have transformed by velocity v from the center-of-mass
system, so A's energy is increased by a factor
$f = \sqrt{(1+v)/(1-v)}$, and B's decreased by the same
factor: $E_1 = fE$, $E_2 = E/f$, and (2) must not depend
on the factor f. If (2) is to depend only on the
product $E_1E_2$, then $F(E_1)$ and $G(E_2)$ each must be
proportional to a power of E, the same for each:

$$ F(E) \sim E^{\alpha} , \qquad G(E) \sim E^{\alpha} , $$

and the overall amplitude goes as $s^{\alpha}$. (One might
object that the "stuff" being exchanged in the system
at velocity v is not the same as that in the center-of-mass,
so $\beta$ depends on f. But the stuff has zero
energy and longitudinal momentum, so when transformed
it is the same.) We are led in this way to associate
Regge behavior with the exchange of something with
very low momentum. This involves the probability for
finding slow partons in a wavefunction for a particle
with large momentum, which in principle can be calculated
directly given the Hamiltonian.
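
The frame-independence argument can be illustrated with a few lines of arithmetic. This is a minimal sketch, not part of the lecture: the exponent, the propagation amplitude, and the energy are assumed numbers, and the function names are hypothetical. It shows that a product F(fE) G(E/f) built from equal power laws does not change with the boost factor f, while a non-power-law guess (here a logarithm, chosen arbitrarily) does.

```python
import math


def exchange_amplitude(F, G, beta, E, f):
    """Amplitude F(E1) * beta * G(E2) in a frame where E1 = f*E and E2 = E/f."""
    return F(f * E) * beta * G(E / f)


def power_law(alpha):
    """Return the function E -> E**alpha."""
    return lambda E: E ** alpha


def boost_factor(v):
    """f = sqrt((1+v)/(1-v)) for a boost with velocity v (units with c = 1)."""
    return math.sqrt((1.0 + v) / (1.0 - v))


if __name__ == "__main__":
    alpha, beta, E = 0.5, 2.0, 10.0   # assumed illustrative numbers
    F = G = power_law(alpha)
    for v in (0.0, 0.5, 0.9):
        # Same value in every frame: beta * (E1 * E2)**alpha
        print(v, exchange_amplitude(F, G, beta, E, boost_factor(v)))

    def log_guess(energy):
        """An arbitrary non-power-law form, used only as a counterexample."""
        return math.log(1.0 + energy)

    for v in (0.0, 0.5, 0.9):
        # This product changes with the frame, so it is not allowed.
        print(v, exchange_amplitude(log_guess, log_guess, beta, E, boost_factor(v)))
```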

In order to see how this might be done, I
analyzed the situation as follows. Let F be some
large energy scale, and consider a wavefunction for
a particle with longitudinal momentum $P_z = x_0 F$. (We
are going to keep $x_0$ fixed and let $F \to \infty$. $x_0$ is
arbitrary, of course; you may choose it to be 1.)
This wavefunction is a solution of $H\psi = E\psi$, where H
is the (unknown) Hamiltonian of the system. $\psi$ is
also an eigenfunction of the operator for longitudinal
momentum, $P_z$; that is, $P_z\psi = x_0 F\psi$. We will
assume that $x_0 F$ is very large; then the energy E is
nearly equal to $x_0 F$, and using the same approximation
as earlier we have

$$ E \approx x_0 F + \frac{m^2 + Q_0^2}{2 x_0 F} , $$

with $Q_0$ being the transverse momentum. Now we can
write

$$ 2F(H - P_z)\psi = \frac{m^2 + Q_0^2}{x_0}\,\psi . $$

If we define an operator

$$ W = \lim_{F \to \infty} 2F(H - P_z) , $$

then $\psi$ at infinite momentum is a solution of $W\psi = w\psi$,
where the eigenvalue is

$$ w = \frac{m^2 + Q_0^2}{x_0} . $$

Applying this idea with various model Hamiltonians
suggests W operators which depend only on x's and Q's,
which is how I arrived at the scaling idea for the
wavefunction $\psi$.
Let's take a very easy example--the scalar $\phi^3$
theory, described by the Lagrangian

$$ \mathcal{L} = (\nabla\phi)^2 - \mu^2\phi^2 + g\phi^3 . $$

The Hamiltonian for this theory has the form

$$
H = \sum_k \omega(k)\, a^*(k)\, a(k)
+ g \sum_{k_1,k_2,k_3} \frac{\delta^{(3)}(k_1 + k_2 + k_3)}{\sqrt{2\omega_1\, 2\omega_2\, 2\omega_3}}
\,[a(k_1) + a^*(-k_1)]\,[a(k_2) + a^*(-k_2)]\,[a(k_3) + a^*(-k_3)] ,
$$

where $a^*(k)$ is the creation operator for a scalar
particle of momentum k, $\omega(k) = \sqrt{k^2 + \mu^2}$, and
$\omega_i = \omega(k_i)$. The operator for longitudinal momentum
is

$$ P_z = \sum_k k_z\, a^*(k)\, a(k) . $$

Now we want to take $k_z = xF$ and study the limit
$F \to \infty$. If x is positive, we have $\omega \to k_z$ in this
limit and so we can use the usual approximation

$$ \omega - k_z \approx \frac{\mu^2 + Q^2}{2xF} . $$

For x negative, however, $\omega$ approaches $-k_z$ and so

$$ \omega - k_z \approx 2|k_z| = 2|x|F . $$

Now we see that if we write $2F(H - P_z)$ in terms of the
x's and transverse momenta, we get

$$
\sum_{x>0} \frac{\mu^2 + Q^2}{x}\, a^*(x,\vec Q)\, a(x,\vec Q)
+ \sum_{x<0} 4F^2|x|\, a^*(x,\vec Q)\, a(x,\vec Q)
$$

for the kinetic energy contribution to the operator. 
The second term here looks like the largest part, but 
it is really unimportant except in the following 
sense. If we start with a wavefunction containing 
positive x's, and in a perturbation treatment some
operator generates something with a negative x, this
term will give an enormous contribution to the denominator
in the $1/(H - E)$ factor in the perturbation
expansion. So the overall effect of this term is
simply to kill all negative x's--to get rid of things 
moving backward. 

The operators $a^*(x,\vec Q)$ introduced in this
expression are not the same as the original creation
operators $a^*(k)$. The latter obey commutation relations

$$ [a(k_1),\, a^*(k_2)] = \delta^{(3)}(k_1 - k_2) , $$

whereas we want the former to obey

$$ [a(x_1,\vec Q_1),\, a^*(x_2,\vec Q_2)] = \delta(x_1 - x_2)\, \delta^{(2)}(\vec Q_1 - \vec Q_2) . $$

Since $\delta(k_{1z} - k_{2z}) = F^{-1}\delta(x_1 - x_2)$, this implies we must
introduce a factor $\sqrt{F}$ in the definition of the new
operators; that is,

$$ a^*(k) \to F^{-1/2}\, a^*(x,\vec Q) . $$

On the other hand, the sum over longitudinal momenta
will be F times a sum over x, so that we also have
the replacement

$$ \sum_k \to F \sum_{x,\vec Q} . $$

We see that the scale factors cancel in the terms
involving $\sum a^* a$. When we consider the interaction
term, converting the sums over $k_1$, $k_2$, and $k_3$ to sums
over the x's and transverse momenta introduces a
factor $F^3$, while transforming the a's gives $F^{-3/2}$.
The factors $1/\sqrt{2\omega} \approx 1/\sqrt{2|x|F}$ give another $F^{-3/2}$, and
converting the $\delta$-function of longitudinal momentum to
a $\delta$-function of the x's brings in 1/F. Overall, we
have just the factor 1/F left, which is canceled by
the F in $2F(H - P_z)$. Putting this all together, the
interaction contribution is

$$
\frac{g}{\sqrt{2}} \sum_{x_i,\vec Q_i}
\frac{\delta^{(2)}(\vec Q_1 + \vec Q_2 + \vec Q_3)\, \delta(x_1 + x_2 + x_3)}{\sqrt{|x_1 x_2 x_3|}}
\,[a(x_1,\vec Q_1) + a^*(-x_1,-\vec Q_1)]\,[a(x_2,\vec Q_2) + a^*(-x_2,-\vec Q_2)]
\,[a(x_3,\vec Q_3) + a^*(-x_3,-\vec Q_3)] .
$$
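
The bookkeeping of powers of F described above is easy to get wrong, so here is a small sketch that simply adds up the exponents quoted in the text. The constant and function names are hypothetical, and the code does nothing more than exact fraction arithmetic on those exponents.

```python
from fractions import Fraction

# Power of F carried by each ingredient, as stated in the text.
SUM_OVER_K = Fraction(1)            # each sum over k becomes F times a sum over x
OPERATOR = Fraction(-1, 2)          # each a or a* picks up F**(-1/2)
NORMALIZATION = Fraction(-1, 2)     # each 1/sqrt(2*omega) ~ (2|x|F)**(-1/2)
DELTA_LONGITUDINAL = Fraction(-1)   # delta of summed k_z = (1/F) * delta of summed x
PREFACTOR = Fraction(1)             # the explicit F in 2F(H - P_z)


def interaction_power():
    """Net power of F in the cubic interaction term of 2F(H - P_z)."""
    return (3 * SUM_OVER_K + 3 * OPERATOR + 3 * NORMALIZATION
            + DELTA_LONGITUDINAL + PREFACTOR)


def kinetic_power(x_positive=True):
    """Net power of F in the kinetic term: one sum, two operators, (omega - k_z), prefactor."""
    # omega - k_z goes like 1/F for x > 0 and like F for x < 0.
    omega_minus_kz = Fraction(-1) if x_positive else Fraction(1)
    return SUM_OVER_K + 2 * OPERATOR + omega_minus_kz + PREFACTOR


if __name__ == "__main__":
    print("interaction:", interaction_power())
    print("kinetic, x > 0:", kinetic_power(True))
    print("kinetic, x < 0:", kinetic_power(False))
```

The interaction term and the positive-x kinetic term come out independent of F, while the negative-x kinetic term carries two powers of F, matching the $4F^2|x|$ term.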
When we compute the limit $F \to \infty$ to obtain the
operator W, we can make use of the fact that negative
x's in the wavefunction are killed by the $4F^2|x|$ term
discussed earlier. Therefore, the interaction terms
with three creation or three annihilation operators
can be discarded, since the $\delta$-function requires the
sum of the x's to be zero. Thus we are finally led
to the effective W operator:

$$
W = \sum_{x>0} \frac{\mu^2 + Q^2}{x}\, a^*(x,\vec Q)\, a(x,\vec Q)
+ \frac{3g}{\sqrt{2}} \sum_{x_1,x_2>0} \frac{1}{\sqrt{x_1 x_2 (x_1 + x_2)}}
\,[a(x_1,\vec Q_1)\, a(x_2,\vec Q_2)\, a^*(x_1 + x_2, \vec Q_1 + \vec Q_2) + \text{h.c.}] .
$$

We can see how scaling is implied by this limit, 
since the wavefunction is an eigenfunction of W, 
which has no F dependence, with eigenvalue w, which 
is also independent of F. (Strictly speaking, W is 
singular near x = 0 and in general cases has no 
strictly scaling $\psi$ solutions--the complications this
produces in the wee region will be discussed later.) 

Now we notice that this equation is homogeneous
of degree -1 in x. It is not obvious that W has this
property, because of the square roots, but it follows
from the definition of the a's. This implies that
under $x \to \lambda x$,

$$ a^*(x,\vec Q) \to \lambda^{1/2}\, a^*(\lambda x,\vec Q) , $$

since this is equivalent to changing the energy scale
via $F \to F/\lambda$. It is then easy to verify that if we
replace x by $\lambda x$, $W \to (1/\lambda)W$. (That this must be true
in general can be seen since the eigenvalues for
different $x_0$ vary as $1/x_0$.) The operators that
change the scale of the x's are just the boosts along
the longitudinal direction. The infinitesimal generator
of these boosts may be written

$$ B = \sum_i x_i \frac{\partial}{\partial x_i} , $$

and the fact that W is homogeneous in $x^{-1}$ implies
the commutation relation

$$ [W, B] = WB - BW = W . $$

We would like to extract from our knowledge of 
W something about the small (but finite) x region. 
Suppose we started with an approximate wavefunction 
containing one fast moving parton. In the next 
approximation, the interaction would produce some 
partons moving slower. These, in turn, would give
rise to some more moving slower still, and so on. 
The slow partons would result in this way from a long 
cascade. Now we come across cascades like this in 
the theory of cosmic ray showers, and the general 
qualitative feature of such showers is that when 
there are a large number of steps in the cascade, we 
ultimately produce a distribution with power law 
behavior in the fraction x of the original energy or 
momentum, which is independent of how the shower 
started. In shower theory, there are many different 
eigenvalues for the power, which do not depend on the 
starting point, although the amount of each power 
present does. If we concentrate on the lowest power, 
the one that is most effective as the shower becomes 
deeper, the shape of the distribution becomes inde- 
pendent of how the shower started. 

How can we make this qualitative idea more
explicit? Looking at small x's is equivalent to
keeping x fixed and letting $x_0$ become large. If we
let $x_0$ go to infinity, the eigenvalue w goes to zero,
and the wavefunction becomes a solution of $W\psi = 0$.
In this case it is possible to choose $\psi$ to be simultaneously
an eigenfunction of B (which is not possible
for $w \neq 0$ since B and W do not commute); that is,
we take $B\psi = b\psi$, where b is some number. But this
means the wavefunction must have the property

$$ \psi(\lambda x_1, \lambda x_2, \ldots) = \lambda^{b}\, \psi(x_1, x_2, \ldots) . $$

If we ask for the probability for finding particles 
at x in range dx, we could calculate this by scaling 
down the wavefunction used in a calculation of the 
probability for some other value of x. Thus the 
probability will have to depend on some power of x 
related to this eigenvalue b (the relationship is 
determined by how the wavefunction is used to compute 
the probability). In any case, we will get a distribution
of the general form $x^{\beta}\,dx/x$. This will be
valid for small x down to the wee region; i.e., down
to x of order 1/E. The smooth joining of the wee
region onto this distribution implies that this
region contributes a total probability of order
$(1/E)^{\beta}$, which is the sort of thing we were looking
for to solve the problem originally proposed.
Namely, to find the powers that show up in the Regge
theory from a knowledge of the Hamiltonian, we
could construct the limiting operator W and find
simultaneous eigenfunctions of W with eigenvalue 0
and of B. The eigenvalues of B would then determine 
the powers in question. In fact, there are some 
technical difficulties with this idea. In particular, 
to get simultaneous eigenfunctions of W and B we
must let $x_0$ become infinite. But this is not compatible
with the usual boundary conditions for the
wavefunction whereby we must start somewhere with a 
finite number of particles. At best, we have here an 
outline of a theory that might lead us from the 
Hamiltonian to a theory of Regge poles. 

The main point we want to remember from this 
discussion is that it is reasonable to expect a 
distribution at low x that goes like a power of x. 
The value of the power may depend on the quantum 
numbers being considered, so that exchanges involving 
different quantum numbers can have different power 
law dependence on 1/E. If total cross sections go 
to constants as $E \to \infty$, then the lowest allowed value
of $\beta$ is zero. Unfortunately, it is not at all clear
how this particular value follows in any simple way
from the theory.

LECTURE 2
PARTON DISTRIBUTIONS AND DEEP-INELASTIC SCATTERING 


In the first lecture, we tried to describe the char- 
acter of the wavefunction for a proton moving with a 
very high momentum, and came to the conclusion that 
the amplitude to find a parton with a certain momentum
inside should depend only on the fraction, x, of
the proton's longitudinal momentum carried by the 
parton, and on its transverse momentum. We also dis- 
cussed how the distribution of partons might be 
expected to behave in the region of small x, and 
decided by analogy with shower theory that it was 
reasonable to expect power law behavior; that is, a 
distribution consisting of a superposition of terms 
like $x^{\beta}\,dx/x$. In the early development of these
ideas, I came to the conclusion that it was most
likely that the lowest eigenvalue for $\beta$ would be
zero. There were two sorts of arguments which indicated
this. One was based on an analogy with bremsstrahlung:
if you compute the distribution of
photons in the field of a fast moving electron, you
get this sort of distribution. The other argument
comes from the notion outlined last time that it is
the wee partons that are involved in high-energy
hadronic collisions, and $\beta = 0$ is required to give
total cross sections that are constant at high
energy. To get a constant cross section the mean 
number of wee partons must then be constant, inde- 
pendent of P. This is because the wee region has a 
width of order 1/P (the distribution must fall rapid- 
ly to zero for x < 0), and it must match onto the 
small x distribution in the neighborhood of x = 1/P. 
For a dx/x distribution, this implies a height of 
order P, and a total number of partons in the wee 
region that is constant, independent of P. If we go 
to higher momentum P, the x distribution remains the 
same, according to scaling, except that now we can 
use it down to smaller values of x, so it climbs to 
a higher value before turning over. If we had a 
small x distribution like $x^{\beta}\,dx/x$, the value at the
peak would be of order $(1/P)^{\beta - 1}$, giving an area in
the wee region of order $(1/P)^{\beta}$.

The operator W we discussed in the last lecture 
suggests the power law, which we take in the present 
case to be dx/x, but it does not tell us how things 
vary in the wee region (where the approximation 
$\omega = k_z + (\mu^2 + Q^2)/2k_z$ is inadequate). To get some
rough idea of how a wavefunction could describe such
a region, we look at an example of a model
which is so simple that we can see the effect of our
approximation of $\omega$. It is a model with scalar particles
coupled to a c-number source, described by
the Hermitian operator

$$ H = \sum_k f(k)\, a^*(k)\, a(k) + \sum_k [\,s(k)\, a^*(k) + s^*(k)\, a(k)\,] , $$

where s(k) is just a numerical valued function, and
$a^*(k)$ is the creation operator for a scalar field.
It is easy to verify that

$$ |\psi\rangle = \exp\Big\{ \sum_k c(k)\, a^*(k) \Big\} |0\rangle $$

is an eigenvector of H if we choose $c(k) = -s(k)/f(k)$.
If, as in the example considered last time, we want
to look at the difference between the Hamiltonian and
the longitudinal momentum operator, we must take
$f(k) = \omega - k_z$. Also, to get scaling behavior, the
function s(k) must scale as $F^{-3/2}$, where F is the
energy scale. To get the desired dx/x behavior, we
choose it to be proportional to $\omega^{-3/2}$, giving

$$ c(k) = \frac{\alpha}{(\omega - k_z)\,\omega^{3/2}} . $$

The mean number of particles with momentum k is
found by computing the expectation value of $a^*(k)a(k)$,
which is just $|c(k)|^2$. Thus the distribution
function for particles with transverse momentum $\vec Q$
and longitudinal momentum $k_z$ in range $dk_z\, d^2Q$ is

$$ \frac{\alpha^2\, dk_z\, d^2Q}{(\omega - k_z)^2\, \omega^3} . $$


Now suppose we have $k_z = xP$ and consider P very
large. For x > 0,

$$ \omega - k_z \to \frac{Q^2 + m^2}{2xP} $$

and

$$ \omega \to xP , $$

so the distribution becomes

$$ \alpha^2\, P\,dx \left( \frac{2xP}{Q^2 + m^2} \right)^{2} \frac{d^2Q}{x^3 P^3}
= \frac{4\alpha^2}{(Q^2 + m^2)^2}\, d^2Q\, \frac{dx}{x} . $$

This has just the dx/x behavior we have been talking
about. (With a Q dependent coefficient, but since
$\alpha$ could depend on Q, we know nothing about it.) For
x < 0, $\omega - k_z \approx 2|x|P$, so the distribution becomes

$$ \frac{\alpha^2}{4P^4}\, d^2Q\, \frac{dx}{|x|^5} , $$

which falls rapidly to zero as $P \to \infty$. Of course, a
model like this doesn't prove anything, but it does
give us some mathematical expressions to manipulate.
One of the great troubles of the parton model is
that there isn't any mathematical expression, even
to play with, that is of any rigor.
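
Since this model does give expressions to manipulate, the two limits can be checked by direct evaluation. The sketch below is an illustration only: the values of α, m, Q and P are arbitrary assumptions and the function names are hypothetical. It evaluates the exact density $\alpha^2/[(\omega - k_z)^2\omega^3]$ at $k_z = xP$ and compares it with the dx/x plateau for x > 0 and with the $1/(4P^4|x|^5)$ form for x < 0.

```python
import math


def number_density(k_z, Q, alpha=1.0, m=0.5):
    """|c(k)|^2 = alpha^2 / ((omega - k_z)^2 * omega^3), per dk_z d^2Q."""
    mt2 = Q ** 2 + m ** 2
    omega = math.sqrt(k_z ** 2 + mt2)
    # For k_z > 0 use omega - k_z = (Q^2 + m^2) / (omega + k_z) to avoid cancellation.
    diff = mt2 / (omega + k_z) if k_z > 0 else omega - k_z
    return alpha ** 2 / (diff ** 2 * omega ** 3)


def x_density(x, Q, P, alpha=1.0, m=0.5):
    """Density per dx d^2Q when k_z = x * P (dk_z = P dx)."""
    return P * number_density(x * P, Q, alpha, m)


def plateau_coefficient(Q, alpha=1.0, m=0.5):
    """Coefficient of dx/x in the large-P limit: 4 alpha^2 / (Q^2 + m^2)^2."""
    return 4.0 * alpha ** 2 / (Q ** 2 + m ** 2) ** 2


if __name__ == "__main__":
    Q, P = 0.3, 1e6   # assumed illustrative values
    for x in (0.5, 0.1, 0.01):
        # x times the density should be the same constant for every non-wee x.
        print(x, x * x_density(x, Q, P), plateau_coefficient(Q))
    # Backward-moving particles are suppressed by four powers of P.
    print(x_density(-0.5, Q, 1e3), 1.0 / (4.0 * 0.5 ** 5 * 1e3 ** 4))
```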


The wee region complicates the wavefunction very 
much. It is a purely technical point, but it is 
incorrect to speak of the wavefunction scaling. The 
wavefunction contains the amplitudes for all kinds 
of configurations, including partons in the wee
region, and these pieces of the wavefunction do not
scale. It is the density matrix that exhibits
scaling--the probability for finding a particular
value of longitudinal momentum not in the wee region
scales, for instance. The probability for finding
four partons with various x's and Q's will approach
a limit as $P \to \infty$, if you ignore where the other partons
are. But, of course, you cannot ignore where
the “other" partons are when talking about a wave- 
function, for a wavefunction is necessarily a 
function of all the particles present, and not just 
a few selected ones, as a density function may be. 
This is analogous to another situation in physics 
which has technically the same sort of difficulty. 
Consider a one-dimensional liquid which has a 
“surface” at one end that is affected by some kind of 
surface tension forces, and a surface at the other 
end with some (perhaps different) forces acting. If 
the liquid is thick enough--the distance between the
two surfaces large enough--what happens at one end 
will not affect what happens at the other. But this 
independence is not apparent in the wavefunction, 
which must give the amplitude for every configuration 
of all the particles in the liquid. However, if we 
compute a density matrix for the probability, say, 
of finding two particles at two specific locations 
in the liquid, the independence will show up in a 
simple way--the probability will become a product of
independent single particle probability distributions
when the separation is large enough.

A variable like the logarithm of x is useful to
make a closer correspondence with the theory of a
liquid. It is the quantity called rapidity, defined
by

$$ y = \tfrac{1}{2} \log\frac{\omega + k_z}{\omega - k_z} . $$

In the large P limit,

$$ \frac{\omega + k_z}{\omega - k_z} \approx \frac{4P^2x^2}{Q^2 + m^2} , $$

so that

$$ y \sim \log x + \log P . $$

This will be valid even for small x, so long as x is
larger than order 1/P (not wee). Since $dy = dk_z/\omega \to dx/x$,
we see that for y in the region corresponding to
small x, the assumed dx/x distribution implies a
flat "plateau" in the rapidity plot. The rapidity
is related to velocity via

$$ v/c = \tanh y . $$

This has the useful property that a Lorentz transformation
along the longitudinal direction simply
changes y by an additive constant. The distribution
at the ends of the plot (x near 1 and x wee) may be
complicated, but our analogy with shower theory
suggests the two ends are independent, like the two
surfaces of the one-dimensional liquid (see figure).

[Figure: Density to find a parton vs. rapidity y. Wee x lies at finite y, the plateau region lies in between, and finite x lies at y near log P.]
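
The two properties of rapidity used here, the additive shift under a longitudinal boost and the relation v/c = tanh y, can be verified with a short calculation. This is a minimal sketch with assumed numbers (a transverse mass of 1 and a boost velocity of 0.6, in units with c = 1); the function names are hypothetical.

```python
import math


def rapidity(omega, k_z):
    """y = (1/2) log[(omega + k_z) / (omega - k_z)]."""
    return 0.5 * math.log((omega + k_z) / (omega - k_z))


def boost_along_z(omega, k_z, v):
    """Longitudinal Lorentz boost with velocity v (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v ** 2)
    return gamma * (omega + v * k_z), gamma * (k_z + v * omega)


if __name__ == "__main__":
    m_t = 1.0    # assumed transverse mass sqrt(Q^2 + m^2)
    v = 0.6      # assumed boost velocity
    for k_z in (0.5, 2.0, 10.0):
        omega = math.sqrt(k_z ** 2 + m_t ** 2)
        y = rapidity(omega, k_z)
        y_boosted = rapidity(*boost_along_z(omega, k_z, v))
        # The shift is atanh(v) for every k_z; the particle velocity is tanh(y).
        print(k_z, y_boosted - y, math.atanh(v), k_z / omega, math.tanh(y))
```

Because every parton's rapidity shifts by the same constant, a flat distribution in y stays flat in any longitudinally boosted frame.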


We can ask what the probability is of finding 
no partons in the small x region--that is, we look 
for configurations that have many wee partons, and 
partons with x greater than some $x_0$ (where $x_0$ stays
finite as P goes to infinity), but none in-between.
Our picture of how the small x region is built up by
a long cascade process leads us to conjecture that
the probability of finding no particles in some
region should be proportional to exp(-cn), where n is
the mean number of particles in that region. The
constant c would be 1 for a Poisson distribution, but
forces between partons may make it easier to make
one parton when another is present, etc., so that
successive creations are not statistically independent.
We could have some sort of Markovian chain in
the plateau region in which some number of partons
makes up the statistically independent unit--and so
we introduce the constant c to represent these deviations
from Poisson distribution behavior. In the
small x region, the distribution as a function of
rapidity is flat, so the mean number in a gap of
width $\Delta y$ will be proportional to $\Delta y$. For large P we
see that $\Delta y \approx \log x_0 + \log P$, and so it increases as
the logarithm of the momentum. This means the
probability of finding no partons in the gap will
vary with momentum as $\exp(-c \log P) = P^{-c}$. We will
consider a number of applications of this basic 
principle later. The value of c probably depends on 
the quantum numbers "carried across the gap"; that 
is, on the way in which we apportion the quantum 
numbers of the wavefunction among the partons on the 
two sides of the gap. 
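
The step from an exponential in the gap width to a power of the momentum can be made concrete. In this sketch the plateau height `density` is an assumed extra parameter (the text absorbs it into c), the default values of c and x_0 are arbitrary, and the function names are hypothetical.

```python
import math


def gap_probability(P, c=1.5, x0=0.1, density=1.0):
    """exp(-c * n), with n = density * (log x0 + log P) partons expected in the gap.

    `density` is the assumed height of the rapidity plateau; the text
    absorbs it into the constant c.
    """
    delta_y = math.log(x0) + math.log(P)
    return math.exp(-c * density * delta_y)


def log_log_slope(P1, P2, **kwargs):
    """Slope of log(probability) versus log(P) between two momenta."""
    numerator = math.log(gap_probability(P2, **kwargs)) - math.log(gap_probability(P1, **kwargs))
    return numerator / (math.log(P2) - math.log(P1))


if __name__ == "__main__":
    for P in (1e1, 1e2, 1e3, 1e4):
        print(P, gap_probability(P))
    # The slope on a log-log plot is -c (times the plateau height), i.e. a power law in P.
    print(log_log_slope(1e2, 1e4))
```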

We have a picture at this point in which we have 
a reasonable guess as to the average distribution of 
partons, and we have an idea that the probability of 
finding no partons in some large range $\Delta y$ of y will
behave exponentially with $\Delta y$ as $\exp(-c \Delta y)$. It is a
naive one-dimensional picture--we are missing information
about transverse momentum distributions.
Instead, we simply guess from experimental considerations
that the probability of finding partons of
high transverse momentum is small and falls rapidly
with increasing transverse momentum. This does
indicate that parton-parton interactions are significant
only among partons of low relative momentum. In
other words, there can't be a big amplitude that two 
partons interact to produce a large change in their 
relative momentum, since this would lead to large 
transverse momentum partons. Then it is just as true 
that there will not be an exchange of a large amount 
of longitudinal momentum. Basically, the argument 
implies that interactions occur only between partons 
separated by a finite amount of rapidity. 

So far, we have not even said what the partons 
might be like--spin 1/2, spin 0, charged, neutral...? 
Unfortunately, we are as yet unable to deduce any- 
thing from high-energy hadron collisions (except that 
the original idea of scaling in longitudinal momentum 
works). On the other hand, I discovered to my great 
glee that the SLAC experiments on deep-inelastic 
scattering were looking directly at the parton pro- 
bability distributions, and that Bjorken scaling was 
exactly the same as my conclusion that the wave- 
function scaled. Furthermore, the parton interpre- 
tation of these experiments gave a dx/x distribution, 
which confirmed my opinion which I was then forming
on this point (it wasn't really certain enough in my 
mind to call it a prediction). 

Let us see how the deep-inelastic scattering
experiments look in the parton-model interpretation.
We have a proton with a large momentum P, which we
regard as made up of a large number of partons carrying
various fractions, $\xi$, of the total longitudinal
momentum. (We temporarily use $\xi$ instead of x for
this quantity.) The electron produces a virtual
photon which interacts with one of the partons:

[Figure: an electron emits a virtual photon, which strikes one parton of the proton.]


This interaction involves a simple local coupling, 
consistent with the whole approach of expanding the 
proton's wavefunction in terms of the fundamental 
pointlike bare objects, each having the usual
"minimal" coupling to electrodynamics (putting
$\nabla \to \nabla - eA$). The virtual photon carries energy and
momentum; we can arrange to use a coordinate system in
which its energy is zero, and write its 4-momentum:
    $$q = (0, 0, 0, -2Px).$$
In this system, the proton's 4-momentum is
    $$p = (P, 0, 0, P),$$
assuming that P is large enough that we can neglect
the proton mass. Thus we have
    $$q^2 = -4P^2x^2$$
and
    $$p \cdot q = 2P^2x.$$
The last of these is usually called $M\nu$, and the idea
of Bjorken scaling is that the high-energy cross
section should depend only on the dimensionless ratio
    $$-q^2/2M\nu = x.$$
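In the parton picture the scattering process
looks like

[Figure: before the collision, a parton of momentum $\xi P$ and a photon of momentum $2xP$; after, the parton moves backward with momentum $(2x-\xi)P$.]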

The energy of the hadron system before and after the 
collision is the same in this frame, since the photon 
brings in no energy. Now, the idea discussed earlier
that the parton-parton interactions are limited to
a finite range of rapidity means that these interactions
contribute only a finite amount to the energy
of the state. (The best mathematical way to state
the principle is to say the unknown interactions have
the same effects in a problem as finite uncertainties
in parton masses would have.) The parton kinetic
energies, however, are of order P, as is the total
energy of the proton. To order P, then, the energy
of the hadrons in the final state (as well as in the
initial proton state) is just the sum of the parton
kinetic energies. In the example we are considering,
conservation of energy (to this order) requires
$2x - \xi = \xi$, or $\xi = x$. So the experiment directly
samples partons with various values of x.
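
The kinematics above can be verified with a few lines of arithmetic. The sketch below is an illustration only, not part of the lecture: the helper names and the numerical values of P and x are assumptions, the metric is taken as (+, -, -, -), and partons are treated as massless and collinear.

```python
def minkowski_dot(a, b):
    """Dot product of four-vectors (E, px, py, pz) with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def lecture_frame(P, x):
    """Proton and photon four-vectors in the frame where the photon energy is zero."""
    q = (0.0, 0.0, 0.0, -2.0 * P * x)
    p = (P, 0.0, 0.0, P)  # proton mass neglected
    return p, q


def bjorken_x(p, q):
    """Scaling variable -q^2 / (2 p.q)."""
    return -minkowski_dot(q, q) / (2.0 * minkowski_dot(p, q))


def energy_change(xi, x, P):
    """Energy of a massless parton after the collision minus its energy before.

    Before: longitudinal momentum xi*P. After: xi*P - 2*x*P.
    """
    return abs(xi * P - 2.0 * x * P) - abs(xi * P)


if __name__ == "__main__":
    P, x = 50.0, 0.3  # hypothetical values
    p, q = lecture_frame(P, x)
    print("x recovered from the four-vectors:", bjorken_x(p, q))
    for xi in (0.1, 0.3, 0.5):
        print("xi =", xi, "energy change =", energy_change(xi, x, P))
```

Only the parton with $\xi = x$ leaves the energy unchanged, which is the energy-conservation condition stated above.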

In these scattering processes, the photon is 
virtual and there are longitudinal as well as trans- 
verse components to the photon's polarization. The 
proportion of these varies with the electron's 
angle, $\theta$, and so the cross sections for scattering
of transverse photons, $\sigma_T$, and longitudinal photons,
$\sigma_L$, can each, in principle, be separately measured.
If the partons had spin 1/2, then at sufficiently
large P (such that parton masses can be neglected),
helicity will be conserved in the scattering, as
we know from the theory of photon coupling. A
parton whose direction was reversed as in the figure
would have to absorb a unit of angular momentum in
order to maintain the same helicity, and hence would
contribute only to $\sigma_T$. Similarly, a spin 0 parton
would contribute only to $\sigma_L$. The data are consistent
with the assumption that all the charged partons
have spin 1/2. That is, $\sigma_L/\sigma_T$ is small. The
early data indicated that $\sigma_L/\sigma_T$ could be of the
order of 18±10%, so this fraction could be spin zero.
A recent thesis by Riordan contains a more careful
analysis of the possibility that all the charged
partons have spin 1/2. This means that the residual
$\sigma_L$ is due to the finiteness of P, and consequently
that helicity is not exactly conserved. In this case,
$\sigma_L$ must vary as $1/\nu$; in particular,
    $$R = \sigma_L/\sigma_T = (1/\nu) f(x) = (1/q^2) g(x).$$
The data fit this hypothesis quite well, if, for
example, g(x) is roughly constant.
In these experiments we are measuring $f_1(x)$--the
mean number of partons with a given value of x,
weighted by the square of the charge. The square of
the charge comes in, of course, for this is the
probability a given parton interacts with the photon.
Neutral partons would not contribute to the scattering
$f_1$. The function $f_2(x) = x f_1(x)$ tells us how
much momentum is being carried by the partons at
x (with the same weighting). This is the function
usually called $\nu W_2$, and the experiments indicate
that it becomes constant at small x, consistent with
a 1/x behavior of the basic distribution $f_1(x)$.
Another interesting region is near x = 1, as 
pointed out by Drell and Yan. Let us ask for the 
probability of finding a parton with x near 1; or
equivalently, with x' = 1-x small. This means that
one parton will carry nearly all the proton's momentum,
with the rest sharing a total of x'P. We will
therefore have a gap in the rapidity plot from y
about log P + log x' up to y near log P. The size of
the gap is thus -log x', independent of P, so we
still have scaling. Using the basic principle outlined
earlier, we expect this configuration to occur
with probability
    $$\exp(-c \Delta y) = \exp(c \log x') = (1-x)^c.$$
The distribution near x = 1 should behave as the
differential of this; i.e., like $(1-x)^{c-1} dx$. Now we
can connect this exponent to another experimental
quantity. Consider elastic scattering with large $q^2$.
We cannot have a picture like

[Figure: "before" and "after" diagrams in which the photon reverses one parton of a proton containing several fast partons.]

since in an elastic event, the final particle will 
still be a proton, and will not have the backward 
moving partons shown in the "after" picture. That 
is, the "after" picture has zero amplitude to be 
found in a proton moving to the left. So to get
elastic scattering, we must require the photon to
interact with a configuration containing just one
fast-moving parton, with all the rest in the wee
region:

[Figure: "before" and "after" diagrams in which the photon reverses the single fast parton while the rest stay in the wee region.]

Only in this case will there be a reasonable amplitude
to find each configuration in a proton. Thus
the probability of an elastic scattering event will
be proportional to the probability of finding such a
configuration, and this configuration has a gap in
rapidity $\Delta y$ of order log P. This should go like
    $$\exp(-c \log P) = P^{-c},$$
with c the same number as found from the parton 
distribution function near x = 1. If the elastic
form factor really has the famous dipole form $(1/q^2)^2$,
this would predict the distribution function near
x = 1 to vary as $(1-x)^3$. Unfortunately, neither the
elastic nor deep-inelastic data are sufficiently
accurate to confirm this connection; but there is
no inconsistency.

LECTURE 3
PARTONS AS QUARKS


Up until now we have been talking about the possi- 
bility that the wavefunction for a rapidly-moving 
particle consists of amplitudes for fundamental 
constituents, which we call partons, having limited 
transverse momentum and which scale in longitudinal 
momentum. If this picture is correct, then the 
determination of the characteristics of the partons 
becomes a really fundamental question in the theory 
of hadrons. In the last lecture we discovered that 
the experiments on deep-inelastic scattering of 
electrons by nucleons indicate that it is possible 
all the charged partons have spin 1/2. In this 
lecture we will investigate a specific hypothesis-- 
that the charged partons have the quantum numbers 
of quarks--and see what the data have to say about 
this possibility. Identifying partons with quarks 
tells us that we have three kinds of fundamental
constituents with quantum numbers summarized by
the table:

Symbol   Charge   I_3    Strangeness   Name
u         2/3     1/2         0        "up"
d        -1/3    -1/2         0        "down"
s        -1/3      0         -1        "strange"

In the simple non-relativistic quark model,
the proton is supposed to be made up of three quarks--
two u's and a d. The parton picture, which is
relativistic, doesn't allow a fixed number of con- 
stituents (which are bare objects, and not physical 
particles or quasi-particles anyway). So the "partons 
as quarks" are not quite the same thing as the quarks 
in the non-relativistic model. 

With three basic types of partons (and their 
anti-particle counterparts), the distribution func- 
tions for the partons in protons and neutrons will 
be constructed from six functions describing the 
distributions of the specific types. In particular, 
the function $f_2^{ep}(x)$ measured in deep-inelastic scattering
of electrons by protons, will have the form
    $$f_2^{ep}(x) = (4/9)x[u(x) + \bar{u}(x)] + (1/9)x[d(x) + \bar{d}(x)] + (1/9)x[s(x) + \bar{s}(x)].$$
In this expression, the function u(x) is the probability
distribution of finding an up quark in the
proton with fraction of longitudinal momentum x,
while $\bar{u}(x)$ is the distribution function for the
u-type anti-quark. The factors 4/9 and 1/9 are
the squares of the charges of the appropriate
quarks. We can write a similar expression for
the neutron; but by isospin symmetry, the number
of up quarks in the neutron is the same as the
number of down quarks in the proton, so we can write
    $$f_2^{en}(x) = (1/9)x[u(x) + \bar{u}(x)] + (4/9)x[d(x) + \bar{d}(x)] + (1/9)x[s(x) + \bar{s}(x)].$$
The distribution functions u(x), etc., continue
to be defined as above; i.e., they refer to the
distributions in the proton.

So far we have two experimentally measurable 
functions expressed in terms of six unknown dis- 
tributions; but there are some theoretical argu- 
ments which we can make from our theory of the 
wavefunctions. For example, the total strangeness
due to strange quarks will be
    $$\int_0^1 s(x)\,dx,$$
and that due to the anti-particles will be
    $$-\int_0^1 \bar{s}(x)\,dx.$$
Adding these gives the total proton strangeness,
which is zero, of course, and we get the sum rule:
    $$\int_0^1 [s(x) - \bar{s}(x)]\,dx = 0.$$
This also means that the s and $\bar{s}$ quarks will contribute
net zero charge to the proton, so we can
write a sum rule for the charge in terms of u(x),
$\bar{u}(x)$, d(x), and $\bar{d}(x)$, as follows:
    $$(2/3)\int_0^1 [u(x) - \bar{u}(x)]\,dx - (1/3)\int_0^1 [d(x) - \bar{d}(x)]\,dx = 1.$$
Similarly, for the total $I_3$, we get:
    $$(1/2)\int_0^1 [u(x) - \bar{u}(x)]\,dx - (1/2)\int_0^1 [d(x) - \bar{d}(x)]\,dx = 1/2.$$
We can combine the last two sum rules to obtain the
simpler expressions
    $$\int_0^1 [u(x) - \bar{u}(x)]\,dx = 2$$
and
    $$\int_0^1 [d(x) - \bar{d}(x)]\,dx = 1.$$
Note the connection with the usual non-relativistic
quark model: the net number of up quarks is 2, the
net number of down quarks is 1, and the net number
of strange quarks is 0.
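
The step of combining the charge and $I_3$ sum rules is a two-by-two linear solve. As a small illustration (the helper `solve_2x2` and the variable names U and D are introduced here for the example and are not from the lecture), exact rational arithmetic gives the net quark numbers directly:

```python
from fractions import Fraction as F


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*X + a12*Y = b1, a21*X + a22*Y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det


# Unknowns: U = integral of (u - ubar), D = integral of (d - dbar).
# Charge sum rule:  (2/3) U - (1/3) D = 1
# I_3 sum rule:     (1/2) U - (1/2) D = 1/2
U, D = solve_2x2(F(2, 3), F(-1, 3), F(1), F(1, 2), F(-1, 2), F(1, 2))

if __name__ == "__main__":
    print("net up quarks:", U)
    print("net down quarks:", D)
```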

We also believe from our theory that the wee 
quark region should be independent of whether the 
wavefunction describes a proton, a neutron, an 
anti-proton, or an anti-neutron. Our picture is
that this region is built up from a long cascade
and doesn't depend on whether we started with an
up or down quark or anti-quark. So we believe
that for x small enough, we should have
    $$u(x) = \bar{u}(x) = d(x) = \bar{d}(x) \approx c/x \quad \text{(small x)}.$$
Since $SU_3$ is not an exact symmetry, it is possible
that the wee region distribution of strange quarks
is somewhat different, and we only require that
    $$s(x) = \bar{s}(x) \approx c'/x \quad \text{(small x)}.$$

If we look at the ratio $f_2^{en}/f_2^{ep}$ and note that
the distributions are all positive functions, we 
see that it must lie between 1/4 and 4. Near x = 0, 
the theoretical argument about the wee region implies 
the ratio goes to 1, in agreement with experiment. 
As x approaches 1, the experimental value of the 
ratio seems to be approaching the lower limit of 1/4 
(it has fallen to about 0.4 by x = 0.8). This can 
be explained if the proton becomes almost pure up 
quark in the region near x = 1. I have an argument 
that suggests why this ratio might most likely be 
either 1/4 or 3/2, but I confess I thought it up 
after I saw the results. The idea is the following: 
near x = 1, we have one fast quark with the rest 
squeezed into the wee region. As we discussed in 
the last lecture, the probability of this goes as 
$(1-x)^\gamma$, where the exponent $\gamma$ depends on the quantum
numbers squeezed into the wee region. Now the net
quark number is a good quantum number, and for the
wee region it will either be 2 (if the fast parton
is a quark) or 4 (if it is an anti-quark). We guess
that it is harder to squeeze four quarks into the
wee region than two, so that the four-quark case
gives a larger exponent $\gamma$. Then the leading
behavior will correspond to the wee region having
the quantum numbers of a two-quark system. This can
carry isospin 0 or 1, for strangeness zero. (If
you wish to guess that the lowest $\gamma$ is for the wee
region to carry strangeness, the argument generates
the additional possibility that the ratio be 1.) If
it is isospin 1, then the fast parton has to be a
mixture of up and down quarks (so that the quark
isodoublet is combined with the wee parton isotriplet
to give an I = 1/2, $I_3$ = 1/2 state). This gives a
probability 2/3 for the fast parton to be a down
quark, and 1/3 for it to be an up quark, so that
d(x) = 2u(x) for x near 1. With s, $\bar{s}$, $\bar{u}$, and $\bar{d}$
zero in this region we get $f_2^{en}/f_2^{ep} = 3/2$, which is
ruled out by the data. The case where the wee quarks
have I = 0 implies the fast parton is pure up quark,
and gives the ratio 1/4.
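
The limiting values of the neutron-to-proton ratio quoted above follow from the charge-squared weights alone. The sketch below is an illustration, not part of the lecture; the function name `f2_ratio` and the sample inputs are assumptions, and the arguments stand for the proton distributions evaluated at a single x (the common factor x cancels in the ratio).

```python
def f2_ratio(u, d, s=0.0, ubar=0.0, dbar=0.0, sbar=0.0):
    """Ratio f2^{en} / f2^{ep} at one value of x, from proton distributions."""
    f2_ep = (4 / 9) * (u + ubar) + (1 / 9) * (d + dbar) + (1 / 9) * (s + sbar)
    f2_en = (1 / 9) * (u + ubar) + (4 / 9) * (d + dbar) + (1 / 9) * (s + sbar)
    return f2_en / f2_ep


if __name__ == "__main__":
    # Wee region with I = 0: the fast parton is a pure up quark.
    print("pure up:", f2_ratio(u=1.0, d=0.0))
    # Wee region with I = 1: d(x) = 2 u(x).
    print("d = 2u:", f2_ratio(u=1.0, d=2.0))
    # Pure down quark, the upper bound of the allowed range.
    print("pure down:", f2_ratio(u=0.0, d=1.0))
    # All four non-strange distributions equal, as near x = 0.
    print("wee region:", f2_ratio(u=1.0, d=1.0, ubar=1.0, dbar=1.0))
```

Because every distribution is non-negative, no choice of inputs can push the ratio outside the interval from 1/4 to 4.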

It is also interesting to consider the sum of 
the e-p and e-n distributions--I'll call this the
"e-d" distribution
    $$f_2^{ed}(x) = f_2^{ep}(x) + f_2^{en}(x).$$
If we define
    $$q(x) = x[u(x) + d(x)]$$
to be the momentum carried by non-strange quarks,
and
    $$\xi(x) = x s(x),$$
that carried by strange quarks (with similar definitions
for the anti-quarks), then
    $$f_2^{ed}(x) = (5/9)[q(x) + \bar{q}(x)] + (2/9)[\xi(x) + \bar{\xi}(x)].$$




Now common sense leads us to believe that the 
strange quarks do not carry as much of the momentum 
as, say, the down quarks. The net excesses are two 
up quarks and one down quark, with no excess of 
strange quarks. It doesn't seem reasonable that 
there should be a large number of pairs of s and $\bar{s}$
in the sea of quarks and anti-quarks without a correspondingly
large number of pairs of d and $\bar{d}$.
Anyway, it seems a fair guess that $\xi(x) \lesssim (1/2)q(x)$,
particularly when we are not in the sea region--say,
for x greater than about 0.2. So we conjecture that
the strange quark contribution to $f_2^{ed}$ is at most of
order 10-20% and likely smaller. Experimentally,
    $$\int_0^1 f_2^{ed}(x)\,dx = 0.31.$$


Now if all the proton's momentum were carried by the
charged partons--in this model, the quarks--then we
should have
    $$\int_0^1 [q(x) + \bar{q}(x) + \xi(x) + \bar{\xi}(x)]\,dx = 1.$$
Solving these two equations for the amount of momentum
carried by strange and non-strange quarks, we
find that the strange quarks carry 74% of it. This
is in violent disagreement with our common sense
ideas. The alternative is to suppose that there
are neutral partons, carrying roughly half the proton's
momentum. Various models have been proposed
in which these neutral constituents are the particles
that mediate the quark-quark interactions, and
are usually called gluons.
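
The 74% figure comes from solving two linear equations. The following sketch spells out that arithmetic; it is an illustration rather than part of the lecture, and the function name and the symbols N and S (the integrated momentum of non-strange and strange quarks, anti-quarks included) are introduced here only for the example.

```python
def momentum_shares(f2_ed_integral):
    """Split the momentum between non-strange (N) and strange (S) quarks.

    Solves (5/9) N + (2/9) S = f2_ed_integral together with N + S = 1,
    i.e. under the hypothesis that quarks carry all the proton's momentum.
    """
    N = (9.0 * f2_ed_integral - 2.0) / 3.0
    S = 1.0 - N
    return N, S


if __name__ == "__main__":
    N, S = momentum_shares(0.31)
    print("non-strange share:", round(N, 3))
    print("strange share:", round(S, 3))
```

With the measured integral 0.31 the strange share is about 0.74, which is what makes the all-quark hypothesis implausible.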

This brings up another question that deserves 
a short comment--namely, the question of understand- 
ing how the quark-quark interactions can account 
for the absence of free particles with quark quantum 
numbers. The idea that partons are quarks was ori- 
ginally suggested by Paschos and Bjorken--it was not 
my idea. When they suggested it, I was worried how 
it could fit into my picture. When an interaction 
knocks a parton backwards, it interacts only weakly 
with the other partons, and appears to be carrying 
quark quantum numbers off into the distance. One 
way out is to have long-distance harmonic potentials; 
then the interaction is weak at short distances and 
our picture of a collision is valid, while after the 
collision the force grows with distance and can 
keep the quark from getting out. (It is only necessary
to have $q^2$ or $2M\nu$ large compared to the spacing
of levels and not to the ultimate size of the poten- 
tial itself.) I do not favor this picture. Another 
possibility is that, as the Hamiltonian acts to 
generate the evolution of the system to the asymp- 
totic final state, it creates large numbers of quark- 
anti-quark pairs at intermediate rapidity, and these 
somehow associate to produce the usual quantum
numbers of hadrons. It is clear that neither of
these proposals is like an ordinary field theory. 

If the quark theory proves to be right, we will have 
to learn a lot of exciting, interesting, and semi- 
paradoxical things. 

Now let's turn to the scattering of neutrinos 
by nucleons. In the case of electron scattering, we 
knew there was a photon propagator going as $1/q^2$; in
the neutrino case, we don't know whether or not there 
is an intermediate propagator or a direct coupling. 
That's an experimental question. For the sake of 
analysis, let's assume a direct 4-fermion coupling 
of the quarks to leptons. We will also ignore 
strangeness-changing interactions (that is, we set 
the Cabibbo angle to zero), and neglect the muon 
mass. With these assumptions, we can calculate the 
differential cross section for the scattering of a 
neutrino of energy E, producing a muon of energy
$E - \nu = E(1-y)$ coming out. In units of $G^2 s/2\pi$, the
result is
    $$d\sigma/dy = 2$$
for a neutrino on a particle, and
    $$d\sigma/dy = 2(1-y)^2$$
for a neutrino on an anti-particle. When a neutrino
scatters from a nucleon to produce a $\mu$, it must
change a d into a u or a $\bar{u}$ into a $\bar{d}$. Thus, for
neutrinos on protons,
    $$d^2\sigma^{\nu p}/dx\,dy = 2x\,d(x) + 2(1-y)^2 x\,\bar{u}(x).$$
(The factor x comes from the fact that the s against
the parton is x times the s against the proton as a
whole.) For neutrinos on neutrons, recall that the
distribution for down quarks is u(x), etc., so
    $$d^2\sigma^{\nu n}/dx\,dy = 2x\,u(x) + 2(1-y)^2 x\,\bar{d}(x).$$
If we knew these two cross sections sufficiently well
as functions of x and y, we could determine the four
functions u, $\bar{u}$, d and $\bar{d}$. Then we would be able to
predict the anti-neutrino cross sections, which are
similarly given as
    $$d^2\sigma^{\bar{\nu} p}/dx\,dy = 2x\,\bar{d}(x) + 2(1-y)^2 x\,u(x),$$
and
    $$d^2\sigma^{\bar{\nu} n}/dx\,dy = 2x\,\bar{u}(x) + 2(1-y)^2 x\,d(x).$$
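
To make concrete how the y dependence separates the quark and anti-quark terms, here is a minimal sketch. It is not part of the lecture: the toy distributions are purely hypothetical shapes chosen to exercise the formulas, and the function names are assumptions. Cross sections are in the units of $G^2 s/2\pi$ used above.

```python
def d2sigma(x, y, u, d, ubar, dbar):
    """Parton-model d^2(sigma)/dx dy for the four beam and target choices.

    u, d, ubar, dbar are callables giving the proton distributions.
    """
    w = (1.0 - y) ** 2
    return {
        "nu p": 2 * x * d(x) + 2 * w * x * ubar(x),
        "nu n": 2 * x * u(x) + 2 * w * x * dbar(x),
        "nubar p": 2 * x * dbar(x) + 2 * w * x * u(x),
        "nubar n": 2 * x * ubar(x) + 2 * w * x * d(x),
    }


def separate_terms(x, y1, sig1, y2, sig2):
    """Recover (A, B) from sigma = 2x*A + 2(1-y)^2*x*B measured at two y values."""
    w1, w2 = (1.0 - y1) ** 2, (1.0 - y2) ** 2
    B = (sig1 - sig2) / (2.0 * x * (w1 - w2))
    A = (sig1 - 2.0 * w1 * x * B) / (2.0 * x)
    return A, B


# Toy distributions: hypothetical, for illustration only.
def u(x):
    return 2.0 * (1.0 - x) ** 3


def d(x):
    return 1.0 * (1.0 - x) ** 3


def ubar(x):
    return 0.1 * (1.0 - x) ** 7


def dbar(x):
    return 0.1 * (1.0 - x) ** 7


if __name__ == "__main__":
    x = 0.3
    s1 = d2sigma(x, 0.2, u, d, ubar, dbar)["nu p"]
    s2 = d2sigma(x, 0.8, u, d, ubar, dbar)["nu p"]
    # From neutrino-proton data at two y values we get back d(x) and ubar(x).
    print(separate_terms(x, 0.2, s1, 0.8, s2), (d(x), ubar(x)))
```

The same separation applied to the neutron cross section returns u(x) and $\bar{d}(x)$, after which the anti-neutrino cross sections are fixed with no further freedom.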

These formulas imply many relations we can 
predict and check, but we discuss here only those 
tests which can be made in the most immediate future. 
The targets would then be heavier nuclei which we 
can say contain roughly equal numbers of protons 
and neutrons. Therefore we define the cross section 
for scattering on an "average nucleon" by taking
half the sum of the cross sections on proton and
neutron. This gives
    $$d^2\sigma^{\nu}/dx\,dy = q(x) + (1-y)^2 \bar{q}(x),$$
and
    $$d^2\sigma^{\bar{\nu}}/dx\,dy = \bar{q}(x) + (1-y)^2 q(x),$$
where q(x) = x[u(x)+d(x)] is the momentum carried
by non-strange quarks, as defined earlier. What
do we know about these functions? First, we have a
sum rule
    $$\int_0^1 [q(x) - \bar{q}(x)]\,dx/x = 3,$$
which simply states that the net number of quarks
in the nucleon is three (called the Gross-Llewellyn-Smith
sum rule). Furthermore, we have
    $$q(x) + \bar{q}(x) = (9/5)f_2^{ed}(x) - (2/5)[\xi(x) + \bar{\xi}(x)].$$
We expect the strange quark contribution to be small,
say, less than 1/6 of the total, except for x near
zero. In any event, $(9/5)f_2^{ed}(x)$ is a strict upper
bound for $q(x)+\bar{q}(x)$ (an observation due to Llewellyn-Smith).

The sum rule for net quark number is difficult 
to test since the low x values are rather uncertain 
and are weighted by 1/x; in any case, the present 
experiments described at this conference by Perkins
are at an energy which I would have thought too
low. If we integrate $d^2\sigma^{\nu}/dx\,dy$ over y to get $q(x)+(1/3)\bar{q}(x)$,
and $d^2\sigma^{\bar{\nu}}/dx\,dy$ to get $\bar{q}(x)+(1/3)q(x)$,
then 1.5 times the difference of the integrated
cross sections is just $q(x)-\bar{q}(x)$. Perkins reports
that it looks like the current data give a value
of 3.5 ± 0.5 for the sum rule--so maybe we're
beginning to learn something about the real world.
Let's also look at total cross sections. If we
define
    $$Q = \int_0^1 q(x)\,dx \quad\text{and}\quad \bar{Q} = \int_0^1 \bar{q}(x)\,dx,$$
then
    $$\sigma^{\nu}_{TOT} = Q + (1/3)\bar{Q}$$
and
    $$\sigma^{\bar{\nu}}_{TOT} = \bar{Q} + (1/3)Q.$$
This means that the ratio $\sigma^{\bar{\nu}}_{TOT}/\sigma^{\nu}_{TOT}$ can never be
less than 1/3, and it will only be 1/3 if $\bar{Q} = 0$;
that is, only if the wavefunction has no anti-quark
contribution. The preliminary data give $\sigma^{\bar{\nu}}_{TOT}/\sigma^{\nu}_{TOT}$ =
0.38 ± 0.02, which means $\bar{Q}/Q$ = 0.05 ± 0.02. Now in
our model, q(x) and $\bar{q}(x)$ are equal near x = 0. Experimentally,
their sum is around 1, so they both
start at about 0.5 at x = 0. The small $\bar{Q}/Q$ ratio
indicates that $\bar{q}(x)$ must fall off rapidly as x
increases, giving a picture like:

[Figure: sketches of q(x) and $\bar{q}(x)$ against x from 0 to 1.0, each on a vertical scale up to 1.0.]

This means that, except for small x, we have
    $$d^2\sigma^{\bar{\nu}}/dx\,dy = (1-y)^2\, d^2\sigma^{\nu}/dx\,dy,$$
with $d^2\sigma^{\nu}/dx\,dy$ being independent of y. Close to
x = 0, on the other hand, the two cross sections
become equal to about
    $$0.5[1 + (1-y)^2].$$
The sum of the two total cross sections is $(4/3)(Q+\bar{Q})$,
and we can get a prediction for this from the electron
scattering data. For we have
    $$Q + \bar{Q} = (9/5)\int_0^1 f_2^{ed}(x)\,dx - (2/5)\int_0^1 [\xi(x) + \bar{\xi}(x)]\,dx.$$
Using the experimental result of about 0.31 for the
integral of $f_2^{ed}$, then $(4/3)(Q + \bar{Q})$ is about 0.74 if
we neglect the strange quark contribution. The
latter should not lower this by more than 10%. The
data at the present time fit this number very well.
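
The numbers in the last few paragraphs can be reproduced with a short calculation. This is a sketch for illustration only, not part of the lecture; the function names are assumptions, and the inputs 0.38 and 0.31 are the values quoted in the text.

```python
def integral_of_one_minus_y_squared(steps=1000):
    """Midpoint-rule estimate of the integral of (1-y)^2 for y from 0 to 1."""
    h = 1.0 / steps
    return sum((1.0 - (i + 0.5) * h) ** 2 for i in range(steps)) * h


def total_cross_sections(Q, Qbar):
    """Return (sigma_nu, sigma_nubar) for an average nucleon."""
    return Q + Qbar / 3.0, Qbar + Q / 3.0


def antiquark_to_quark_ratio(r):
    """Invert r = sigma_nubar / sigma_nu for the ratio Qbar / Q."""
    return (r - 1.0 / 3.0) / (1.0 - r / 3.0)


def predicted_sum(f2_ed_integral, strange_integral=0.0):
    """(4/3)(Q + Qbar) predicted from the electron scattering data."""
    q_total = (9.0 / 5.0) * f2_ed_integral - (2.0 / 5.0) * strange_integral
    return (4.0 / 3.0) * q_total


if __name__ == "__main__":
    print("y integral:", integral_of_one_minus_y_squared())
    print("Qbar/Q from r = 0.38:", antiquark_to_quark_ratio(0.38))
    print("(4/3)(Q + Qbar):", predicted_sum(0.31))
```

The factor 1/3 in the total cross sections is just the y integral of $(1-y)^2$, and the ratio 0.38 inverts to a $\bar{Q}/Q$ of roughly 0.05.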
Of course, when more detailed data are available, 

we will be able to extract the u(x), etc., and make 
more detailed predictions. I have discussed here 
mostly results for which data are now being taken, 

in particular by Barish et al., at NAL. Therefore 

we expect soon to get more accurate data at higher 
energy, so that the specific forms for $d^2\sigma^{\nu}/dx\,dy$ or $d^2\sigma^{\bar{\nu}}/dx\,dy$
expected by the model that partons are quarks may
be tested. There is enough redundancy to independently
check scaling and assumptions about the $q^2$
dependence of the weak interaction. We are,
therefore, at a very intriguing moment in high-energy
physics where we can do an experiment for
which we have such definite predictions and are on 
the verge of being able to decide about a very funda- 
mental property of strongly interacting systems. 

We have put this property in the language "charged
partons are quarks," but other theoretical interpretations
of the same results, say, in terms of current
commutator rules, will not alter the fact that if 
the predictions work, something profound has been 
established that must be directly incorporated into 
future theory. 

Another experiment that poses some problems for 
the quark parton model measures $e^+e^- \to$ any hadrons.
We would expect this process to be described by the
diagram

[Figure: $e^+e^-$ annihilate into a virtual photon, which produces a quark-anti-quark pair.]

where the quarks subsequently generate the final
hadrons. But in this case, the ratio of the cross
section for this process to that for $e^+e^- \to \mu^+\mu^-$
should be equal to the sum of the squares of the
quark charges; namely, 4/9 + 1/9 + 1/9 = 2/3. The
data indicate a value greater than 2. Of course,
if there were three "colors" of quarks, the prediction
would be 3×(2/3) = 2, which is closer to the
experiment. There are other reasons for introducing 
the color quantum number, and we might as well review 
a few of them here. One has to do with the spin- 
statistics problem. Consider a $\Delta^{++}$: in the non-
relativistic quark model this is made up of three 
u's in a relative s-state with all their spins 
aligned. This is impossible for real fermions 
according to the spin-statistics theorem. Now maybe 
if quarks can't get out, the theorem doesn't apply, 
but another way out is to make them distinguishable 
by, say, coloring one of them chartreuse, one laven- 
der, and one beige. (This is not the standard color 
notation.) The three values of the color quantum 
number can be incorporated into an $SU_3$ symmetry.
This then can be used to explain why the physical 
hadrons have 0 or 3 quarks, but not 1, 2, 4, ..., 

by assuming that we only get binding for a state 
that is a color singlet. The baryons will then 


always have one quark of each color.

LECTURE 4
HIGH-ENERGY HADRONIC COLLISIONS


In this lecture, I am going to discuss high-energy 
hadronic collisions. The idea of partons--the study 
of the wavefunction at high energy--was, as I 
explained, originally invented to deal with hadronic 
collisions. The development of the ideas was guided 
by some of the experimental features of hadron 
physics, and the net result was to predict some 
properties to be expected in these collisions at 
high enough energies. The observed fact of limited 
transverse momentum was used to get the idea that 
partons do not interact strongly when their relative 
rapidity is high. As we have seen in previous lec- 
tures, this led to the notion that the wavefunction 
was likely to scale and therefore, after the colli- 
sion is over, it was a very good guess that the 
products would scale. So the scaling principle was 
established in my mind in this way, and the idea of 
a plateau in the rapidity plot. These two ideas are 
really about all that I have been able to extract as 
far as hadronic collisions are concerned. Nevertheless,
in thinking about these things, I also considered
some other physical views of high-energy
collisions, which corroborated the ideas I was
getting from the parton view, some of which I will
discuss here.

For example, later on in this lecture I shall 
consider some suggestions about what is known as 
Regge behavior, by which I mean the power law fall- 
off with s of cross sections for processes involving 
an exchange of quantum numbers. I specifically 
exclude from that discussion processes in which no 
quantum numbers are exchanged; that is, those 
involving what is called "Pomeron exchange." These
I regard as being elastic scattering plus diffraction 
dissociation. I want to separate the phenomena into 
diffraction effects and "true" inelastic processes: 

    $$\sigma(\text{TOTAL}) = \sigma(\text{DIFFRACTION}) + \sigma(\text{TRUE INEL.}),$$
where
    $$\sigma(\text{DIFFRACTION}) = \sigma(\text{ELASTIC}) + \sigma(\text{DIFF. DISS.}).$$
This is based on a picture in which an incoming wave
is absorbed in some finite region. The absorption 
is related to the true inelastic processes, and the 
elastic scattering is connected to it through the 
optical theorem--it is a shadow effect in which the 
incident wave undergoes diffraction into the shadow 
region. If the incident wave is not simply a funda- 
mental particle, but is structured, the various parts 
may be diffracted differently, leading to an excitation
or breaking up of the wave. This we call
diffraction dissociation, and will regard as distinct
from the fundamentally inelastic processes. I will
discuss the diffraction effects in more detail in 
the next lecture, and give a mathematical definition 
of the separation of the cross section into diffrac- 
tive and true inelastic parts. The ideas I am going 
to discuss do not take into account the possibility 
(suggested by experiments in the last year) that the 
total cross section may be rising as some power of 
log s; I have not modified the arguments to include 
this kind of behavior. 

Now let's discuss the ideas that come from the 
parton picture of the wavefunction. The observation 
of finite transverse momenta has led us to a notion 
which has been used again and again--the idea that 
when the relative momentum of the partons is very 
large, their interaction is very small. Only finite 
relative rapidity and finite transverse momenta are 
generated by the interaction, which means the wee 
partons come from a long cascade and become indepen- 
dent, since there are so many steps in the cascade, 
of how it started. It seems to me that all these 
ideas are coming from one physical principle, which 
is that the couplings in the Hamiltonian are such 
that direct interactions among partons of high-rela- 
tive momentum are very small. If this is the case, 
then in a very high-energy hadronic collision, pictured
as below:

[Figure]

the interaction proceeds through the wee partons
since only these have finite relative momenta. My 
picture of how the collision proceeds is that the 
interaction of the wee partons disturbs the phase 
relationships in the hadronic wavefunctions, so that 
they come apart in some sense. That is, after the 
interaction, the relative phases between the partons 
are no longer those of the original hadron state, so 
the final state is a complicated superposition of 
hadron configurations. Of course, the partons them- 
selves cannot come out since they are not eigenfunc- 
tions of the Hamiltonian. The outgoing partons must 
distribute themselves in some way to form real 
hadrons. The way in which this occurs is generally 
local in relative momentum--not exactly, since an 
outgoing fast parton must use up some of the wee 
partons to form an outgoing hadron. But since the 
wee region is universal, the character of the hadrons 
that come out will depend on the character of the 
fast partons going in the same direction. This is 
essentially the idea of limiting fragmentation,
which, when it was suggested by Yang at the Stony
Brook conference in 1969, I realized fit in with my
parton ideas very well. 

These ideas are nicely represented on a rapidity 
plot, where we measure the rapidity to the right of 
the origin for particles moving to the right, and to 
the left of the origin for those moving to the left. 
If we have an incoming particle A with momentum $P_A$
to the right, its rapidity is

    $\tfrac{1}{2}\log\left[4P_A^2/(\mu^2+Q^2)\right]$,

which is approximately $\log 2P_A$ for large $P_A$ (in
units of the transverse mass $\sqrt{\mu^2+Q^2}$). This
will be the right end of the rapidity plot. If the
other particle B has large momentum $P_B$ to the left,
the left end of the plot will be at $-\log 2P_B$. The
idea of limiting fragmentation tells us that the
distribution of products which fall near the ends of
the rapidity plot will be determined by the nature
of the incoming particle corresponding to that end,
and will be independent of what goes on at the other 
end. Now in between, at finite values of y, if the 
idea of a dx/x distribution is correct, the distri- 
bution will be flat on either side of y = 0, and will 
match in the middle. The matching is guaranteed by 
the fact that a boost along the longitudinal direc- 
tion would simply shift the plot by a finite amount. 
So we get two fragmentation regions separated by a
flat plateau:

[Figure: rapidity plot extending from $-\log 2P_B$ on
the left to $\log 2P_A$ on the right, with the fragments
of B at the left end, the plateau region in the
middle, and the fragments of A at the right end.]


The total rapidity difference between the ends of the
plot is

    $\log 2P_A + \log 2P_B = \log 4P_AP_B = \log s$.
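
The rapidity formulas above can be checked numerically. The following is a minimal sketch, assuming a transverse mass $\sqrt{\mu^2+Q^2}$ of 1 GeV and a center-of-mass collision in which s is approximately $4P_AP_B$; the function names are illustrative and do not come from the lecture.

```python
import math


def rapidity(p_long, transverse_mass):
    """Exact rapidity y = (1/2) log((E + p_z)/(E - p_z)).

    Uses the identity (E + p_z)(E - p_z) = m_T^2, so that
    y = log((E + p_z)/m_T), which avoids the cancellation
    in E - p_z at large momentum.
    """
    energy = math.sqrt(p_long**2 + transverse_mass**2)
    return math.log((energy + p_long) / transverse_mass)


def rapidity_large_p(p_long, transverse_mass):
    """Large-momentum form (1/2) log(4 p_z^2 / m_T^2)."""
    return 0.5 * math.log(4.0 * p_long**2 / transverse_mass**2)


def plot_width(p_a, p_b, transverse_mass=1.0):
    """Rapidity distance between the two ends of the plot."""
    return (rapidity_large_p(p_a, transverse_mass)
            + rapidity_large_p(p_b, transverse_mass))


if __name__ == "__main__":
    m_t = 1.0  # assumed transverse mass sqrt(mu^2 + Q^2), in GeV
    for p in (10.0, 100.0, 1000.0):
        print(p, rapidity(p, m_t), rapidity_large_p(p, m_t))
    # With s taken as 4 * P_A * P_B, the width should equal log s.
    print(plot_width(100.0, 100.0), math.log(4.0 * 100.0 * 100.0))
```

The exact and approximate rapidities differ by a term of order $m_T^2/4P^2$, which is why the ends of the plot can be placed at $\log 2P$ once the momentum is large.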

This picture of the breakup of the hadron wave- 
function due to disturbance of the phase relation- 
ships of the wee partons is not an honest conclusion 
of the parton model itself. What is missing is some 
description of how, given the initial wavefunction 
represented in terms of partons, the Hamiltonian acts 
to produce the outgoing hadrons. You don't see the 
wavefunctions of all the hadrons in this picture. 
Once, when I was working on liquid Helium, I went to 
a meeting and Pauli was there. He asked me to come 
to his hotel room to tell him about my work. I tried 
to describe to him the character of the wavefunction 
when you change the atoms around--how the amplitude 
had to shift, and how, when they were close together, 
it had to go down, and so on. After forty minutes
he said to me, "Feynman, you've been waving your arms
at me for forty minutes. Stop waving your arms and
write an equation." Well in this particular field I 
haven't gotten to the condition where I can write an 
equation--I'm still waving my arms after four years-- 
and I appreciate the lack! 

To summarize, the best way to put the physical 
assumption is not to say how it works, whether by 
disturbing the phases or some other mystery, but to 
say that the wavefunction after interaction is a 
wavefunction which represents a large number of 
hadrons coming out which are distributed in momentum 
like the partons in the qualitative sense of limited 
transverse momentum, scaling in longitudinal momentum,
and a dP/E behavior for small x (central
plateau in y). And that the quantum numbers of the
hadrons are determined locally by the quantum numbers 
of the partons, so that the character of the parti- 
cles that come out in a certain direction depends 
only on the incoming particle that was going in that 
direction. Although I was not aware of it, every one 
of these features had already been suggested a decade 
ago by Kenneth G. Wilson (Acta Physica Austriaca 17, 
37 (1963)). 

I would like to present some of the arguments 
that led me to think that the wee partons were
probably the only ones that interacted, which is an
essential part of the idea. First, suppose we represent
a particle moving to the right by

    $|\Psi_A\rangle = F_A^*|0\rangle$,

where $F_A^*$ is some operator which contains creation
operators only for partons moving to the right and,
of course, as we expect, wee partons. But suppose
for the sake of argument that the wavefunction
actually turned out to have a vanishing number of wee
partons as $P \to \infty$. Similarly, suppose

    $|\Psi_B\rangle = F_B^*|0\rangle$

is a similar particle moving to the left. The
Hamiltonian is such that these are eigenfunctions, so

    $H|\Psi_A\rangle = E_A|\Psi_A\rangle$;
    $H|\Psi_B\rangle = E_B|\Psi_B\rangle$.

That is, $|\Psi_A\rangle$ is an eigenstate of the Hamiltonian
involving only right-moving particles and $|\Psi_B\rangle$ a
solution containing only left. The Hamiltonian must 
be of such a kind as to permit such a solution. It 
is not rigorously necessary perhaps; but it is very
suggestive that if such is the case, the product
wavefunction, involving as it does only one or another of
these kinds of particles--each in a distribution
which is a solution of the Hamiltonian--would also
be an eigenfunction of the Hamiltonian. To take the
simplest example, of course, assume the Hamiltonian
can be split into pieces $H_A$ and $H_B$ such that $H_A$
doesn't contain any of the operators in $F_B^*$, and $H_B$
doesn't contain any of those in $F_A^*$. If we consider
the state

    $F_A^*F_B^*|0\rangle$,

we have

    $H(F_A^*F_B^*)|0\rangle$
      $= (H_A + H_B)F_A^*F_B^*|0\rangle$
      $= F_B^*(H_AF_A^*|0\rangle) + F_A^*(H_BF_B^*|0\rangle)$.

This last follows since our assumptions mean that $H_A$
and $F_A^*$ commute with $F_B^*$, and $H_B$ commutes with $F_A^*$.
But then this is

    $(E_A + E_B)F_A^*F_B^*|0\rangle$,

so the wavefunction is an eigenfunction of H. This
means that the state consisting of two particles
moving in opposite directions is an eigenstate of the 
Hamiltonian--they don't interact, or better, the 
interaction falls to zero as s increases. This argu- 
ment will fail if there are some partons common to 
both particles, and a constant cross section suggests 
that the number of these, necessarily wee partons, 
does not fall with P. 
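
The structure of this argument can be illustrated with a finite-dimensional toy model. The sketch below is only an analogy under stated assumptions: two 2x2 matrices stand in for $H_A$ and $H_B$, a Kronecker product stands in for the product state, and a hypothetical coupling term that acts on both factors at once stands in for partons common to both particles. None of the names or numbers come from the lecture.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def kron_vec(u, v):
    """Product state u (x) v as a flat list."""
    return [ui * vk for ui in u for vk in v]


def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matvec(m, v):
    """Matrix times vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def eigenvalue_or_none(h, vec, tol=1e-12):
    """Return lam if h applied to vec equals lam * vec, else None."""
    hv = matvec(h, vec)
    pivot = next(i for i, x in enumerate(vec) if x != 0)
    lam = hv[pivot] / vec[pivot]
    ok = all(abs(hx - lam * x) <= tol for hx, x in zip(hv, vec))
    return lam if ok else None


IDENTITY = [[1, 0], [0, 1]]
H_A = [[2, 1], [1, 2]]   # toy Hamiltonian for the right-mover
H_B = [[0, 1], [1, 0]]   # toy Hamiltonian for the left-mover
PSI_A = [1, 1]           # eigenvector of H_A with E_A = 3
PSI_B = [1, -1]          # eigenvector of H_B with E_B = -1

# H = H_A + H_B, each piece acting on its own factor only.
H_FREE = add(kron(H_A, IDENTITY), kron(IDENTITY, H_B))

# Hypothetical term acting on both factors at once.
Z = [[1, 0], [0, -1]]
H_COUPLED = add(H_FREE, kron(Z, Z))

PRODUCT = kron_vec(PSI_A, PSI_B)

if __name__ == "__main__":
    print(eigenvalue_or_none(H_FREE, PRODUCT))     # E_A + E_B
    print(eigenvalue_or_none(H_COUPLED, PRODUCT))  # None
```

With the split Hamiltonian the product state has eigenvalue $E_A + E_B$; once a term touching both factors is added, it stops being an eigenstate, which is the sense in which shared partons permit an interaction.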

As an example to illustrate how the wee partons 
from the two hadrons mesh together, consider the 
model discussed in Lecture 2, where we had the 
wavefunction

    $|\Psi_A\rangle = \exp\{\sum_k c(k)\,a^*(k)\}|0\rangle$,

with

    $c(k) = a/[(\omega - k_z)\,\omega^{3/2}]$

for a particle moving to the right. We know this
gives a dx/x distribution, and wee partons, so we
expect interaction in this case. A left-moving
particle would have a wavefunction

    $|\Psi_B\rangle = \exp\{\sum_k c'(k)\,a^*(k)\}|0\rangle$,

with

    $c'(k) = a/[(\omega + k_z)\,\omega^{3/2}]$.

Now suppose we look at the wavefunction that results
from multiplying these operators--

    $|\Psi_{AB}\rangle = \exp\{\sum_k [c(k) + c'(k)]\,a^*(k)\}|0\rangle$.

The coefficient is (note $\omega^2 - k_z^2 = \mu^2 + Q^2$)

    $c(k) + c'(k) = 2a/[\omega^{1/2}(\mu^2 + Q^2)]$,

where Q is the transverse momentum. We see that the
distribution has a $dk_z/\omega \to dx/x$ behavior. It is also
easy to check that this wavefunction is not an 
eigenfunction of the Hamiltonian. Now this initial 
distribution of partons is disturbed by future 
applications of the Hamiltonian in producing the 
time evolution to the final state. 

One may guess that the energy dependence, at 
least, of the interaction cross section is measured 
by the degree that these initial wavefunctions
overlap; i.e., contain common partons. We may calculate
the amplitude for this, $\langle\Psi_B|\Psi_A\rangle$, which comes out as
$\exp\{-\sum_k c'(k)c(k)\}$. The quantity in the exponent
becomes

    $a^2\int d^2Q/(\mu^2 + Q^2)^2$,

a finite number.

To return to the subject of Regge behavior, I 
want to discuss some physical ideas which, although 
not really from the parton model, go along with it in
supporting our conclusions. As we have said, the
main feature we want to understand is why the cross
section for processes involving the exchange of
quantum numbers falls as some power of the energy.
Look at a process like:

[Figure: the charge-exchange reaction
$\pi^+ + n \to \pi^0 + p$.]

We see here that during the collision a current is
suddenly reversed. The incoming neutron has
3-component of isospin -1/2, and this is flipped to
give an outgoing proton with 3-component of isospin 
+1/2; the incoming pion has its 3-component of 
isospin suddenly changed from +1 to 0. We may
imagine that the 3-component of isospin carried in by
the $\pi$ is suddenly turned around during the brief
time of collision. Now Gell-Mann has been
emphasizing that hadronic systems are coupled to currents,
and it seems very likely that the $\rho$, for instance,
is coupled to the isospin current. So we can ask,
when there is such a tremendous time rate of change
of a current, why don't we see radiation of the 
particles coupled to the current? The answer is that 
there is radiation--but when we are looking at a 
specific exclusive process such as the two-body 
charge-exchange one above, we are asking for that 
fraction of the cross section which has no such 
radiation. In other words, we are asking for the 
probability that there is a rapid acceleration of the 
isospin current without a corresponding radiation of 
the particles coupled to isospin. If you take a 
theory like bremsstrahlung, for example, it turns out 
that for a sudden reversal like this, you find more 
and more radiation as the energy goes up; in fact the 
mean number of particles radiated increases as log s. 
So the probability of no radiation is proportional to
$\exp(-a\bar{n}) = s^{-a}$. In the theory of bremsstrahlung, this
logarithmic growth of $\bar{n}$ is produced only if the
particles which are going to be radiated have a $dk/\omega$
distribution. In general, we would like to say that
the probability of a process which doesn't radiate
any particles in a gap of rapidity $\Delta y$ goes like
$\exp(-a\,\Delta y)$. In a two-body $\to$ two-body reaction, $\Delta y$ is
of order log s and we get the $s^{-a}$ behavior of Regge
theory. The constant a presumably depends on the
quantum numbers being exchanged, since they are
related to the current that is being accelerated.
(See note at end of this lecture.)


Another point: consider the processes

    $p + \pi^+ \to \Delta^{++} + \pi^0$

and

    $p + \pi^+ \to (p\,\pi^+) + \pi^0$,

where in the second case the (mass)$^2$ of the $p\pi^+$
system might be nowhere near the resonance region.
The first case is just like the one treated, and we
expect a power law behavior like $s^{-a}$. In the second
case, if we consider the p and $\pi^+$ at specific x
values (so that incidentally the invariant mass
squared of the outgoing $p\pi^+$ system is fixed), then
in the rapidity plot we are still asking for no
particles in a gap which grows as log s, and the cross
section again falls as $s^{-a}$. The constant a will be
the same in both cases, since the same quantum
numbers are exchanged. More generally, the
probability for any process like $A + B \to C + D$ should
behave like

    $s^{-a}F(M_D^2)$

for $M_D^2$ finite, regardless of whether D is a specific
resonance or not, since finite $M_D^2$ means that D
consists of a finite number of hadrons and there will
still be a gap in the rapidity plot which grows as
log s. Diffraction dissociation processes are being
excluded from this discussion--when there are no 
quantum numbers exchanged, we have no reversal of 
current and no reason to expect radiation. In that 
case the physical situation is quite different. 

We can make a multi-particle generalization of 
the Regge ideas. Suppose we look at a process which 
is partially exclusive. For example, suppose in the 
final state we allow any number of pions but no K's. 
Then by the same logic, we expect the probability of
this to be $\exp(-a\bar{n}_K)$, where $\bar{n}_K$ is the mean number of
K's expected to be produced in a completely inclusive
collision. Physically, we expect the plateau region
in rapidity to become universal as it widens, and so
the mean number of any particle should grow as the
width of the plateau; that is, like log s. Thus $\bar{n}_K$
goes as log s so we get a power law fall-off for the
cross section in this case, too, although the powers 
are not the standard Regge powers. 
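
The step from a logarithmically growing mean number to a power-law cross section can be made concrete. The sketch below assumes independent (Poisson) emission, which is the usual bremsstrahlung picture but is an assumption here, and it absorbs the constant a of the text into a hypothetical `density` parameter, the mean number produced per unit of log s.

```python
import math


def mean_number(s, density):
    """Mean number produced, growing like log s (plateau of fixed height)."""
    return density * math.log(s)


def probability_of_none(mean):
    """Poisson probability of producing nothing when `mean` are expected."""
    return math.exp(-mean)


def exclusive_fraction(s, density):
    """Fraction of events containing none of the particles in question."""
    return probability_of_none(mean_number(s, density))


if __name__ == "__main__":
    density = 0.5  # illustrative value only
    for s in (1e2, 1e3, 1e4):
        # The two columns agree: exp(-density * log s) = s**(-density).
        print(s, exclusive_fraction(s, density), s ** (-density))
```

The exponent of the resulting power law is just the height of the plateau for the excluded particles, which is why these powers need not be the standard Regge ones.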

The next case I want to consider uses the same 
logic but predicts something quite different.
Consider the process $A + B \to C$ + "anything," and suppose
that C carries almost all the momentum of A; that is,
$x_C$ is nearly 1, but not exactly 1. In the rapidity
plot, C will be at a y of order $\log 2P_A$. Now in the
"anything" we can have some things that are also
moving to the right, but they can carry a fraction of
$P_A$ which totals only $1-x_C$. So we have a gap in the
rapidity plot from $\log 2P_A + \log(1-x_C)$ up to $\log 2P_A$,
giving $\Delta y = -\log(1-x_C)$. The probability of this goes
as

    $\exp(-a\,\Delta y) = (1-x_C)^a$.

The distribution function will be the differential of
this, and go as $(1-x_C)^{a-1}dx_C$. The gap carries the
quantum numbers of C-A and so a can be determined
from the Regge parameter $\alpha$ found in a two-body
process with the same quantum numbers exchanged, via
$a = 2(1-\alpha)$. To test this experimentally, there are
some technical questions of background, particularly 
from diffraction dissociation, which must be taken 
into account, and I don't know whether it has been 
checked very well. Since this idea was developed, 
there has been a very beautiful theory by A. Mueller 
which makes it mathematically understandable how to 
connect these processes. May I refer you to an
excellent report on the theory of high-energy
collisions, in general, and these points in particular,
by Mueller at the 1972 NAL conference (A. H. Mueller,
"Production Processes at High Energy," Proceedings
of the XVI International Conference on High-Energy
Physics, Vol. 1, p. 347).
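
The gap argument for $x_C$ near 1 reduces to three short formulas, sketched below. The function names are illustrative, and the intercept used in the example, $\alpha = 0.5$, is an assumed value chosen only to give a round exponent.

```python
import math


def regge_to_gap_exponent(alpha):
    """a = 2(1 - alpha), the relation quoted in the text."""
    return 2.0 * (1.0 - alpha)


def gap_width(x_c):
    """Rapidity gap left empty when C carries a fraction x_c of P_A."""
    return -math.log(1.0 - x_c)


def gap_probability(x_c, a):
    """exp(-a * gap), which equals (1 - x_c)**a."""
    return math.exp(-a * gap_width(x_c))


def x_distribution(x_c, a):
    """Magnitude of d/dx_c of the gap probability: a (1 - x_c)**(a - 1)."""
    return a * (1.0 - x_c) ** (a - 1.0)


if __name__ == "__main__":
    a = regge_to_gap_exponent(0.5)  # assumed intercept, gives a = 1
    for x_c in (0.5, 0.9, 0.99):
        print(x_c, gap_width(x_c), gap_probability(x_c, a),
              x_distribution(x_c, a))
```

The gap widens only logarithmically as $x_C$ approaches 1, so the suppression is a power of $1-x_C$ rather than an exponential.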

There is another interesting consequence of 
this, which I have developed through conversations 
with Bjorken, and which seems to me an important 
principle. (It also follows immediately in Mueller's 
analysis.) Consider again the reaction $A + B \to C + D$,
where D now can be anything, including various
resonances. Suppose the only property of D we measure
is its invariant mass $M_D^2$. For $M_D^2$ finite, as we have
already remarked, the probability for the reaction
goes as $s^{-a}F(M_D^2)$. Now if all the momenta are large,
we have

    $M_D^2 = (p_A + p_B - p_C)^2 \approx s(1-x_C)$.

Thus finite $M_D^2$ must come from $1-x_C$ very small; that
is, from $x_C$ very close to 1. We have already argued
that the probability of finding $x_C$ near 1 goes as
$(1-x_C)^a$. In order for these two distributions to fit
onto one another, we see that we must have $F(M_D^2)$ go
like $(M_D^2)^a$ for $M_D^2$ large enough to be in the
transition region where the distributions join.
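
The matching condition can be verified directly: with $F(M_D^2)$ proportional to $(M_D^2)^a$ and $M_D^2 = s(1-x_C)$, the resonance form $s^{-a}F(M_D^2)$ loses its dependence on s and becomes $(1-x_C)^a$. A minimal sketch, with an arbitrary normalization `norm` and illustrative function names:

```python
import math


def missing_mass_squared(s, x_c):
    """M_D^2 ~ s (1 - x_c) when all momenta are large."""
    return s * (1.0 - x_c)


def resonance_form(s, m2, a, norm=1.0):
    """s**(-a) * F(M^2), using the matching form F(M^2) = norm * (M^2)**a."""
    return s ** (-a) * norm * m2 ** a


def scaling_form(x_c, a, norm=1.0):
    """norm * (1 - x_c)**a, the behavior argued for x_c near 1."""
    return norm * (1.0 - x_c) ** a


if __name__ == "__main__":
    a, x_c = 1.3, 0.95  # illustrative values only
    for s in (1e2, 1e3, 1e4):
        m2 = missing_mass_squared(s, x_c)
        print(s, resonance_form(s, m2, a), scaling_form(x_c, a))
```

Any other power of $M_D^2$ in F would leave a residual power of s, and the two descriptions would disagree in the region where both are supposed to hold.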

To see what this means, suppose we plot the 
invariant differential cross section, first as a 
function of $1-x_C$, for two different values of s.

[Figure: invariant differential cross section versus
$1-x_C$, in two panels labelled "Smaller s" and
"Larger s"; each panel shows a resonance region at
small $1-x_C$ joining onto a scaling region.]

As s gets larger, the resonance region shrinks in
size, because of the $s^{-a}$ dependence, and is squeezed
into a smaller region of $x_C$, since the transition
occurs at some $1-x_C$ proportional to $M_D^2/s$. The two
regions will still join smoothly to one another. Now
look at the picture if we plot the distribution as a
function of $M_D^2$.

[Figure: the same distribution versus $M_D^2$, in two
panels labelled "Smaller s" and "Larger s", the
second with its ordinate multiplied by $s^a$.]

In this case the two pictures look alike, provided
in the second we have scaled up the ordinate by $s^a$.
The thing that is important about this is that you 
can never tell the resonances from the background. 
The scaling region and the resonance region join 
smoothly in such a way that we can't change the 
experimental conditions--say by going to higher 
energy--to isolate the resonances so that they become
cleaner. There is a kind of "complementarity," if
you will, between resonances and the background due
to scaling or something else. (This complementarity
doesn't work in the case of diffraction dissociation,
I believe, and I'll discuss it in the next lecture.)
The Drell-Yan relationship comes from the same
principle applied to compare the deep-inelastic
scattering and the deep-elastic or deep-resonance
production.

What I mean by this “complementarity” is that 
different physical ideas are useful in different 
physical regions. For low $M_D^2$, we have a picture in
which the outgoing system is made up of resonances.
As $M_D^2$ goes up, the picture becomes more complicated--
we get more and more resonances and they begin to 
overlap, so the resonance picture becomes hard to 
use (but presumably not incorrect in principle). But 
in this region another physical picture can replace 
it--the ideas of scaling that come from a bremsstrah- 
lung type of theory. The pictures are completely 
different, but nature doesn't care how you think 
about her. She fits together so you'll never make 
up your mind precisely where one picture stops and 
the other begins. 

Note relative to Regge behavior: 

We have suggested that in a two-body exchange 
collision, the perturbation expression for the 
amplitude for the exchange of a particle carrying
quantum numbers may be altered by an additional
factor $s^{-c}$ because of the requirement that the
rapidly reversed current not radiate. The Regge
theory coupled with the idea that trajectories seem
to have a common slope $\alpha'$ suggests that this
additional power is the difference of $\alpha$ at t = 0, $\alpha(0)$,
and $\alpha$ at the mass of the perturbation particle
exchanged, $t = \mu^2$. Thus $c = \alpha'\mu^2$. Hence for $\pi$
exchange where $\mu^2$ is very small, there is very little
correction (from the perturbation theory amplitude
1/s), but for $\rho$ exchange the correction is nearly 1/2.
Yet the quantum number exchange, say, the 3-component
of isospin, is the same. Can the radiation theory
understand at least qualitatively (it does not
quantitatively) that the exchange of a light mass makes
a smaller correction for lack of radiation than does
the exchange of a larger mass?

If you calculate by bremsstrahlung theory the 
high-energy behavior of the amplitude to exchange a 
particle of mass $\mu$ without radiating a vector
particle of mass m, the result is, for a collision at
impact parameter b,

    amplitude $= K_0(\mu b)\exp\{-g^2(\log s)K_0(mb)\}$,

where $g^2$ is proportional to the square of the
coupling constant of the radiation, and $K_0(mb)$ is
the Bessel function (two-dimensional projection of
the Yukawa function $\exp(-mr)/r$) which falls for
large b as $\exp(-mb)$. Now consider, for example,
scattering forward, the integral over b of the above
amplitude. For small $\mu$ (say $\mu \ll m$), large b is
important where $K_0(mb)$ in the exponent is small and
the effect of the radiation is small. On the other
hand, if $\mu$ is not small, the range of b is limited
to regions where $K_0(mb)$ is appreciable and the
correction is larger. Thus we obtain the correct
qualitative result.

Is there a suggestion here that data for negative
t ought at least to be analyzed empirically, not by
assuming an inverse power law in s for each momentum
transfer Q, $Q^2 = -t$, but rather an inverse power of
s for each impact parameter b, ultimately Fourier
transformed by $\exp(i\mathbf{Q}\cdot\mathbf{b})$ to get the Q dependence?

LECTURE 5
DIFFRACTION DISSOCIATION 
AND SOME PROBLEMS FOR THEORISTS 


In the last lecture, I discussed the inelastic 
hadronic collisions from the point of view of the 
parton model, but I pointed out that I was separating 
the total cross section for all events into three 
kinds: elastic events, those inelastic events 
associated with the elastic which I called diffrac- 
tion dissociation, and the true inelastic events. 
The previous lecture dealt only with the last of 
these. If we picture a collision in terms of impact 
parameter, a target has a certain effective size for 
absorbing the incoming wave. At large impact
parameters, the incoming wave passes by undisturbed,
but in the region behind the target, it will be
altered in intensity and possibly in phase--a shadow 
is formed. We all know that as a result of this, 
the shadow diffracts. If the incoming wave were a 
simple particle, the diffraction would just produce 
elastic scattering, which could be related to the 
inelastic effects producing the shadow through the 
optical theorem. Now, as Good and Walker pointed 
out, as is generally true in other areas of physics 
such as atomic physics, if the incoming wave
describes a system made up of parts, then there are
two ways in which the picture is altered. First, the
amplitude for absorption by the target may be dif- 
ferent for the various parts, so that what is behind 
the target is a distorted wavefunction with a dif- 
ferent proportion of these parts. And second, even 
if all the parts are completely absorbed, the diffu- 
sion into the shadow from the edge will differ for 
parts of different masses, and so on. As an example 
from atomic physics, the diffraction of a hydrogen 
atom will result in a relative distortion of the 
proton and electron wavefunctions which can excite
or ionize the atom. If this picture is right, then
when the absorption is strong, most of the distortion 
of the wavefunction comes from regions near the edge 
of the target, while elastic scattering comes from 
the whole disc. For this reason, I expect the 
elastic scattering to be the biggest fraction of the 
diffractive effects--I will be surprised if the dif- 
fraction dissociation turns out to be bigger than 1/2 
the elastic. The diffracted wave, both the elastic 
and diffraction dissociation parts, carries the 
quantum numbers of the incoming wave. 

We can describe the diffractive part of the 
scattering in terms of a probability of exciting a 
state of invariant mass M, $F(M^2)\,dM^2$. The elastic
scattering contributes a $\delta$-function at the mass of
the incoming particle. I have studied a number of
models of systems with parts which are differentially
absorbed and have always discovered, to my surprise,
that the probability of excitation is small and falls
off very rapidly with $M^2$. I would guess that in
hadronic collisions, $F(M^2)$ falls essentially to zero
by M of the order of a few GeV. Now, if all the
views that I have regarding this are right, I can 
define the diffraction dissociation in an interesting 
way. Suppose we have the cross section for finding
the products of a collision with fractions of the
longitudinal momentum $x_1$, $x_2$, etc., and with
transverse momenta $\vec{Q}_1$, $\vec{Q}_2$, etc., at some total
center-of-mass (energy)$^2$ s. We call this

    $\sigma(x_1,\vec{Q}_1; x_2,\vec{Q}_2; \ldots; s) = \sigma(\{x,Q\}; s)$,

where by {x,Q} we mean some set of x's and Q's. We
will define the limit of this as $s \to \infty$ to be the
diffractive cross section for this particular set of
x's and Q's, and write

    $\lim_{s\to\infty}\sigma(\{x,Q\}; s) = \sigma_D(\{x,Q\})$.

Now at a given value of s, the total cross section is

    $\sigma_{TOT}(s) = \sum_{\{x,Q\}}\sigma(\{x,Q\}; s)$,

where the sum is over all allowed sets of x's and Q's.
Let's suppose that the total cross section approaches
a constant at high energy:

    $\lim_{s\to\infty}\sigma_{TOT}(s) = \sigma_{T\infty}$.

We then define the diffractive part of this to be

    $\sigma_{D\infty} = \sum_{\{x,Q\}}\sigma_D(\{x,Q\})$.

These two cross sections, $\sigma_{T\infty}$ and $\sigma_{D\infty}$, are different
in general, since in the first we take the limit of
a sum, and in the second, the sum of the limits. We
expect that processes involving the exchange of
quantum numbers will fall off as some power of 1/s,
so that $\sigma_D(\{x,Q\})$ and therefore $\sigma_{D\infty}$ will involve only
processes with no quantum number exchange.


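To clarify these ideas, suppose we see what we
should expect for a certain class of collisions:

[Figure: a collision in which one incoming proton
emerges as a proton, while the other incoming
particle is excited to a state of invariant mass
squared $M^2$.]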


Here we require that one incoming proton emerge as a 
proton with x near 1, while the other incoming par- 
ticle may break up to form a state of invariant mass 
M. If we considered only the true inelastic effects, 
we would expect a distribution that depended only on 
x by scaling. Since x is near 1, it is convenient to
use the variable x' = 1-x. The elastic scattering
would contribute a $\delta(x')$, which we take out. The
diffraction dissociation will add some function of $M^2$,
which for large s and x near 1, is approximately

    $M^2 = s(1-x) = sx'$.

The overall distribution, ignoring the elastic part,
looks like the sum

    $f(x')\,dx' + F(M^2)\,dM^2$.

If we want to plot this as a function of x', we write
it as

    $[f(x') + sF(sx')]\,dx'$,

which gives a picture like:

[Figure: the distribution versus x', in two panels
labelled "Smaller s" and "Larger s"; each shows the
$F(M^2)$ contribution at small x' on top of the scaling
part $f(x')$.]
The part of this curve that is due to f(x')--the part 
that scales--looks the same in both pictures. The 
part due to $F(M^2)$ becomes higher as s increases,
since we are really plotting $sF(M^2)$; and, moreover,
since a given range $\Delta M^2$ corresponds to a range
$\Delta x' = \Delta M^2/s$, it is squeezed into a smaller region on
the x'-axis. Now suppose we plot the situation as a
function of $M^2$. For this we use

    $[f(M^2/s)/s + F(M^2)]\,dM^2$,

and the picture looks like

[Figure: the distribution versus $M^2$, in two panels
labelled "Smaller s" and "Larger s".]


Here we see that the background shrinks with increas- 
ing s, and so it is possible to tell diffractively 
produced resonances from the background of truly in- 
elastic collisions (including pionization, for 
example) if my picture is right. The
"complementarity" that I spoke about previously for Regge
exchanges is not expected here.
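
The contrast between the two plots can be reproduced with toy shapes. In the sketch below both `scaling_part` and `diffractive_part` are hypothetical choices made only for illustration (the lecture specifies neither): f(x') is taken as a square root and $F(M^2)$ as a unit-area exponential dying out on a scale `M0_SQ`.

```python
import math

M0_SQ = 1.0  # GeV^2; hypothetical scale on which F(M^2) dies out


def scaling_part(x_prime):
    """Hypothetical scaling distribution f(x'); any smooth shape would do."""
    return math.sqrt(x_prime)


def diffractive_part(m2):
    """Hypothetical F(M^2), normalized to unit area and falling rapidly."""
    return math.exp(-m2 / M0_SQ) / M0_SQ


def density_in_x(x_prime, s):
    """[f(x') + s F(s x')], the distribution per unit x'."""
    return scaling_part(x_prime) + s * diffractive_part(s * x_prime)


def density_in_m2(m2, s):
    """[f(M^2/s)/s + F(M^2)], the distribution per unit M^2."""
    return scaling_part(m2 / s) / s + diffractive_part(m2)


def background_fraction(m2, s):
    """Share of the M^2 distribution that comes from the scaling part."""
    return (scaling_part(m2 / s) / s) / density_in_m2(m2, s)


if __name__ == "__main__":
    for s in (1e2, 1e3, 1e4):
        # Peak height at x' = 0 grows with s; the scaling
        # background under a fixed M^2 shrinks with s.
        print(s, density_in_x(0.0, s), background_fraction(1.0, s))
```

In the x' variable the diffractive peak grows and narrows while f(x') stays put; in the $M^2$ variable the diffractive part stays put while the scaling background falls away, which is what allows the separation described above.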

If it should happen that the diffraction
dissociation gives $F(M^2)\,dM^2$ that goes as $dM^2/M^2$ for
large $M^2$, then this separation could not be made,
since at high energy the distribution would become
dx/(1-x), which scales. But I think it is very
unlikely that the probability of excitation of high-
mass states falls off so slowly. (Such a term is 
expected in theories which allow a triple Pomeron 
exchange in the Mueller analysis, but with our inter- 
pretation of "Pomeron exchange" I don't know what 
this could mean physically and do not expect it.) 

Of course, in addition to the diffractively produced
resonances there will be a substantial diffractively
produced non-resonant background, some of which is
produced, for example, by the Deck effect. This
effect is expected from our view, for as it is usually
described, say for a p-p collision, the incoming
proton wavefunction does contain some amplitude to
be, say, a neutron and a $\pi$ (two bunches of correlated
partons, if you will). The $\pi$ is scattered by the
other proton so the $n\pi$ system comes apart. This
makes a strong non-resonant effect, which falls off
rapidly enough with $M^2$ (like $dM^2/(M^2)^3$, I believe)
that it cannot be confused with a scaling background.

I should like to summarize this part of my 
lecture with a list of good theoretical questions for 
theorists to worry about concerning hadronic 
collisions. 

--First, how do the parton-parton interactions
operate to cause the breakup of the hadronic
wavefunction? I have discussed the idea that the wee partons
interact and disturb the phase relationships, but
we need some mathematical example to see how the
mechanism works.

--There are all sorts of questions about transverse 
momentum. Why are the transverse momenta limited, 
and how are they correlated? The wavefunction must 
contain this sort of information. 

--Theorists should try to understand how the cross 
section at high energy really behaves, instead of 
just saying it goes to a constant because it looks 
constant, or goes as a logarithm because it looks 
that way. 

--We need to understand the mechanisms which produce
large transverse momentum. The work by Kislinger 
reported at this conference, and work of Bjorken, are 
suggestions in this direction. If we understood the 
mechanism, then the large transverse momentum data 
would give us more information about the parton 
distributions. 

--How can we connect the distribution functions found 
in the deep-inelastic scattering experiments with the 
data on the products in high-energy hadronic
collisions? We have data on the percentages of K's and
$\pi$'s produced, and so on, but as yet no theory to
connect this data with what we already know about
the wavefunction, or to enable us to use the data to
learn more about the wavefunction.

--Finally, an interesting exercise for theorists. I
have frequently mentioned bremsstrahlung in these
lectures; my analogies were with the electrical case 
in which the charge that is accelerated radiates 
only particles that have no charge. The behavior of 
bremsstrahlung in a Yang-Mills theory, where the ra- 
diated particles carry the charges whose acceleration 
gives rise to the radiation, is an interesting 
technical question. The paradoxical feature is that 
if we consider a charged object being accelerated, 
and its radiation carries off charge, then the charge 
itself may not have been accelerated, in which case
there would have been no radiation...? 

Now I should like to go on to another subject. 
We have emphasized that experiments may very shortly 
tell us whether partons are quarks. Let us now con- 
sider the set of questions which arise if it should 
turn out that these experiments do support the 
hypothesis that partons are quarks. 
--What kinds of gluons are there? We have good rea- 
son to believe there are neutral constituents, and 
it is reasonable to associate these with the parti- 
cles that mediate quark-quark interactions. But 
what are their spins--are they vector particles, for 
instance--and other quantum numbers? How many kinds 
of gluons are there? And so forth. 
--How can we relate the quark-parton picture to the 
low-energy quark model? The study of low-energy
resonances gave us a picture of baryons made of three
quarks, which fits the evidence semi-quantitatively 
very well. Now we have a wavefunction in a rapidly 
moving coordinate system, and we seem to have the 
original quarks along with additional quark-anti- 
quark pairs. We can't connect the pictures very well 
because we have to know the Hamiltonian to transform 
from one to the other. 

--Why do quarks, which have spin 1/2, seem to obey 
Bose statistics? Why are hadrons states of zero 
triality? These questions could be connected if 
there are really three colors of quarks, and the 
physical baryons are color singlets, but we need to 
fill out such a picture with some understanding of 
the saturation of the quark-quark forces. 

--How can we explain the absence of free quarks? In 
particular, what is the mechanism by which the out- 
going quarks in a deep-inelastic collision "decay" 
into hadrons in such a way that quark quantum numbers 
don't get out? 

You won't be surprised to hear that I don't know 
the answers to these questions. However, I've con- 
sidered a number of them, and I'd like to outline 
some of the ideas. They are not a solution to
anything--they just represent the struggle to
understand which problems are more serious and what we
might be able to do about them.

The first consideration I want to describe re- 
lates to the harmonic forces often invoked for the 
low-energy quark model. Suppose we take the idea 
more seriously--what are the advantages, and what 
problems arise? One advantage would be that the 
quarks could not get out and we wouldn't see free 
quarks. This assumption works fine in the
non-relativistic model, of course, and also poses no
difficulty for the arguments we have been making in the
parton model. This is because the harmonic potential 
is soft for short distances; so when we calculate the 
probability of a parton going off in some direction 
as the result of a collision, we can neglect the 
binding to the other partons. Only when it gets far 
away, long after the collision has taken place, does 
it know that it is going to be pulled back. As long
as the level spacing of the harmonic oscillator is
small compared to the $q^2$ and $\nu$ of the collision, all
our parton calculations go through. There will be
differences in the predictions of the products of a 
collision if there are harmonic forces, however, 
since in this case there are strong couplings of the 
outgoing parton to the others. 

Presumably, we have to have some sort of saturation
of the harmonic forces so that two objects, each
made of three quarks, do not have a strong long-range
interaction. This is analogous to the case of two
electrically neutral atoms, but the analogy points 
up a problem. Even if the forces are originally 
zero because of saturation, a force could arise if 
the objects were polarized in some sense, and induced 
polarizations could be expected to occur, giving rise 
to a long-range force analogous to the van der Waals 
force between electrically neutral molecules. The 
rapid rise with distance of a truly harmonic force 
could make this a large effect, and this is a major 
disadvantage to be considered in model-building with 
such forces. 

There is another consequence that leads to a
new idea. Suppose we imagine a process like

$$e^{+} e^{-} \to Q \bar{Q}$$

and consider the outgoing quark and anti-quark
moving farther and farther apart. As they do this, 
the harmonic force between them implies that a 
tremendous field will develop between them; and when 
it gets large enough, quarks and anti-quarks will be 
produced in pairs by the field all over the place. 
These will then settle down into their stable saturated
systems--which are the hadrons--producing
hadron patterns like those we see. In this view, we
see that the harmonic force itself doesn't look 
harmonic at large distances. 

I do not consider the harmonic force as a par- 
ticularly good idea. But I do think this idea that 
it has led us to may be of real utility; namely, when 
a single quark is moving alone away from others (say, 
backwards in the deep-inelastic electron scattering), 
it produces large numbers of pairs of quarks roughly 
evenly distributed in rapidity (note I have artfully 
changed the original space distribution for a har- 
monic force, to one in rapidity, guessing that this 
is the proper relativistic representation) and that 
these manifold partons then assemble themselves into
hadrons of zero triality. If this is true, many of 
the qualitative expectations for the distributions 
of hadrons in deep-inelastic collisions are valid 
even for partons which do not carry hadron quantum 
numbers. (See note at end of this lecture.) 

Whether the forces are harmonic or not, we still 
have to understand saturation and the question of 
triality. I have done some thinking about this in 
conversations with a student, Ken Kauffmann, and I 
am going to outline here the ideas we have discussed. 
I have since learned that such ideas are quite old. 
For example, they are discussed by Y. Nambu (Preludes
in Theoretical Physics, edited by A. de-Shalit,
H. Feshbach, and L. Van Hove, North Holland, 1966,
p. 133). A detailed discussion has been recently
given by Lipkin ("Triality, Exotics, and the Dynamical
Basis of the Quark Model," by H. J. Lipkin, Weizmann
Institute preprint WIS 73/13 Ph).

In the case of electricity we get saturation-- 
most matter is neutral. This is easy to do with only 
two kinds of charge, but in the case at hand we need 
saturation at three. We eventually saw that to get 
this, we had to have something analogous to SU3. So 
we suppose the quark carries a quantum number of the 
SU3 type, but distinct from its usual SU3 quantum
number which distinguishes the u, d, and s from one 
another. The new quantum number takes on three 
values which we call colors, and will call here 
simply a, b, and c. In this model we have nine 
quarks, since we have up quarks of type a, type b, 
and type c, and similarly for down and strange quarks.
We assume the quarks interact by a coupling of type
$\lambda_a\cdot\lambda_b$, analogous to the $\sigma_a\cdot\sigma_b$ coupling of spins,
where the $\lambda$'s are the eight color SU3 matrices, and
the dot product indicates a sum over all eight values
of the SU3 index. We can evaluate the couplings by
interpreting the interaction in terms of the exchange
of an octet of objects formed in the usual way.
That is, if we use the notation of the ordinary meson
octet, we can exchange objects like

$$\eta = (1/\sqrt{6})(2C\bar{C} - A\bar{A} - B\bar{B})$$
$$\pi^0 = (1/\sqrt{2})(A\bar{A} - B\bar{B})$$


and so on. Consider the interaction between two A's
(two quarks carrying color quantum number a--we
ignore all other quantum numbers), $\langle AA|\lambda_a\cdot\lambda_b|AA\rangle$.
We represent this by a diagram

[Diagram: two quark lines, each entering and leaving with color A, joined by an exchanged $\eta$ or $\pi^0$.]

The exchanged object can only be $\eta$ or $\pi^0$; the others
don't couple to an AA vertex. The $\pi^0$ couples to
each vertex with a strength $1/\sqrt{2}$, contributing 1/2
overall. The $\eta$ gives $-1/\sqrt{6}$ at each vertex, or +1/6
overall. So the net coupling is 1/2 + 1/6 = 2/3.


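Similarly, the diagrams

[Diagrams: (i) quark lines A and B keeping their colors, exchanging a $\pi^0$ or $\eta$; (ii) quark lines A and B interchanging their colors.]

give -1/2 + 1/6 = -1/3, and 1, respectively. The
coupling rules can be easily summarized by noting
that

$$\lambda_a\cdot\lambda_b = -(1/3) + P_{ab} ,$$

where $P_{ab}$ is the color exchange operator. (This is
the analogue of Dirac's famous expression for $\sigma_a\cdot\sigma_b$
in terms of the spin exchange operator.) For example,

$$\lambda_a\cdot\lambda_b\,|AA\rangle = -(1/3)|AA\rangle + |AA\rangle ,$$

so

$$\langle AA|\lambda_a\cdot\lambda_b|AA\rangle = 2/3 .$$

Also,

$$\lambda_a\cdot\lambda_b\,|AB\rangle = -(1/3)|AB\rangle + |BA\rangle ,$$

so

$$\langle AB|\lambda_a\cdot\lambda_b|AB\rangle = -1/3$$

and

$$\langle BA|\lambda_a\cdot\lambda_b|AB\rangle = 1$$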


as we calculated above. This makes the calculations 
very easy. For any system, add one for every pair 
symmetric under interchange, -1 for every anti- 
symmetric pair, and include -1/3 for every pair. 
When dealing with anti-quarks, we have to reverse the
sign of the couplings, so that the $A\bar{A}$ system couples
with -2/3 and the $B\bar{A}$ with +1/3. The $A\bar{A}$ system has
amplitude -1 to turn into $B\bar{B}$.
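The two ways of getting the quark-quark couplings (adding up the exchanged octet members, or using the exchange-operator formula) can be compared with a short script. This is only a sketch: the names `PI0`, `ETA`, `exchange_coupling` and `operator_coupling` are made up for the illustration, and the color-changing octet members are assumed to contribute with strength 1, as in the interchange diagram above.

```python
import itertools
import math

COLORS = "ABC"

# Strength with which each diagonal octet member couples to a quark of a
# given color, read off from the eta and pi0 combinations in the text.
PI0 = {"A": 1 / math.sqrt(2), "B": -1 / math.sqrt(2), "C": 0.0}
ETA = {"A": -1 / math.sqrt(6), "B": -1 / math.sqrt(6), "C": 2 / math.sqrt(6)}


def exchange_coupling(initial, final):
    """<final| lambda_a . lambda_b |initial> built up from octet exchange."""
    (i1, i2), (f1, f2) = initial, final
    total = 0.0
    if (f1, f2) == (i1, i2):
        # Colors unchanged: only the two diagonal octet members contribute.
        total += PI0[i1] * PI0[i2] + ETA[i1] * ETA[i2]
    if i1 != i2 and (f1, f2) == (i2, i1):
        # Colors interchanged: one color-changing member (assumed strength 1).
        total += 1.0
    return total


def operator_coupling(initial, final):
    """<final| -(1/3) + P_ab |initial>, with P_ab the color exchange operator."""
    identity = 1.0 if final == initial else 0.0
    swap = 1.0 if final == initial[::-1] else 0.0
    return -identity / 3 + swap


# All nine two-quark color states: AA, AB, ..., CC.
STATES = ["".join(p) for p in itertools.product(COLORS, repeat=2)]

for bra, ket in [("AA", "AA"), ("AB", "AB"), ("BA", "AB")]:
    print(bra, ket, round(exchange_coupling(ket, bra), 4))
```

The three printed cases are the ones worked out by hand above; looping over `STATES` compares the two functions on every pair of states.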

Now suppose we look at the coupling for a
quark-anti-quark system (meson) which is a color
singlet. This will have equal amplitudes to be $A\bar{A}$,
$B\bar{B}$, or $C\bar{C}$, giving a total coupling of (1/3)[3(-2/3) +
6(-1)] = -8/3. We will associate the minus sign with
binding, and a positive sign with repulsive forces.
Color octet states like $B\bar{A}$ have coupling +1/3 and
are not bound. Thus we get the desired result that, 
for mesons at least, the bound states are color 
singlets. 
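The meson arithmetic can be checked with exact fractions. The sketch below restricts the coupling to the three matched-color quark-anti-quark states, using the diagonal value -2/3 and the transition amplitude -1 quoted above; the function name `expectation` and the variable names are illustrative only.

```python
from fractions import Fraction

# Basis: (A Abar, B Bbar, C Cbar).
DIAGONAL = Fraction(-2, 3)   # e.g. coupling of A Abar to itself
TRANSITION = Fraction(-1)    # e.g. amplitude for A Abar to turn into B Bbar

# Coupling matrix restricted to the three matched-color states.
M = [[DIAGONAL if i == j else TRANSITION for j in range(3)] for i in range(3)]


def expectation(vector):
    """<v|M|v> / <v|v> for a real, unnormalised amplitude vector."""
    norm = sum(Fraction(x) * x for x in vector)
    value = sum(Fraction(vector[i]) * M[i][j] * vector[j]
                for i in range(3) for j in range(3))
    return value / norm


singlet = expectation([1, 1, 1])       # equal amplitudes: the color singlet
octet_like = expectation([1, -1, 0])   # orthogonal to the singlet: an octet member

print(singlet)      # -8/3
print(octet_like)   # 1/3
```

The combination orthogonal to the singlet comes out at +1/3, the same value the text gives for an octet state such as B with anti-A.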

Let's look at the states involving only quarks. 
We can classify these into color SU3 multiplets, and
use Young diagrams to represent the symmetries. In 
these diagrams, pairs in the same row are anti- 
symmetric under interchange, and those in the same 
column are symmetric. Consider first the two-quark 
system. The quarks are a color triplet, and $3 \times 3 = \bar{3} + 6$,
so the multiplets we can get are represented
by

[Young diagrams: a row of two boxes for the $\bar{3}$; a column of two boxes for the 6.]

For the $\bar{3}$, we have 1 anti-symmetric pair, which
contributes -1, so the net coupling is -1-(1/3) = -4/3.
This is bound. The 6, however, has a symmetric pair,
so the coupling is 1-(1/3) = 2/3 and it is not bound.

The three-quark states can be color singlets,
octets, or decimets, with Young diagrams:

[Young diagrams: a row of three boxes (singlet); a row of two boxes above a single box (octet); a column of three boxes (decimet).]

Each of these has three ways to choose a pair, so we
get a constant contribution of -1 to the coupling. 
The singlet has all three pairs anti-symmetric, 
which adds -3 to give a net coupling of -4. This is 
strongly bound, and we see the two-quark $\bar{3}$ state
is not saturated. It will tend to pick up a third
quark and form a color singlet. The octet state,
with one symmetric and one anti-symmetric pair, has
coupling -1, so the forces are attractive. But it is
unstable against breakup into the two-quark $\bar{3}$ state
with binding -4/3, plus a single quark, so it is not
bound. The decimet has coupling +3-1 = +2 and is 
also not bound. 
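The pair-counting rule is mechanical enough to put in a few lines of code. The sketch below takes a Young diagram as a list of row lengths, in the convention used here (pairs in a row anti-symmetric, pairs in a column symmetric), and returns the net coupling as an exact fraction. The function name `coupling` and the list-of-rows representation are choices made for this illustration.

```python
from fractions import Fraction
from math import comb


def coupling(rows):
    """Net coupling for a color Young diagram given as a list of row lengths.

    Rule: +1 for each symmetric pair (same column), -1 for each
    anti-symmetric pair (same row), and -1/3 for every pair of quarks.
    """
    n_quarks = sum(rows)
    # Height of each column: number of rows long enough to reach it.
    columns = [sum(1 for r in rows if r > c) for c in range(max(rows))]
    antisymmetric = sum(comb(r, 2) for r in rows)
    symmetric = sum(comb(h, 2) for h in columns)
    return symmetric - antisymmetric - Fraction(comb(n_quarks, 2), 3)


print(coupling([2]))        # two-quark 3-bar
print(coupling([1, 1]))     # two-quark 6
print(coupling([3]))        # three-quark singlet
print(coupling([2, 1]))     # three-quark octet
print(coupling([1, 1, 1]))  # three-quark decimet
```

The same function accepts larger diagrams, so states with more than three quarks can be explored in the same way.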

An SU3 Young diagram has rows of maximum length
three. Clearly, the maximum binding is attained when 
we make the rows as long as possible (to get as many 
anti-symmetric pairs as possible) and the columns as 
short as possible (to get as few symmetric pairs as 
possible). For example, the four-quark state with
the maximum binding has Young diagram

[Young diagram: a row of three boxes with one box below its first column.]

All four-quark states have 6 pairs, contributing -2
to the coupling, and this case has 3 anti-symmetric
pairs and 1 symmetric pair. The overall coupling,
-2-3+1 = -4, indicates this state is neutral with
respect to

[Young diagrams: a row of three boxes and a separate single box.]

Similarly, the maximally bound 5-quark state is

[Young diagram: a row of three boxes above a row of two boxes.]

There are 10 pairs, of which 3+1 = 4 are anti-
symmetric, and 1+1 = 2 are symmetric, so the overall
coupling is -(10/3)-4+2 = -16/3. This is neutral
with respect to

[Young diagrams: a row of three boxes and a separate row of two boxes.]


The state with 3k quarks has 3k(3k-1)/2 pairs.
The system with the greatest binding will have a
Young diagram with k rows of length three, and so
will have 3k anti-symmetric pairs. Each column has
k(k-1)/2 pairs, so we have 3k(k-1)/2 symmetric pairs.
The net coupling is

$$-k(3k-1)/2 - 3k + 3k(k-1)/2 = -4k.$$

But this system is neutral with respect to k three-
quark color singlet states (each a row of three boxes),
each with coupling -4. We see that any time the
anti-symmetric trio (a row of three boxes) can form it
becomes neutral in its interaction
with other quarks and falls away as a baryon. If you
wish all these bound states of zero triality to be 
near the same energy (say zero) compared to all 
others of very large positive energy, just add an 
energy of +4/3 for each quark and each anti-quark. 
So our scheme provides a saturation with an anti- 
symmetric three-quark system (which also solves the 
Bose quark problem) for baryons, and a color singlet 
state for mesons, with all other states unsaturated. 
We shall have to assume a large long-range force to 
ensure that this saturation is dynamically maintained. 
If we represent these forces in the conventional 
way by the exchange of vector gluons (which gives 
the right signs for particles and anti-particles), 
we have a theory with 17 particles, so it looks as 
if we are off again into a world of many particles, 
and the next generation will be working on the con- 
stituents of these! If we were any good at field 
theory, we could work out the consequences of the 
model and see if it is right or wrong. It is very 
amusing that we might have the correct theory and not 
know that we've got it--which is a funny condition to 
be in. However, I would expect that such an ordinary 
field theory will not give the long-range character 
to the interaction that we have had to postulate as 
essential to make this model keep the quarks from 


coming apart. Nevertheless, these leads, suggested
by a number of physicists, should be pursued more
thoroughly. 

Note regarding products in deep-inelastic 
scattering. 

I had originally meant to discuss the product 
hadrons expected from deep-inelastic lepton-proton 
scattering. However, there was not enough time, 
and my views are all published in Photon Hadron 
Interactions (W. A. Benjamin Co., 1972, pp. 250 ff), 
with, however, one important change. On pages 260 
and 261, it is argued that in a situation with a 
single quark moving to the right, say, the
right-moving hadron quantum numbers (such as charge, for
example) would, on the average, be those of the quark. 
That this was not necessarily true was pointed out 
by Farrar and Rosner (Phys. Rev. D7, 2747 (1973)), 
and I would like to take this opportunity to state
publicly that I believe they are correct and that 
observation of the total quantum numbers of the 
hadrons moving to the right need not directly give 
those of the quark which generated them. What is 
involved may be most easily seen by considering 
the special case $e^+e^-$ producing a pair of quarks,
say a quark $Q$ to the right and an anti-quark $\bar{Q}$ to
the left in a rapidity plot (see figure). Pairs
of quarks are formed and gathered into hadrons
according to our ideas. The triality (net quark
number modulo 3) of the quark to the right is +1,
whereas no matter how many hadrons are formed to the
right, the left over triality is still +1.

[Figure: rapidity plot with the anti-quark $\bar{Q}$ at the left and the quark $Q$ at the right.]


It is true that the plateau is neutral in the sense,
for example, that the mean numbers of $K^+$ and $K^-$ are
equal (so that charge may be conserved as the plateau
of length log s widens). Nevertheless, this plateau
may be polarized in the sense that there is a
correlation, say, for example, that whenever a $K^-$ is
found it is more likely that a nearby $K^+$ is to the
left of the $K^-$ than it is to the right. That such a
correlation with a sign is possible is because the
plateau has a sign; it is carrying +1 triality from
right to left. That is, to put it invariantly, of
a pair $K^+K^-$ the $K^-$ is more likely to be found near
the quark end. For an anti-quark to the right this
would be reversed, of course, with $K^+$ to the right of
$K^-$. Thus for our case, anti-quark to the left, on
the left side of the plateau we have $K^+$ to the left
of $K^-$. But this is just what we found for the right
side of the plateau. Thus the two parts, left and
right, of the plateau fit together perfectly and the
polarization continues uniformly across the entire 
plateau. This polarization, of course, carries quan- 
tum numbers (in our example, strangeness) so that the 
total quantum number of the right movers with rapid- 
ity above a certain point (say 0 in the center-of- 
mass) need not equal that of the quark originally 
moving in that direction. Isospin symmetry permits 
the original rule to work for isospin, but it need 
not work in general for strangeness, and, hence, 
charge (unless exact SU3 were assumed, an unlikely
hypothesis for these products). These considerations 
do not affect the expectations for hadron-hadron 
collisions, for there we have no necessary quantum 
number passed by the plateau and it is unpolarized
as well as neutral.

Note added in proof. 

Having only half a lecture on the theoretical 
questions, I see I didn't bring my discussion to a 
focus - so I would like to add these supplementary 
remarks. If we bring all these ideas together, we can 
make a “try out for size" model which seems quali- 
tatively, at least, capable of controlling all the 
paradoxical aspects of the quark model. The choice 
of character, u, d, s, is one that any quark can
have, but in addition to that we have a new perfect
SU3 of color. Quarks are spin 1/2 fermions of three
colors with perfect symmetry. The interaction is via 
eight vector meson gluons coupled symmetrically as in 
the theory of Yang and Mills, to eight currents
corresponding to group generators, so that the SU3 color
character is not broken in any way. A ninth gluon 
coupled to “color singlet" is explicitly assumed not 
to be present (in this trial model). But, unlike the 
conventional Yang-Mills or other field theory, sup- 
pose the force is very long range, a potential energy 
rising without limit with distance. 

How this long range can come about we do not 
know, and this is the most serious question in our 
model - but for definiteness, take the suggestion
(made to me by Kenneth Kauffmann) that the propagator
of the fields is $1/k^4$ instead of $1/k^2$ (and
disregard “ghost" problems for now). The potential 
from a fixed point-charge is then r instead of 1/r. 
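One way to see why a $1/k^4$ propagator goes with a linearly rising potential is the following short calculation. It is offered only as a sketch: it ignores the infrared regularisation that the Fourier transform of $1/k^4$ strictly needs, and the overall sign and normalisation depend on conventions the text does not fix.

```latex
\nabla^2 r = \frac{1}{r^2}\frac{d}{dr}\!\left(r^2\,\frac{dr}{dr}\right) = \frac{2}{r},
\qquad
\nabla^2 \frac{1}{r} = -4\pi\,\delta^3(\mathbf{r})
\quad\Longrightarrow\quad
\nabla^2\nabla^2\, r = -8\pi\,\delta^3(\mathbf{r}).
```

Since $\nabla^2$ becomes $-k^2$ in momentum space, $\nabla^2\nabla^2$ becomes $k^4$, so $r$ plays the same role for a $1/k^4$ propagator that $1/r$ plays for $1/k^2$.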

This force between colored objects, with poten- 
tial ever rising with distance, means that such ob- 
jects can never come apart. The only objects which 
can come apart are neutral with respect to all eight 
currents and are therefore color singlets. These 
are states with the quantum numbers of three quarks 
of colors, a, b, c, in an anti-symmetric state in 
baryons, or quark-antiquark in $(1/\sqrt{3})(a\bar{a} + b\bar{b} + c\bar{c})$
in mesons. The gluons themselves also cannot come
far out of a system, of course, because they
themselves are colored. (Murray Gell-Mann has pointed
out to me that this probably removes my worry, 
expressed on page B2, Lecture 5, that there might
be large polarization forces - the effects are second
order in the interaction, an exchange of two gluons 
which interact may be the exchange of a massive 
object, and hence of limited range.) 

Sometimes one attempts, energetically, to separate
color as in $e^+e^- \to Q\bar{Q}$, perhaps an $a$ to the right
and an $\bar{a}$ to the left, or in the deep-inelastic scattering
in which one quark is taken from the others and 
sent violently backward (I am investigating whether 
high-energy hadron collisions may also involve this 
effect). In such cases, as the colored parts sepa- 
rate, a force develops between them as a result of 
gluon exchange, and in this force field pairs of new 
quarks are created until ultimately they are gathered 
into singlets which can separate indefinitely, pro- 
ducing the characteristic plateau in rapidity, etc., 
that we expect to get. 

The fact that the non-relativistic quark model 
works roughly, and the confirming fact that there is 
little momentum in quark pairs in the parton
distribution of a proton, indicates that forces are soft
at short distances. It is not obvious that this
comes out automatically from our model and something
else may have to be added. It is amusing, however,
that Kauffmann's propagator $1/k^4$ is weak at large $k^2$
(i.e., the potential $r$ is not as singular as $1/r$ for
small $r$) and might account for the effect.

I should like to recommend a very interesting 
paper by Casher, Kogut and Susskind (Phys. Rev. Let- 
ters 31, 792 (1973)) that I have just seen. They do 
a problem of fermions in quantum electrodynamics, 
but in one space dimension. In one dimension the 
potential from a charge is $|x|$ and automatically
rises with distance. It is confirmed that the phe- 
nomena of the type we expect occur. One should read 
this paper for a very good detailed physical descrip- 
tion of this, following ideas of Bjorken. It is very 
good to have a mathematical model of these ideas, to 
make them clear and definite so one can advance to 
further ideas. By choosing the mass of the fermions 
to be zero, they solve the problems exactly (after 
Schwinger). The case that the fermion mass is not 
zero cannot be solved exactly, but we expect, of 
course, the same type of phenomena of production of 
fermion pairs, since the potential rises with dis- 
tance. A study of such cases should be pursued to 
teach oneself the physics of these phenomena. A 
next step, after the case of non-zero fermion mass, 
would be an SU3 system, Yang-Mills, in one dimension
where baryons (three-quark states), and mesons would
form. Next, attempt to jump to three-space dimensions
with some guess as to how the long-range force
works. You may not end up with exactly the right
theory, but you have a definite program to develop
from which you must learn a very great deal about the
real world.

1. Introduction


In 1954 Yang and Mills [1] wrote a paper in which they made a theory in 
analogy with QED for a system in which a particle could carry more than one 
“charge”. In those days there were particles like protons and neutrons which 
were thought to be one object — a nucleon which appears in two guises — pro- 
ton and neutron. This is the theory of isospin in which the analogue of charge 
is $I_z$, the z-component of isospin; $+\frac{1}{2}$ for proton, $-\frac{1}{2}$ for neutron. Then there
was the question of what field, analogous to the photon in QED, could inter- 
act with such a charge. In the case of QED the photon is a vector field; there- 
fore Yang and Mills tried to make a theory of a vector field interacting with 
“charges” that might have more than one value. There had already existed for 
some time a theory of (pseudo) scalar fields which could interact with particles 
of different “charge” but their properties weren’t as interesting as those of
vector fields. Since the “charge” can be flipped back and forth, the fields which
are coupled to the “charges” are more interesting than that of QED. Finally 
the theory had very great beauty and simplicity. 

We human beings see in a symmetrical theory a certain beauty; the Greeks, 
for example, saw in the theory of the planets that they went around in circles 
at a uniform speed, a phenomenon which today, we would characterise by a 
group theoretic property: the orbit is such that a displacement in time is 
equivalent to a rotation. 

Today we still have this desire to see symmetrical things and therefore the 
Y—M theory looks very good. In contrast to QED, here we have a field with 
many components which couple to different “charges”. This is because the 
field, in addition to the neutral component which couples to + and — charge, 
also has charged components which flip one “charge” into another. Therefore
in the case of isospin we have a source of isospin $\frac{1}{2}$ (the nucleon) and a field
of isospin 1. Now this theory was beautifully symmetric but it did not agree 
with experiment; although isospin is almost exactly conserved the theory is 
similar to electrodynamics in that the mass of the vector field is zero, but there 
is no obvious long range force between nucleons: so it’s wrong. The first hope
was to put a mass in somehow, but that destroyed the symmetry and consequently
the beauty. What were the vector particles anyhow? They were supposed
to be $\rho$ mesons, but there seemed to be nothing more fundamental
about the $\rho$ meson than all the other hadrons. So the idea that one of these
hadrons was the fundamental field was lost. 

Later people such as Goldstone [2] began to look at broken symmetries; 
Higgs [3,4] and Kibble [5] found that a massless vector theory with broken 
symmetry was in some sense equivalent to one with mass. Today the only 
symmetries we see in nature are isospin (a near perfect symmetry) and SU(3) 
(a clear, but imperfect symmetry) and we would, therefore, think that, be- 
cause we like symmetries so much, the excitement of the day would be that 
we had an understanding of these symmetries at last. We do not. In fact, the 
Y—M theory with broken symmetry is assumed to apply somewhere else. 

In the meantime, there was developed a weak interaction theory in which, 
in one interpretation, one had vector mesons with mass. Taken directly, a field 
theory of massive vector mesons is highly non-renormalisable. Such a theory 
works fine as long as we only work with first order diagrams. However, at- 
tempts to go to higher order lead to unremovable divergences. It is necessary 
to go to higher order for two reasons: 

(1) It is not sensible to have a theory which only works to first order. 

(2) Consider a process where the amplitude is calculated to first order in g.
The probability of the process occurring is $O(g^2)$. Therefore the probability of
non-occurrence is $\sim 1 - O(g^2)$. Hence the amplitude for non-occurrence is
$\sim \sqrt{1 - O(g^2)} \sim 1 - O(g^2)$. Therefore one must know something about
amplitudes to order $g^2$. It is, therefore, impossible to have a theory which only
works to first order if one wants to conserve probability. People did not worry
about this until recently. When they did, they found that if they started with 
one of these symmetric theories and used the Higgs mechanism to add mass 
they might be able to represent these vector mesons (intermediate vector 
bosons) with mass in a way that was renormalisable in the same sense as QED. 
This was subsequently proved in detail, and the consequences of these theo- 
ries are, therefore, as calculable as those of QED. 
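The probability argument in point (2) can be made explicit with a one-line expansion. Here $c$ is a hypothetical coefficient, introduced only for this illustration, such that the first-order amplitude gives a probability $c\,g^2$ for the process to occur:

```latex
\sqrt{1 - c\,g^{2}} \;=\; 1 - \tfrac{1}{2}\,c\,g^{2} + O(g^{4}) .
```

So the amplitude for nothing happening already carries a term of order $g^2$, fixed by the first-order result, and a theory that could only be computed to first order in $g$ could not supply it.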

Subsequently it was found by Llewellyn Smith [6] that if one starts with
massive vector mesons and requires renormalisability one is driven to YM 
theories with broken symmetries, provided one is prepared to introduce new 
particles to cancel divergences. One has a choice of such particles and a par- 
ticular choice leads to a particular model. It is important to realise that there 
is no unique prescription for doing this. One can also use different symmetry 
groups and different methods of breaking the symmetry. It is, therefore, not
true to say that these theories make an unambiguous prediction of the existence
of neutral currents, because one can always take a different theory which
has no neutral current but some other new particle(s) (e.g. “heavy” lepton). It
appears that there is now experimental evidence for both neutral currents and 
heavy leptons. 

In hadron physics the quark theory was evolved and found to be paradoxical.
Protons are supposed to be made out of three quarks as is the $\Delta^{++}$ which
has spin $\frac{3}{2}$. Consider the case where $J_z = +\frac{3}{2}$:

$$u\!\uparrow \quad u\!\uparrow \quad u\!\uparrow$$


The dynamical theory says that the quarks are in a relative S-state in order to 
get the right order of magnitude for matrix elements and magnetic moments; 
then we have three particles in the same state. There is a problem in that it had
been proved that we can’t put three spin $\frac{1}{2}$ particles in the same state. This
proof assumed that the particles could be separated from each other. We also 
know that quarks don’t seem to appear as free particles. It is, therefore, not 
clear that the proof holds for quarks. Nevertheless, to be conservative, we will 
accept the theorem, and so a simple explanation of the problem is that the
three quarks are different, i.e. we assign them a new quantum number (colour)
which takes three values A, B, C:

$$u_A\!\uparrow \quad u_B\!\uparrow \quad u_C\!\uparrow$$


There are, at present, only two places in the experimental world where the 
colour hypothesis can be checked: 

(1) A subtle test is connected with the anomaly in the $\pi^0 \to 2\gamma$ decay. It
turns out that a theory without colour gives a decay rate a factor of three too
small; a theory with three colours agrees with experiment.

(2) The ratio

$$R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)}$$

is equal to the sum of the squares of the quark charges and equals 2/3 for u, d, s
without colour. In fact, the data shown symbolically* in fig. 1.1 are in disagreement
with this value. u, d, s with colour give R = 2, in better agreement below 4 GeV.

* See the lectures by G. Wolff and B.H. Wiik for a discussion of the data.

Fig. 1.1.

Another possible experimental test of colour is lepton production in pp collisions
by the Drell-Yan mechanism but the evidence is inconclusive. The
amount of experimental evidence for the beautiful symmetry of colour is not
great but is theoretically very strong because it helps to explain a number of
other features, e.g. why three quarks in a baryon are held together.
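The two values of R quoted above follow from simple arithmetic on the quark charges. A minimal sketch, assuming the standard charges +2/3 for u and -1/3 for d and s; the function name `r_ratio` and its `n_colours` parameter are illustrative:

```python
from fractions import Fraction

# Standard electric charges of the light quarks, in units of e.
QUARK_CHARGES = {"u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3)}


def r_ratio(flavours, n_colours=1):
    """Sum of squared quark charges, counted once per colour."""
    return n_colours * sum(QUARK_CHARGES[f] ** 2 for f in flavours)


print(r_ratio("uds"))               # 2/3, without colour
print(r_ratio("uds", n_colours=3))  # 2, with three colours
```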

This theory of colour is so symmetrical that a good guess is that hadrons 
are made of quarks with flavours u, d,s, c,... and three colours A, B, C with 
exact colour symmetry. It turns out that the colour couples to an eight com- 
ponent field. The reason why there are eight components is as follows: 

When a field quantum (gluon) is emitted, the quark colour may or may
not change. There are nine ways of coupling a gluon between an initial (three
colour possibilities) and a final (three colour possibilities) quark.

A → B    A → C    B → A    B → C    C → A    C → B
A → A    B → B    C → C

But the linear combination of the components of the vector field which couples
equally to all the quarks (the singlet) need not be in the theory; all its
properties are independent of the other eight components. Under a linear
transformation of the colours, the eight mix together so all are necessary, but
the singlet stays unchanged. Hence we are left with eight components. (Whether
we add the ninth or not, and with what coupling, is up to us, but we will leave
it out as apparently unnecessary at present.) This theory with exact SU(3)
colour symmetry is called quantum chromodynamics (QCD). 

At first sight we have a problem, viz., massless vector mesons would imply 
a long range force between quarks and so we could separate them. Experimen- 
tally this does not seem to be true. One solution to the problem is that QCD 
is wrong. Another way out is to say that we do not understand the consequences
of Y-M well enough and that at large distances the forces might become
large enough to confine the quarks. That is the foremost problem of
QCD. Also there are infrared divergences in QCD which are more serious than
in QED and the method to handle them is not yet known. In spontaneously
broken Y—M theories these infrared problems are absent. 

To summarize, there are two applications of Y—M theories: with broken 
symmetry in weak interaction (e.g. Weinberg [7] — Salam [8] model) and in 
strong interactions by QCD with perfect unbroken colour symmetry. 


2. Classical Yang—Mills theory 


We are going to take a more or less elementary and direct view of Yang— 
Mills theory, rather like the authors did. We start with the example of SU(2) 
in which we have a 2 component spinor representing proton and neutron. We 
start with the Lagrangian density for the free proton and neutron fields 


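$$\mathcal{L}_0 = i\bar\psi_p \not\!\partial \psi_p + i\bar\psi_n \not\!\partial \psi_n - m_p \bar\psi_p \psi_p - m_n \bar\psi_n \psi_n , \qquad (2.1)$$
where we use the notation of Bjorken and Drell [9].

In the old days people wanted to add interactions with pseudoscalar particles.
Consider 3 pseudoscalar particles (e.g. the pion) with charges 0 ($\phi_0$),
+1 ($\phi_+$), -1 ($\phi_-$); the simplest coupling to the nucleons is

$$\mathcal{L}_1 = i\{\alpha \bar\psi_p \phi_0 \gamma_5 \psi_p + \beta \bar\psi_n \phi_0 \gamma_5 \psi_n + \gamma \bar\psi_p \phi_+ \gamma_5 \psi_n + \gamma \bar\psi_n \phi_- \gamma_5 \psi_p\} . \qquad (2.2)$$

If the nucleon-nucleon force is independent of the nucleon type (and $\gamma \neq 0$),
then

$$\alpha = -\beta = \gamma/\sqrt{2} . \qquad (2.3)$$

A neater way to write the Lagrangian in this case is to introduce a spinor
$\psi_a$ ($a = p, n$) and an isovector $\phi_i$ ($i = 1, 2, 3$) in isospin space, where

$$\phi_1 = \frac{1}{\sqrt{2}}(\phi_+ + \phi_-) , \quad \phi_2 = \frac{i}{\sqrt{2}}(\phi_+ - \phi_-) , \quad \phi_3 = \phi_0 . \qquad (2.4)$$

If $m_p = m_n$ the Lagrangian $\mathcal{L}_0 + \mathcal{L}_1$ must be invariant under a rotation of the
$\psi$ field in isospin space, i.e.

$$\psi_p \to \cos\theta\,\psi_p + \sin\theta\,\psi_n , \quad \psi_n \to -\sin\theta\,\psi_p + \cos\theta\,\psi_n , \qquad (2.5)$$

provided we also rotate the $\phi$ field at the same time. The Lagrangian becomes

$$\mathcal{L} = i\bar\psi \not\!\partial \psi - m\bar\psi\psi + 2i\alpha\,\bar\psi\,\boldsymbol{\phi}\cdot\boldsymbol{\tau}\,\gamma_5\psi + \text{K.E. terms for } \boldsymbol{\phi} , \qquad (2.6)$$
where

$$2\tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad 2\tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad 2\tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (2.7)$$

are the Pauli spin matrices.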

This is the famous pseudoscalar meson theory which was supposed to be 
the explanation of everything. When there were summer schools, professors 
explained that this was the key to the whole of strong interaction theory; they 
were going to explain scattering and everything else and it was just a matter of 
calculating the next order on a machine. But that failed so they keep on trying! 
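The question now is, can we write a similar theory for a vector particle
interacting with nucleons? It is easy to guess that a vector particle $\mathbf{A}_\mu$ could be
coupled the same way, i.e.

$$\mathcal{L} = i\bar\psi \not\!\partial \psi - m\bar\psi\psi + \bar\psi (\mathbf{A}_\mu\cdot\boldsymbol{\tau}) \gamma_\mu \psi , \qquad (2.8)$$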


where $\mathbf{A}$ has been rescaled to absorb the coupling constant g. The $\gamma_\mu$ is present
to contract with the space-time index on the $\mathbf{A}_\mu$ and the $\boldsymbol{\tau}$ is present to contract
with the isospin index on $\mathbf{A}$. We could also add derivative couplings of
the form

$$\bar\psi\,(\partial_\mu\boldsymbol{\phi}\cdot\boldsymbol{\tau})\,\gamma_5\gamma_\mu\,\psi , \qquad (2.9)$$

in the pseudoscalar case, and similarly in the vector case. However, for simplicity
we exclude these and other more complicated couplings. (There is no way
a priori to exclude these couplings; theoretically there may be problems
with renormalisability, but in the old pion theory of nuclear forces they were
all tried.)

Now the problem of coupling the vector field is very easy but it is only part
of the real problem, which is: what are the equations for the propagation of
the field? Yang and Mills treated this problem in analogy with electrodynamics.
In electrodynamics, the piece of the Lagrangian connected with the propagator
of the vector fields is

$$-\tfrac14 F_{\mu\nu}F_{\mu\nu},$$

where

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{2.10}$$




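The piece of the Lagrangian analogous to (2.8) is

$$\mathcal{L} = \bar\psi\, i\not\!\partial\,\psi - m\bar\psi\psi + \bar\psi\,\not\!\!A\,\psi. \tag{2.11}$$

Why do we have such a funny looking business? Why not have, for example,

$$(\partial_\mu A_\nu)(\partial_\mu A_\nu)\,? \tag{2.12}$$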


Electrodynamics has a property of gauge invariance; this means firstly that if
$\psi \to e^{-i\alpha}\psi$ and $\alpha$ is a constant, nothing happens to (2.11). But now suppose
that the phase of the wave function is changed by different amounts at different
space-time points, i.e. $\alpha$ is a function of $x$. This is usually called a local gauge
transformation. Now the kinetic energy term in (2.11) will change since

$$\bar\psi\, i\not\!\partial\,\psi \to \bar\psi\, i\not\!\partial\,\psi + \bar\psi\,(\not\!\partial\alpha)\,\psi. \tag{2.13}$$


One can easily make (2.11) gauge invariant by supposing that at the same time
we change

$$A_\mu \to A_\mu - \partial_\mu\alpha. \tag{2.14}$$

Now if the theory is to be gauge invariant we cannot use (2.12) in the Lagrangian
as it changes under the gauge transformation; however $F_{\mu\nu}$ is invariant so
that $F_{\mu\nu}F_{\mu\nu}$ is a possible invariant contribution to the Lagrangian.

Now we can use the same trick to try to find what kind of invariant we get
in this new theory with the multiple component field (isovector). Consider
the transformation

$$\psi \to \exp\{-i(\boldsymbol\alpha\cdot\boldsymbol\tau)\}\,\psi, \tag{2.15}$$

applied to (2.8). The transformation has to be unitary and consequently it can
be written in the form (2.15). We only consider the transformation to first
order in $\boldsymbol\alpha$, viz.

$$\psi \to (1 - i\boldsymbol\alpha\cdot\boldsymbol\tau)\,\psi. \tag{2.16}$$


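(It can be shown that if the theory is invariant under infinitesimal transformations
then it is also invariant under a finite transformation (2.15), since this
can be built up from infinitesimal transformations.) Eq. (2.8) is invariant if $\boldsymbol\alpha$
is not a function of space-time provided

$$\mathbf{A}_\mu\cdot\boldsymbol\tau \to \exp(-i\boldsymbol\alpha\cdot\boldsymbol\tau)\,(\mathbf{A}_\mu\cdot\boldsymbol\tau)\,\exp(i\boldsymbol\alpha\cdot\boldsymbol\tau),$$

$$\mathbf{A}'_\mu\cdot\boldsymbol\tau = \mathbf{A}_\mu\cdot\boldsymbol\tau - i[\boldsymbol\alpha\cdot\boldsymbol\tau,\ \mathbf{A}_\mu\cdot\boldsymbol\tau] = \mathbf{A}_\mu\cdot\boldsymbol\tau + (\boldsymbol\alpha\times\mathbf{A}_\mu)\cdot\boldsymbol\tau,$$

$$\mathbf{A}'_\mu = \mathbf{A}_\mu + (\boldsymbol\alpha\times\mathbf{A}_\mu), \tag{2.17}$$

which is an infinitesimal rotation of the vector $\mathbf{A}_\mu$ in isospace.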
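
The middle step of (2.17) rests on the commutator of the $\tau$ matrices of (2.7). As a hedged numerical illustration (a sketch, not part of the lecture), the following checks that $-i[\boldsymbol\alpha\cdot\boldsymbol\tau, \mathbf{A}\cdot\boldsymbol\tau] = (\boldsymbol\alpha\times\mathbf{A})\cdot\boldsymbol\tau$ for one arbitrary pair of isovectors. The helper names and the sample values are assumptions made only for this example.

```python
# tau_i = sigma_i / 2, as in (2.7); 2x2 matrices are nested lists of complex numbers.
TAU = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dot_tau(v):
    """Return v . tau for a three-component isovector v."""
    return [[sum(v[n] * TAU[n][i][j] for n in range(3)) for j in range(2)]
            for i in range(2)]

def cross(a, b):
    """Ordinary cross product, i.e. the SU(2) case f_ijk = epsilon_ijk."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotation_term(alpha, A):
    """-i [alpha.tau, A.tau]: the first-order change of A.tau in (2.17)."""
    x, y = dot_tau(alpha), dot_tau(A)
    xy, yx = mat_mul(x, y), mat_mul(y, x)
    return [[-1j * (xy[i][j] - yx[i][j]) for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

alpha = [0.3, -1.2, 0.7]   # arbitrary illustrative values
A_mu = [1.1, 0.4, -0.5]
lhs = rotation_term(alpha, A_mu)
rhs = dot_tau(cross(alpha, A_mu))
print(max_abs_diff(lhs, rhs))   # agreement up to rounding error
```

The same check goes through for any other choice of the two vectors, since the identity is bilinear in them.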

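So far we have considered only SU(2). In another group one should have a
similar sort of thing except that the $\tau$'s would be a different set of matrices,
with another set of commutation relations. In general

$$[\tau_i, \tau_j] = i f_{ijk}\tau_k \qquad (\text{e.g. for SU(2), } f_{ijk} = \epsilon_{ijk}). \tag{2.18}$$

We can generalise the notion of a vector to any group, $a_i = \mathbf{a}$, where $\mathbf{c} = \mathbf{a}\times\mathbf{b}$
means $c_i = f_{ijk}a_j b_k$.
Then all the equations we write are completely general; they apply not just
for SU(2). Some properties of the cross product are

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\times\mathbf{b})\cdot\mathbf{c}, \qquad \mathbf{a}\times\mathbf{a} = 0, \qquad \mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a},$$

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) + \mathbf{b}\times(\mathbf{c}\times\mathbf{a}) + \mathbf{c}\times(\mathbf{a}\times\mathbf{b}) = 0 \quad \text{(Jacobi identity)} \tag{2.19}$$

but note $\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b})$ holds only for SU(2). For any group
SU($n$), the number of components of $\mathbf{A}_\mu$ is $n^2 - 1$ because the transformation
parameter $\boldsymbol\alpha$ can have $n^2 - 1$ components (this being the dimension of the
group).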

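Now consider $\boldsymbol\alpha$ in (2.16) to be a function of space-time. The idea is that
we should be able to change the phase of the wave function arbitrarily at each
space-time point and transform the vector field such that the physics does not
depend on this choice. We must choose the transformation of $\mathbf{A}_\mu$ so that (2.8)
is invariant under the transformation (2.16). $\boldsymbol\alpha$ is called the gauge parameter.
Therefore under

$$\psi \to (1 - i\boldsymbol\alpha(x)\cdot\boldsymbol\tau)\,\psi, \qquad \bar\psi\, i\gamma_\mu\partial_\mu\psi \to \bar\psi\, i\gamma_\mu\partial_\mu\psi + \bar\psi\,\gamma_\mu(\partial_\mu\boldsymbol\alpha\cdot\boldsymbol\tau)\,\psi. \tag{2.20}$$

So $\mathbf{A}$ must transform like

$$\mathbf{A}_\mu \to \mathbf{A}_\mu + \boldsymbol\alpha\times\mathbf{A}_\mu - \partial_\mu\boldsymbol\alpha. \tag{2.21}$$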


The next problem is to find a Lagrangian term for A,, which is invariant when 
A,, is changed in this manner. There are very beautiful and elegant ways of 
getting these things these days; but suppose that you were inventing it, what 
would you do to find an invariant form? You fiddle around. All the elegant 
stuff is found later; the way to learn is not to learn elegant things, it’s to fiddle 
around blind and stupid. Later you see how it works; polish it up; remove the 
scaffolding and publish the result for other students to be amazed at your in- 
genuity. 

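For the moment forget the $\boldsymbol\alpha\times\mathbf{A}_\mu$ term in (2.21), and try to find some
expression like the square of the electromagnetic field tensor, i.e.

$$(\partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu)^2.$$

This transforms under (2.21) as

$$\partial_\mu\mathbf{A}'_\nu - \partial_\nu\mathbf{A}'_\mu = \partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu + \boldsymbol\alpha\times(\partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu) + (\partial_\mu\boldsymbol\alpha)\times\mathbf{A}_\nu - (\partial_\nu\boldsymbol\alpha)\times\mathbf{A}_\mu. \tag{2.22}$$

If the last 2 terms were absent we would be O.K. because then
$\partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu$
would transform as an isovector (i.e. $\mathbf{V} \to \mathbf{V} + \boldsymbol\alpha\times\mathbf{V}$) and therefore its square
would be invariant. So we must try to get rid of the last 2 terms. Notice that
the gradient of $\boldsymbol\alpha$ is coming from the transformation of $\mathbf{A}_\mu$ (2.21) so that if we
had a term like $\mathbf{A}_\mu\times\mathbf{A}_\nu$ then when we transformed it we would pick up an
$\mathbf{A}_\mu\times\partial_\nu\boldsymbol\alpha$. Try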


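$$\mathbf{A}'_\mu\times\mathbf{A}'_\nu = \mathbf{A}_\mu\times\mathbf{A}_\nu + \mathbf{A}_\mu\times(\boldsymbol\alpha\times\mathbf{A}_\nu) - \mathbf{A}_\mu\times\partial_\nu\boldsymbol\alpha + (\boldsymbol\alpha\times\mathbf{A}_\mu)\times\mathbf{A}_\nu - (\partial_\mu\boldsymbol\alpha)\times\mathbf{A}_\nu.$$

But

$$(\boldsymbol\alpha\times\mathbf{A}_\mu)\times\mathbf{A}_\nu = \mathbf{A}_\nu\times(\mathbf{A}_\mu\times\boldsymbol\alpha).$$

Add and subtract $\boldsymbol\alpha\times(\mathbf{A}_\mu\times\mathbf{A}_\nu)$ and use the Jacobi identity (eq. (2.19)) to get

$$\mathbf{A}'_\mu\times\mathbf{A}'_\nu = \mathbf{A}_\mu\times\mathbf{A}_\nu - (\partial_\mu\boldsymbol\alpha)\times\mathbf{A}_\nu - \mathbf{A}_\mu\times(\partial_\nu\boldsymbol\alpha) + \boldsymbol\alpha\times(\mathbf{A}_\mu\times\mathbf{A}_\nu). \tag{2.23}$$

We can now get rid of the debris between (2.22) and (2.23) by defining

$$\mathbf{E}_{\mu\nu} = \partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu + \mathbf{A}_\mu\times\mathbf{A}_\nu,$$

and so

$$\mathbf{E}_{\mu\nu} \to \mathbf{E}_{\mu\nu} + \boldsymbol\alpha\times\mathbf{E}_{\mu\nu}. \tag{2.24}$$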
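
The cancellation of the debris can also be checked numerically at a single space-time point. The sketch below is an illustration under stated assumptions, not the lecture's own method: it treats the fields and their derivatives at that point as independent three-vectors with arbitrary values, applies the first-order transformation (2.21) to each, and compares the resulting change of $\mathbf{E}_{\mu\nu}$ with $\boldsymbol\alpha\times\mathbf{E}_{\mu\nu}$. All variable names are hypothetical.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def add(*vectors):
    return [sum(components) for components in zip(*vectors)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# Arbitrary illustrative values at one space-time point.
A_m, A_n = [0.2, -0.7, 1.3], [0.9, 0.4, -0.6]        # A_mu, A_nu
dmA_n, dnA_m = [0.5, 0.1, -0.3], [-0.8, 0.6, 0.2]    # d_mu A_nu, d_nu A_mu
alpha = [0.4, 0.3, -1.1]                             # gauge parameter
dm_alpha, dn_alpha = [0.7, -0.2, 0.5], [0.1, 0.9, -0.4]
dmn_alpha = [0.3, 0.3, 0.3]                          # d_mu d_nu alpha (symmetric)

# First-order changes implied by (2.21): delta A = alpha x A - d(alpha).
delta_A_m = sub(cross(alpha, A_m), dm_alpha)
delta_A_n = sub(cross(alpha, A_n), dn_alpha)
delta_dmA_n = sub(add(cross(dm_alpha, A_n), cross(alpha, dmA_n)), dmn_alpha)
delta_dnA_m = sub(add(cross(dn_alpha, A_m), cross(alpha, dnA_m)), dmn_alpha)

E = add(sub(dmA_n, dnA_m), cross(A_m, A_n))
delta_E = add(sub(delta_dmA_n, delta_dnA_m),
              cross(delta_A_m, A_n), cross(A_m, delta_A_n))
expected = cross(alpha, E)                           # right-hand side of (2.24)

mismatch = max(abs(x - y) for x, y in zip(delta_E, expected))
print(mismatch)   # rounding error only
```

Dropping the `cross(A_m, A_n)` piece from `E` and `delta_E` leaves the two gradient terms of (2.22) uncancelled, which is the point of the construction.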


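Hence we can make an invariant quantity $\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu}$. Therefore we may write
the Lagrangian density (in analogy with QED) as

$$\mathcal{L} = -\frac{1}{4g^2}\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + i\bar\psi\,\not\!\partial\,\psi - m\bar\psi\psi + \bar\psi\,(\mathbf{A}_\mu\cdot\boldsymbol\tau)\,\gamma_\mu\,\psi. \tag{2.25}$$

Note if we rescale $\mathbf{A}_\mu \to g\mathbf{A}_\mu$, $\mathcal{L}$ becomes

$$\mathcal{L} = -\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + i\bar\psi\,\not\!\partial\,\psi - m\bar\psi\psi + g\,\bar\psi\,(\mathbf{A}_\mu\cdot\boldsymbol\tau)\,\gamma_\mu\,\psi,$$

where $\mathbf{E}_{\mu\nu}$ is now

$$\mathbf{E}_{\mu\nu} = \partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu + g\,\mathbf{A}_\mu\times\mathbf{A}_\nu.$$

This is the form given by Abers and Lee [10].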

You may ask why use $\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu}$ in the Lagrangian instead of another invariant
piece? For example $\tilde{\mathbf{E}}_{\mu\nu}$, called the dual of $\mathbf{E}_{\mu\nu}$, defined by
$\tilde{\mathbf{E}}_{\mu\nu} = \epsilon_{\mu\nu\rho\sigma}\mathbf{E}_{\rho\sigma}$, is such that the quantities $\mathbf{E}\cdot\tilde{\mathbf{E}}$ and $\tilde{\mathbf{E}}\cdot\tilde{\mathbf{E}}$ are also invariant. The
former is a pseudoscalar and one could consider it as a possible additional
term in $\mathcal{L}$. It can be shown that this term has no consequences for the equations
of motion provided that when $\mathbf{A}$ is varied to obtain them one assumes
that as usual there is no variation of $\mathbf{A}$ at $\infty$. Possible consequences of a violation
of this condition will be discussed later.


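Now consider the action $S$ derived from the Lagrangian (2.25)

$$S = \int\left\{-\frac{1}{4g^2}\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + \bar\psi\,(i\not\!\partial + \not\!\!\mathbf{A}\cdot\boldsymbol\tau)\,\psi - \bar\psi m\psi\right\} d^4x. \tag{2.26}$$

We find the equations of motion by varying the action. Varying with respect
to $\bar\psi$ gives

$$(i\not\!\partial + \not\!\!\mathbf{A}\cdot\boldsymbol\tau)\,\psi = m\psi, \tag{2.27}$$

and a similar equation for $\bar\psi$ is obtained by varying with respect to $\psi$. Varying
with respect to $\mathbf{A}$ gives

$$\frac{1}{g^2}\left(\partial_\mu\mathbf{E}_{\mu\nu} + \mathbf{A}_\mu\times\mathbf{E}_{\mu\nu}\right) = \mathbf{J}_\nu, \tag{2.28}$$

where

$$\mathbf{J}_\nu = \bar\psi\,\gamma_\nu\,\boldsymbol\tau\,\psi. \tag{2.29}$$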
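Any vector will transform as

$$\boldsymbol\phi \to \boldsymbol\phi + \boldsymbol\alpha\times\boldsymbol\phi. \tag{2.30}$$

The derivative of a vector transforms in a more complicated way:

$$\partial_\mu\boldsymbol\phi \to \partial_\mu\boldsymbol\phi + \boldsymbol\alpha\times\partial_\mu\boldsymbol\phi + \partial_\mu\boldsymbol\alpha\times\boldsymbol\phi. \tag{2.31}$$

Therefore if $\boldsymbol\phi$ is a vector, its derivative is not. However we notice that

$$\partial_\mu\boldsymbol\phi + \mathbf{A}_\mu\times\boldsymbol\phi \tag{2.32}$$

transforms as a vector. Hence we define a covariant derivative

$$D_\mu = (\partial_\mu + \mathbf{A}_\mu\times) \tag{2.33}$$

which when operating on a vector produces a vector. We can now re-write
(2.28) as

$$\frac{1}{g^2}\,D_\mu\mathbf{E}_{\mu\nu} = \mathbf{J}_\nu. \tag{2.34}$$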
& 
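Consider the action of the commutator $[D_\mu, D_\nu]$ on $\boldsymbol\phi$. We easily see that

$$D_\mu D_\nu\boldsymbol\phi - D_\nu D_\mu\boldsymbol\phi = \mathbf{E}_{\mu\nu}\times\boldsymbol\phi. \tag{2.35}$$

This is the first time we have got this combination of $\mathbf{A}$'s (viz. $\mathbf{E}_{\mu\nu}$) out in a
logical way.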
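
For readers who want the intermediate steps behind (2.35), a short worked derivation follows. It is a sketch that uses only the definition (2.33) and the Jacobi identity (2.19); the grouping of terms is one possible route and is not taken from the lecture.

```latex
\begin{align*}
D_\mu D_\nu \boldsymbol\phi
  &= \partial_\mu(\partial_\nu\boldsymbol\phi + \mathbf{A}_\nu\times\boldsymbol\phi)
   + \mathbf{A}_\mu\times(\partial_\nu\boldsymbol\phi + \mathbf{A}_\nu\times\boldsymbol\phi) \\
  &= \partial_\mu\partial_\nu\boldsymbol\phi
   + (\partial_\mu\mathbf{A}_\nu)\times\boldsymbol\phi
   + \mathbf{A}_\nu\times\partial_\mu\boldsymbol\phi
   + \mathbf{A}_\mu\times\partial_\nu\boldsymbol\phi
   + \mathbf{A}_\mu\times(\mathbf{A}_\nu\times\boldsymbol\phi). \\
\intertext{The first, third and fourth terms are symmetric in $\mu\leftrightarrow\nu$ (taken together), so}
D_\mu D_\nu \boldsymbol\phi - D_\nu D_\mu \boldsymbol\phi
  &= (\partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu)\times\boldsymbol\phi
   + \mathbf{A}_\mu\times(\mathbf{A}_\nu\times\boldsymbol\phi)
   - \mathbf{A}_\nu\times(\mathbf{A}_\mu\times\boldsymbol\phi). \\
\intertext{By the Jacobi identity,
$\mathbf{A}_\mu\times(\mathbf{A}_\nu\times\boldsymbol\phi)
 - \mathbf{A}_\nu\times(\mathbf{A}_\mu\times\boldsymbol\phi)
 = (\mathbf{A}_\mu\times\mathbf{A}_\nu)\times\boldsymbol\phi$, hence}
D_\mu D_\nu \boldsymbol\phi - D_\nu D_\mu \boldsymbol\phi
  &= (\partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu
     + \mathbf{A}_\mu\times\mathbf{A}_\nu)\times\boldsymbol\phi
   = \mathbf{E}_{\mu\nu}\times\boldsymbol\phi .
\end{align*}
```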

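Since $\mathbf{E}_{\mu\nu}$ is an isovector, we can substitute it for $\boldsymbol\phi$ in (2.35). Then we get,
using the antisymmetry of $\mathbf{E}_{\mu\nu}$,

$$D_\mu D_\nu\mathbf{E}_{\mu\nu} = 0. \tag{2.36}$$

Comparing this with (2.34) we see that for consistency we must have

$$D_\nu\mathbf{J}_\nu = 0. \tag{2.37}$$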

In other words these field equations are meaningless equations unless the current
is conserved in the sense that its covariant divergence is zero. This causes
a lot of complications when we go to the quantum theory. The reason is that
in the quantum theory, when we calculate a diagram and so forth, some particles
in the theory interact and provide a contribution to the current which
is then a source which generates a new field propagating to the next interaction.
We figure out how the vector fields propagate by solving the differential
equations (2.34). It may not be that our source automatically satisfies (2.37),
and hence eq. (2.34) does not make sense; it has no solution and we do not
know how the field should propagate. Eq. (2.34) is the analogue of the Maxwell
equations*

$$\partial_\mu F_{\mu\nu} = J_\nu. \tag{2.38}$$


* The analogue of the other 2 Maxwell equations, which are an algebraic consequence of
$F_{\mu\nu}$ being a curl, is

$$D_\mu\mathbf{E}_{\nu\sigma} + D_\nu\mathbf{E}_{\sigma\mu} + D_\sigma\mathbf{E}_{\mu\nu} = 0.$$

It is easy to check that this is an identity satisfied by $\mathbf{E}_{\mu\nu}$ since it is of the form
$\partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu + \mathbf{A}_\mu\times\mathbf{A}_\nu$.

The current $\mathbf{J}_\nu$ is produced by the matter field and it is a consequence of the
Dirac equation (2.27) (and its conjugate) that this current is indeed conserved
and satisfies (2.37). If matter is a Dirac spinor as we have assumed, $\mathbf{J}_\nu$ does
not explicitly depend on $\mathbf{A}_\mu$. This is not true in general, if we define $\mathbf{J}_\nu$ as the
first variation of the matter term (in the action) with respect to $\mathbf{A}_\nu$. The current
would have a different form in, for example, scalar electrodynamics where
the quadratic term in the Lagrangian has the form

$$(\partial_\mu + iA_\mu)\phi^+\,(\partial_\mu - iA_\mu)\phi + m^2\phi^+\phi. \tag{2.39}$$

The current is found by differentiating this with respect to $A_\mu$ and is

$$J_\mu = i\,(\phi^+\partial_\mu\phi - \phi\,\partial_\mu\phi^+) + 2\phi^+A_\mu\phi, \tag{2.40}$$


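and so in this case there is an extra non-matter term in the current. Let us write

$$\mathbf{E}_{\mu\nu} = \mathbf{F}_{\mu\nu} + \mathbf{A}_\mu\times\mathbf{A}_\nu, \tag{2.41}$$

where $\mathbf{F}_{\mu\nu}$ looks like the field tensor in electrodynamics being just the curl of
$\mathbf{A}_\mu$. Then eq. (2.34) can be written as

$$\frac{1}{g^2}\,\partial_\mu\mathbf{F}_{\mu\nu} = \mathbf{J}_\nu - \frac{1}{g^2}\left\{\partial_\mu(\mathbf{A}_\mu\times\mathbf{A}_\nu) + \mathbf{A}_\mu\times\mathbf{E}_{\mu\nu}\right\}. \tag{2.42}$$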
This is just like electrodynamics. Each field is produced by a source; the source 
is isospin density (analogous to electric charge in electrodynamics) which here 
is a sum of the contributions from the matter and the field itself. The disad- 
vantage of looking at it in this way is that we have lost the gauge invariance. 
It’s strange but it’s true that the amount of isospin density in the field depends 
upon the gauge -- it’s not a gauge invariant quantity. This is analogous to the 
way some people like to do gravitation. 

The gravitational field equations (cf. (2.34)) are

$$G_{\mu\nu} = T_{\mu\nu}, \tag{2.43}$$

where $T_{\mu\nu}$ is the energy-momentum tensor of the matter and $G_{\mu\nu}$ is the
Einstein tensor. Algebraically

$$D_\mu G_{\mu\nu} = 0, \tag{2.44}$$

where $D_\mu$ is some covariant derivative, and hence we have

$$D_\mu T_{\mu\nu} = 0, \tag{2.45}$$

analogous to (2.37). However, as in (2.42) we can rewrite (2.43) as

$$G^{L}_{\mu\nu} = T_{\mu\nu} + K_{\mu\nu}, \tag{2.46}$$

where $G^{L}_{\mu\nu}$ is linear in $g_{\mu\nu}$ and $K_{\mu\nu}$ contains only terms quadratic and higher
in $g_{\mu\nu}$. This is now the equation for a spin 2 particle where $K_{\mu\nu}$ is the energy-momentum
density in the gravitational field, and we say that the gravitational
field is produced by all energy, the energy of matter and the energy of the field
itself; that's why it's non-linear. However if we make a generalised co-ordinate
transformation, $T_{\mu\nu} + K_{\mu\nu}$ is not a real tensor; this is a famous problem: there
is no real way to define the total energy-momentum tensor of the universe.


3. A geometrical look at gauge invariance 


At each point in space-time imagine a frame defining axes in SU(n) in some 
sense continuous (nearby frames nearly the same), but otherwise arbitrary. 
Then physically we might hope to define what we mean by the frames at x 
and x’ are in the same direction. That is, imagine that we take an up particle 
(e.g. a proton in SU(2)) at x and send it over to x’ so the guy at x’ can see what 
we call up. If there were no external influence, which could rotate isospin, 
acting in the space between x and x’, we might hope to define and check that 
everyone is using the same frame and can expect that one choice of frames at 
all x to be best in the sense of making the physics equations simplest (e.g. all 
parallel). 

But under an influence, we must correct for the influence. If the influence
is not universal (e.g. acts on protons but not on pions), we can compare frames 
using a particle which is affected least, or by looking at the different ways in 
which different particles are affected. If a universal influence acts which ro- 
tates the axis of isospin of every particle to the same degree, it clearly has no 
effect locally; but if the rotation varies from point to point and from time to 
time, then under some circumstances there may be an effect.

Now suppose there is such a universal influence and let us try to compare
a frame at $b$ to a frame at $a$ by sending a particle from $a$ to $b$. Then the frame
at $a$ "carried" to point $b$ might find (the frame as it is carried is of course turned
by the universal influence), in general, that it is not lined up with the frame
originally chosen at $b$, but requires an additional rotation $R(b \leftarrow a)$. Thus
$R(b \leftarrow a)$ tells us how much the frame $b$ differs from the frame $a$ when $a$ is
carried over, through space-time, to make the comparison. We could, of course,
get a set of "best" frames so that all $R = 1$ by choosing at $b$ the frame we get
by carrying our $a$ frame to each space-time point. This cannot be done unless

$$R(c \leftarrow a) = R(c \leftarrow b)\,R(b \leftarrow a), \tag{3.1}$$

i.e. unless $R(c \leftarrow a)$ is independent of the route by which $a$ is carried, in which
case we do not have an interesting physical theory at all: $R$ can be made equal
to the identity by a proper choice of a universal set of frames.

The interesting theory arises if this is not the case, i.e. if the rotation $R(\mathcal{P})$
depends on the path $\mathcal{P}$ in space connecting the points. We now study this general
case: the geometry of a field of frames with a method of parallel displacement,
or comparison along geometrical paths. (If the frames were Lorentz
frames arbitrarily displaced, the theory is differential geometry and the physical
theory is Einstein's General Relativity.)

[Figure: a path $\mathcal{P}$ running from $a$ to $b$.]

Of course $R(\mathcal{P})$, which compares frames at $b$ and $a$, does depend on the original
frame choice at each point. If the frames at each point were rotated by
$P(a)$ then the rotation between $b$ and $a$ would now be

$$R'(\mathcal{P}) = P(b)\,R(\mathcal{P})\,P^{-1}(a). \tag{3.2}$$


Physics should not depend on this choice, so we look for invariant properties
of $R$; this is most easily done geometrically.

For a closed path $\mathcal{P}_0$, $a \leftarrow a$, we might get a resulting rotation $R(\mathcal{P}_0)$ dependent
on $\mathcal{P}_0$. This determines a "strain" or physical effect independent of
the choice of frames and thus invariant. To analyse these things most easily,
we work with infinitesimal displacements and closed circuits (as any finite
closed circuit can be represented as an area integral of infinitesimal closed circuits,
in the manner familiar in the usual demonstration of Stokes's theorem).

Thus consider $b$ to be separated from $a$ by an infinitesimal coordinate displacement
$\Delta x_\mu$. Then $R$ is nearly 1, the difference being of order $\Delta x_\mu$. Hence
we can write, to first order in $\Delta x$,

$$R(a + \Delta x \leftarrow a) = 1 - i\eta_\mu(x)\,\Delta x_\mu, \tag{3.3}$$

where $\eta_\mu$ is a vector field, in Minkowski space, depending on $x$, the location
of point $a$, and is an operator in isospace. The transformation property of $\eta_\mu$ under
a rotation $P$ of the frames is given (using (3.2)) by

$$1 - i\eta'_\mu\Delta x_\mu = P(x + \Delta x)\,(1 - i\eta_\mu\Delta x_\mu)\,P^{-1}(x).$$

Putting

$$P(x + \Delta x) = P(x) + \frac{\partial P}{\partial x_\mu}\,\Delta x_\mu$$

we get

$$\eta'_\mu = P(x)\,\eta_\mu\,P^{-1}(x) + i\,(\partial_\mu P)\,P^{-1}(x) \tag{3.4}$$

(we call this a gauge transformation of $\eta$).
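What happens if we go round a small square?

[Figure (3.5): a small square circuit at $x$ with sides $\delta x_\mu$ and $\Delta x_\nu$.]

Calculating to second order we obtain (to be correct to 2nd order we should
expand each $R$ to second order beyond (3.3), but it is readily seen that these
terms will cancel in this order in going up and down the sides of the square,
i.e. such terms in the first bracket will cancel with terms in the third bracket,
since to second order they are opposite)

$$R = \left[1 + i\eta_\mu(x + \tfrac12\delta x)\,\delta x_\mu + \dots\right]\left[1 + i\eta_\nu(x + \delta x + \tfrac12\Delta x)\,\Delta x_\nu + \dots\right]$$
$$\qquad \times \left[1 - i\eta_\mu(x + \Delta x + \tfrac12\delta x)\,\delta x_\mu + \dots\right]\left[1 - i\eta_\nu(x + \tfrac12\Delta x)\,\Delta x_\nu + \dots\right]$$

$$= 1 + i\left\{\partial_\mu\eta_\nu - \partial_\nu\eta_\mu + i[\eta_\mu, \eta_\nu]\right\}\delta x_\mu\Delta x_\nu \tag{3.6}$$

$$= 1 + i\mathcal{M}_{\mu\nu}\,\delta x_\mu\Delta x_\nu, \tag{3.7}$$

where we have defined

$$\mathcal{M}_{\mu\nu} = \partial_\mu\eta_\nu - \partial_\nu\eta_\mu + i[\eta_\mu, \eta_\nu], \tag{3.8}$$

which is associated with an area $\delta x_\mu\Delta x_\nu$ and is an antisymmetric tensor operator
of the second rank. $\mathcal{M}_{\mu\nu}$ is the physically interesting thing associated with
the connection $\eta_\mu$ which takes us from place to place.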
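
The small-square result can be tried out numerically. The following is a minimal sketch under stated assumptions, not the lecture's calculation: the connection is taken to be $\eta_\mu = \mathbf{h}_\mu\cdot\boldsymbol\tau$ with $\boldsymbol\tau = \boldsymbol\sigma/2$ in a two-dimensional $(x, y)$ slice, the particular fields `h_x` and `h_y` are invented for the example, and each side of the square is transported with the exact exponential evaluated at the midpoint of the side. For such a connection (3.8) reduces to $\mathcal{M}_{xy} = (\partial_x\mathbf{h}_y - \partial_y\mathbf{h}_x - \mathbf{h}_x\times\mathbf{h}_y)\cdot\boldsymbol\tau$.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dot_sigma(v):
    return [[sum(v[n] * SIGMA[n][i][j] for n in range(3)) for j in range(2)]
            for i in range(2)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def transport(h, step):
    """exp(-i * step * h.tau) in closed form; reduces to (3.3) for small step."""
    norm = math.sqrt(sum(c * c for c in h))
    half = 0.5 * step * norm
    n_sigma = dot_sigma([c / norm for c in h])
    return [[(math.cos(half) if i == j else 0.0) - 1j * math.sin(half) * n_sigma[i][j]
             for j in range(2)] for i in range(2)]

# Hypothetical connection components (assumptions for this example only).
def h_x(x, y):
    return [0.3 + y, 0.1 * x, 0.2]

def h_y(x, y):
    return [0.5 * x, 0.4, 0.7 + 0.2 * y]

d = e = 1e-5   # side lengths delta x (along x) and Delta x (along y)
steps = [
    transport(h_y(0.0, e / 2), e),     # first side: up along y from the origin
    transport(h_x(d / 2, e), d),       # second side: along x at height e
    transport(h_y(d, e / 2), -e),      # third side: back down along y
    transport(h_x(d / 2, 0.0), -d),    # fourth side: back along x to the origin
]
R = [[1, 0], [0, 1]]
for U in steps:                        # later steps multiply on the left
    R = mat_mul(U, R)

# (R - 1) / (i d e) should approximate M_xy of (3.8).
measured = [[(R[i][j] - (1 if i == j else 0)) / (1j * d * e) for j in range(2)]
            for i in range(2)]
curl = [0.5 - 1.0, 0.0, 0.0]           # d_x h_y - d_y h_x for the fields above
m = [curl[k] - cross(h_x(0, 0), h_y(0, 0))[k] for k in range(3)]
predicted = [[0.5 * c for c in row] for row in dot_sigma(m)]
error = max(abs(measured[i][j] - predicted[i][j]) for i in range(2) for j in range(2))
print(error)   # small compared with the entries of M_xy, and shrinking with d, e
```

The ordering of the four factors follows the brackets in (3.6): the first side traversed stands furthest to the right.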

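Suppose $\mathcal{B}$ is any tensor operator. We wish to know how it changes from
place to place. It will not do simply to take $\mathcal{B}(x + \Delta x) - \mathcal{B}(x)$ since we cannot
compare objects at a distance because of the effects of the universal influence.
We must take account of the rotation of the frame by transporting
$\mathcal{B}(x + \Delta x)$ back to $x$ before making a comparison. Hence the total change
in $\mathcal{B}$ is

$$[1 + i\eta_\mu\Delta x_\mu]\,\mathcal{B}(x + \Delta x)\,[1 - i\eta_\mu\Delta x_\mu] - \mathcal{B}(x)$$

$$= [1 + i\eta_\mu\Delta x_\mu]\left(\mathcal{B}(x) + \frac{\partial\mathcal{B}}{\partial x_\mu}\,\Delta x_\mu\right)[1 - i\eta_\mu\Delta x_\mu] - \mathcal{B}(x)$$

$$= \left[\frac{\partial\mathcal{B}}{\partial x_\mu} + i[\eta_\mu, \mathcal{B}]\right]\Delta x_\mu. \tag{3.9}$$

This enables us to define a covariant derivative on any tensor $\mathcal{B}$ by

$$D_\mu\mathcal{B} = \partial_\mu\mathcal{B} + i[\eta_\mu, \mathcal{B}]. \tag{3.10}$$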


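We will now deduce an interesting geometrical identity satisfied by $\mathcal{M}_{\mu\nu}$.

We wish to calculate the difference in circulation between the top and bottom
faces of the cube. To get this difference it will not do simply to take
$\mathcal{M}_{\mu\nu}(x + dx) - \mathcal{M}_{\mu\nu}(x)$, because to get back to $O$ we must go from $O$ to $P$
(a factor $1 - i\eta_\sigma dx_\sigma$), then around the top (a factor $1 + i\mathcal{M}_{\mu\nu}(x + dx)\,\delta x_\mu\Delta x_\nu$),
and then back down to $O$ (a factor $1 + i\eta_\sigma dx_\sigma$); hence we want to compare

$$[1 + i\eta_\sigma dx_\sigma]\,[1 + i\mathcal{M}_{\mu\nu}(x + dx)\,\delta x_\mu\Delta x_\nu]\,[1 - i\eta_\sigma dx_\sigma]$$

with

$$1 + i\mathcal{M}_{\mu\nu}(x)\,\delta x_\mu\Delta x_\nu.$$

The difference between these two terms is

$$i\left[\frac{\partial\mathcal{M}_{\mu\nu}}{\partial x_\sigma} + i[\eta_\sigma, \mathcal{M}_{\mu\nu}]\right] dV_{\mu\nu\sigma}, \tag{3.11}$$

where

$$dV_{\mu\nu\sigma} = \delta x_\mu\,\Delta x_\nu\,dx_\sigma.$$

Notice that the term in the bracket is just the covariant derivative of $\mathcal{M}_{\mu\nu}$;
this is not surprising as all we have done is to compare $\mathcal{M}_{\mu\nu}$ (cf. (3.5)) at $O$
and $P$.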


By comparing $\mathcal{M}_{\mu\nu}$ on opposite pairs of faces, and noting that the net
effect of going round the sum of the following 3 paths is zero,

(A→B→C→G→F→B→A→E→H→D→A)
+ (A→B→F→E→A→D→H→G→C→D→A),

we get

$$D_\sigma\mathcal{M}_{\mu\nu} + D_\mu\mathcal{M}_{\nu\sigma} + D_\nu\mathcal{M}_{\sigma\mu} = 0. \tag{3.12}$$

By taking $\mathcal{B}$ round a small closed loop, or by using (3.10), we get

$$D_\mu D_\nu\mathcal{B} - D_\nu D_\mu\mathcal{B} = i[\mathcal{M}_{\mu\nu}, \mathcal{B}], \tag{3.13}$$

and hence

$$D_\mu D_\nu\mathcal{M}_{\mu\nu} = 0. \tag{3.14}$$


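Now we will associate this with what we did before with Yang-Mills theory.
Here each $R$ is a rotation in isospace and so can be written in the form

$$R = \exp(-i\boldsymbol\alpha\cdot\mathbf{T}), \tag{3.15}$$

where the $\mathbf{T}$ are the generators of some representation of the SU(2) Lie algebra,
and $\boldsymbol\alpha$ is a vector depending on the rotation it is desired to represent.

For the infinitesimal rotation (3.3), $\boldsymbol\alpha$ is also an infinitesimal of first order
in $\Delta x_\mu$, say $\boldsymbol\alpha = -\mathbf{A}_\mu\Delta x_\mu$, so

$$R = 1 + i\,\mathbf{A}_\mu\cdot\mathbf{T}\,\Delta x_\mu. \tag{3.16}$$

Comparing this with (3.3) we see that

$$\eta_\mu = -\mathbf{A}_\mu\cdot\mathbf{T}. \tag{3.17}$$

Substituting this into (3.8) we obtain

$$\mathcal{M}_{\mu\nu} = -\mathbf{E}_{\mu\nu}\cdot\mathbf{T}, \tag{3.18}$$

where

$$\mathbf{E}_{\mu\nu} = \partial_\mu\mathbf{A}_\nu - \partial_\nu\mathbf{A}_\mu + \mathbf{A}_\mu\times\mathbf{A}_\nu \tag{3.19}$$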


and we see that this is the same as the $\mathbf{E}_{\mu\nu}$ which we obtained in (2.24). The question
now is: can we understand any of the Yang-Mills field equations
(2.26)-(2.37) geometrically? The answer is, for most of the equations, no. In
particular, why did we put the combination $-(1/4g^2)\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu}$ in the Lagrangian?
Physics tells us, not geometry. All we have discussed are the qualitative
features of a classical field, and the response of particles to it (they rotate the
axes as they move through it), but how the field itself has energy and behaves
dynamically we haven't determined. But there are a few things we can work
out. Using (3.12) and (3.18) we have

$$D_\sigma\mathbf{E}_{\mu\nu} + D_\mu\mathbf{E}_{\nu\sigma} + D_\nu\mathbf{E}_{\sigma\mu} = 0. \tag{3.20}$$

And using (3.13) and (3.14), we have

$$D_\mu D_\nu\boldsymbol\phi - D_\nu D_\mu\boldsymbol\phi = \mathbf{E}_{\mu\nu}\times\boldsymbol\phi, \tag{3.21}$$

where $\boldsymbol\phi$ is any isovector and

$$D_\mu D_\nu\mathbf{E}_{\mu\nu} = 0 \tag{3.22}$$

(cf. (2.35) and (2.36)).
The analogue of (2.34) would be

$$D_\mu\mathcal{M}_{\mu\nu} = -g^2\,\mathbf{J}_\nu\cdot\mathbf{T}. \tag{3.23}$$


What is $D_\mu\mathcal{M}_{\mu\nu}$ geometrically? If I knew an easy way to describe this
geometrically then we could state this equation as: $D_\mu\mathcal{M}_{\mu\nu}$ is the total isospin
in a small volume. Unfortunately I haven't worked this out, and therefore I
cannot describe the full Yang-Mills classical theory in an elementary way. (I
am looking for a law like that in gravity, which says that the excess of the
proper radius of a small 3 dimensional sphere over the radius calculated from
the area, $\sqrt{\text{area}/4\pi}$, is proportional to the mass inside the sphere, which is,
assuming Lorentz invariance, a complete statement of the Einstein law
$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = T_{\mu\nu}$.)

We have said that the $\mathbf{A}$ field represents a universal turning of the axes of
a diffusing particle, and now we should check that the equation of motion of
the matter, $(i\not\!\partial - \not\!\!\mathbf{A}\cdot\boldsymbol\tau)\psi = m\psi$ for example, implies that indeed it does. To
make it easy, we first do it with the Schroedinger equation and electrodynamics
but you will see that the method of proof is readily extendible to other
cases. The free particle Schroedinger equation is

$$-\frac{\nabla^2\psi}{2m} = i\,\frac{\partial\psi}{\partial t}. \tag{3.24}$$

Solving this equation, we find that a particle propagates as follows. Suppose
it is confined at $\mathbf{x}_1$ at time $t_1$ (the wave-function is a delta-function). Then the
wave-function at time $t_2$ and position $\mathbf{x}_2$ is

$$\left(\frac{2\pi i\,(t_2 - t_1)}{m}\right)^{-3/2}\exp\left[\frac{i m\,(\mathbf{x}_2 - \mathbf{x}_1)^2}{2\,(t_2 - t_1)}\right],$$

which we call $K_0(2,1)$, this being the function which describes how the particle
diffuses outwards in time. Now consider a possible trajectory for the particle

[Figure (3.25): a trajectory from $A$ to $B$ divided into infinitesimal sections $\Delta x$.]


The amplitude for the particle to go from $A$ to $B$ can be found by multiplying
the amplitudes for it to go along all the infinitesimal sections of the path.
Now add an external field to the Schroedinger equation

$$\frac{1}{2m}\,(i\nabla - \mathbf{A})^2\psi + V\psi = i\,\frac{\partial\psi}{\partial t}. \tag{3.26}$$

Consider an infinitesimal distance $\Delta x$. Over this distance, $\mathbf{A}$ and $V$ may be
treated as constants and are therefore expressible as the gradient of a potential
$\chi$

$$\mathbf{A} = \nabla\chi, \qquad V = \partial\chi/\partial t, \tag{3.27}$$

where

$$\chi = \mathbf{A}\cdot\mathbf{x} + Vt.$$


By a gauge transformation we see that the wave-function $\psi' = e^{-i\chi}\psi$ is a solution
of (3.26) if $\psi$ is a solution of (3.24), and hence we may write the amplitude
for propagation over a short distance and time $\Delta x$ in the presence of the
field as

$$K_A(x + \Delta x, x) = \exp\{i\chi(x + \Delta x)\}\exp\{-i\chi(x)\}\,K_0(x + \Delta x, x) = \exp(iA_\mu\Delta x_\mu)\,K_0(x + \Delta x, x), \tag{3.28}$$

where $A_0 = V$, $A_i = \mathbf{A}$.
Iterating this over a continuous path, we get

$$\exp\left(i\int A_\mu\,dx_\mu\right) \times (\text{Amplitude for process without the } A \text{ field}). \tag{3.29}$$

In the case of Yang-Mills theory, this generalises to

$$\exp\left(i\int \mathbf{A}_\mu\cdot\mathbf{T}\,dx_\mu\right) \times (\text{Amplitude for process without Y-M field}), \tag{3.30}$$

where we must order the operator $\mathbf{T}$ along the path.
For infinitesimal paths this reduces to

$$[1 + i(\mathbf{A}_\mu\cdot\mathbf{T})\,\Delta x_\mu] \times (\text{Amplitude for process without Y-M field}), \tag{3.31}$$


and thus we see that the expression (3.16) has emerged naturally. 


4. A qualitative critique of QCD 


One of the possible applications of Yang—Mills theory is to Quantum 
Chromodynamics. The question to which we wish to address ourselves in this 
chapter is whether or not this theory has a real chance of being right. Ordi- 
narily, when the right theory is found, it isn’t long before we can calculate 
consequences and check that it agrees with everything relevant that is known. 
For example, the whole subject of electrostatics was in complete confusion 
until the Coulomb law was discovered; before then people were rushing around 
in complete chaos and then suddenly they were calculating the capacity of
elliptical condensers etc. Similarly, before Schroedinger’s equation, there
was a lot of pulling and hauling on ideas which were inconsistent, and sudden- 
ly, as soon as the equation was discovered, there was a tremendous tumbling 
out of results which showed how everything worked. Therefore it’s expected
(at least by an old fogey like myself) that when the correct theory is found,
lots of results will tumble out which will agree with experiment. Now QCD is
proposed as a theory which is supposed to be the correct theory of strong 
interactions; it’s been around for a few years now and we don’t have any quan- 
titative results. At the moment we cannot look at the theory quantitatively 
(due possibly to technical difficulties in its interpretation) so we will look at 
it qualitatively to try and decide whether it is useful or not. 


4.1. Forces between the quarks


The most characteristic thing about the quark bound states which have been
seen is that they are colour singlets. Coloured states have not been seen and we
must conclude that either their mass is infinite or is out of the reach of present
experiments.

In this theory the three quarks in a baryon form an antisymmetric state
with respect to colour:

$$\frac{1}{\sqrt6}\left\{|A\rangle|B\rangle|C\rangle + |B\rangle|C\rangle|A\rangle + |C\rangle|A\rangle|B\rangle - |B\rangle|A\rangle|C\rangle - |A\rangle|C\rangle|B\rangle - |C\rangle|B\rangle|A\rangle\right\}. \tag{4.1}$$
The first thing to look at is: when we put three quarks together, can their
energy be lower in some other state than in the singlet state? Similarly, putting
a quark and an anti-quark together, is the colourless (singlet) state lower than
any other state? At the level we will work, we will not attempt to calculate the
energies correctly; we merely wish to see whether we can get their order right. In
QED we know that the force between two static particles is

$$\frac{e_1 e_2}{r}. \tag{4.2}$$

Similarly the force between two static quarks due to single gluon exchange is

$$\frac{g^2}{r}\sum_i(\lambda_i)_1(\lambda_i)_2, \tag{4.3}$$

where $\lambda_i$ are the SU(3) matrices.
One can see this as follows. For a static source $J_x = J_y = J_z = 0$, and $J_t = \rho$,
the density of colour charge, we find a solution of the form $A_x = A_y = A_z = 0$,
$A_t \neq 0$ where

$$\nabla^2 A_t = \rho \tag{4.4}$$

(from (2.34)). This is just as in electrostatics except that here we have three isospin
components. We could look at this problem mathematically (by fiddling
around with the $\lambda$'s); however we wish to take a simpler view. The gluons
which couple to the colours of the quarks can be represented as

$$\bar{A}B,\quad \bar{B}A,\quad \bar{A}C,\quad \bar{C}A,\quad \bar{B}C,\quad \bar{C}B,\quad \tfrac{1}{\sqrt2}(\bar{B}B - \bar{C}C),\quad \tfrac{1}{\sqrt6}(2\bar{A}A - \bar{B}B - \bar{C}C). \tag{4.5}$$

The last two gluons are the non colour changing states orthogonal to the
singlet $(1/\sqrt3)(\bar{A}A + \bar{B}B + \bar{C}C)$ which is omitted for reasons discussed earlier.
This symbolism really tells us what happens to the colour of a quark when it
emits or absorbs a gluon. That is, an $\bar{A}B$ gluon can be absorbed by an $A$ quark
turning it into $B$, with amplitude 1. (Annihilate the $A$ and create a $B$ instead.)
We now calculate the relative strengths with which various quarks couple. Consider
this vertex: an $A$ quark emits a $\bar{B}A$ gluon changing to $B$.

[Diagram (4.6): an incoming $A$ quark emitting a $\bar{B}A$ gluon and leaving as a $B$ quark.]

This $\bar{B}A$ gluon can be absorbed by a $B$ quark (turning it to $A$) but cannot be
absorbed by an $A$ or $C$ (it can by an $A$ antiquark however). Therefore, if we
wish to look at the $A$-$A$ quark force, this vertex cannot contribute because
this gluon cannot be absorbed by an $A$ quark, i.e.

[Diagram: two incoming $A$ quarks exchanging a $\bar{B}A$ gluon and both leaving as $B$ quarks.]

does not go.


The only contribution to the $A$-$A$ force is that due to the colourless gluon
exchange, viz.

[Diagram (4.7): two $A$ quarks exchanging the gluon $\tfrac{1}{\sqrt6}(2\bar{A}A - \bar{B}B - \bar{C}C)$ and remaining $A$ quarks.]

The interaction energy for this process is $(+2/\sqrt6)(+2/\sqrt6) = +\tfrac23$, which is positive,
so they repel just as in electrostatics (like charges repel).
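Now what about the $A$-$B$ force? Here there are two allowed diagrams.

[Diagram (4.8): an $A$ quark and a $B$ quark exchanging the gluon $\tfrac{1}{\sqrt6}(2\bar{A}A - \bar{B}B - \bar{C}C)$, colours unchanged.]

$$\text{interaction energy} = \left(+\frac{2}{\sqrt6}\right)\left(-\frac{1}{\sqrt6}\right) = -\frac13 \tag{4.8}$$

and the exchange diagram

[Diagram (4.9): an $A$ quark and a $B$ quark exchanging a colour-changing gluon and swapping colours.]

$$\text{interaction energy} = (+1)(+1) = +1. \tag{4.9}$$

(4.8) and (4.9) must be combined to give the total interaction energy, which gives
$+\tfrac23$.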

We could do the same for all the other colour combinations, but we'd be
wasting our time as we know that the interaction is symmetric with respect to
colour. The only cases we have to worry about are when the colours of the
quarks are the same or different. We can summarize the results as follows. Introduce
a colour exchange operator $P$ which interchanges a pair of quarks; it
has eigenvalues $+1$ or $-1$ corresponding to whether the quarks are in a symmetric
or antisymmetric colour state. In the case when the quarks are different
(cf. (4.8), (4.9)), the interaction energy can be written

$$\left(p - \tfrac13\right), \tag{4.10}$$

where $p$ is the eigenvalue of $P$. The lovely thing about this formula is that, if
we apply it to the case where the colours are the same (cf. (4.7), $p = 1$ in this
case), we get the right answer $1 - \tfrac13 = \tfrac23$. Thus we have shown that
$\sum_i(\lambda_i)_1(\lambda_i)_2 = P - \tfrac13$. This is analogous to the Dirac formula for the interaction between two
spins (e.g. two electrons in an atom):

$$P_{\text{exch}} - \tfrac12 = \tfrac12\sum_i(\sigma_i)_1(\sigma_i)_2 \tag{4.11}$$

(the $-\tfrac12$ becomes $-1/n$ for SU($n$)).
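
As an illustrative cross-check (a sketch, not the lecture's own calculation), the sum over the eight gluons of (4.5) can be tabulated for every colour assignment and compared with $P - \tfrac13$. Colours $A, B, C$ are represented by the integers 0, 1, 2, and the amplitudes are those read off in the text: 1 for each colour-changing vertex, and the coefficients of the two diagonal gluons for the colour-preserving ones. The function names are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

COLOURS = range(3)   # 0, 1, 2 stand for the colours A, B, C

def gluon_exchange(i, k, j, l):
    """Amplitude for quarks with colours (j, l) to become (i, k),
    summed over the eight gluons listed in (4.5)."""
    total = Fraction(0)
    # Six colour-changing gluons: the two quarks swap colours, amplitude 1 * 1.
    if i != j and i == l and k == j:
        total += 1
    # Two colour-preserving gluons; the square roots drop out of the products.
    if i == j and k == l:
        d1 = (0, 1, -1)    # coefficients of (1/sqrt 2)(BB - CC)
        d2 = (2, -1, -1)   # coefficients of (1/sqrt 6)(2AA - BB - CC)
        total += Fraction(d1[i] * d1[k], 2) + Fraction(d2[i] * d2[k], 6)
    return total

def p_minus_third(i, k, j, l):
    """Matrix element of P - 1/3, with P the colour exchange operator."""
    exchange = 1 if (i == l and k == j) else 0
    identity = 1 if (i == j and k == l) else 0
    return exchange - Fraction(identity, 3)

mismatches = [idx for idx in product(COLOURS, repeat=4)
              if gluon_exchange(*idx) != p_minus_third(*idx)]
print(mismatches)                          # empty list if the two agree everywhere
print(gluon_exchange(0, 0, 0, 0))          # A-A case of (4.7)
print(gluon_exchange(0, 1, 0, 1))          # direct A-B term of (4.8)
```

Exact fractions are used so that the comparison is an equality test rather than a floating-point tolerance.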

In order to generalise formula (4.10) to states of more than two quarks, we 
merely sum it over all possible pairs of quarks. 

We shall now calculate the energy of various quark states using (4.10). Each
quark will have some self-energy; we don't know what this is, but to make it
easier to see what's going on, we will take $+\tfrac43$ for each quark. The qualitative
results do not depend on this choice (because we will always compare the energies
of states of the same total number of quarks).

We will symbolise the quark states as follows: draw a series of boxes with
one box for each quark in the state, for example a possible 6 quark state is

[Young diagram (4.12): six boxes, with "levels" marked along the rows and "colours" marked down the columns.]

This represents a quark configuration in which the wave-function is symmetric
with respect to the interchange of any pair of quarks in the same row,
and antisymmetric with respect to the interchange of any pair of quarks in the
same column, e.g. for a two quark state, two boxes side by side in a row represent
the symmetric state and two boxes stacked in a column represent the
antisymmetric state. We can use these (Young) diagrams
to calculate the energy of a bound state using (4.10). We will work out one of
these diagrams in detail for a three quark state which is symmetric under exchange
of one pair of quarks and antisymmetric under exchange of another
pair:

[Young diagram (4.13): two boxes in the first row and one box in the second row.]

The contribution from the exchange is $(+1)$ from the row plus $(-1)$ from the column.

Therefore the total interaction energy is $(+1) + (-1) - 3(\tfrac13) = -1$. When we
add $+\tfrac43$ for each quark we obtain $+3$. The following table summarises the results
for various states.


No. of    Diagram           Interaction energy                      State energy
quarks    (boxes per row)                                           (+4/3 added per quark)

1         (1)               0                                       +4/3
2         (2)               +1 - 1/3 = +2/3                         +10/3
2         (1,1)             -1 - 1/3 = -4/3                         +4/3
3         (3)               3(+1) - 3(1/3) = +2                     +6
3         (2,1)             (+1) + (-1) - 3(1/3) = -1               +3
3         (1,1,1)           3(-1) - 3(1/3) = -4                     0
4         (2,1,1)           +1 + 3(-1) - 6(1/3) = -4                +4/3
5         (2,2,1)           2(+1) + 4(-1) - 10(1/3) = -16/3         +4/3
6         (2,2,2)           3(+1) + 6(-1) - 15(1/3) = -8            0

Notice that the single column of three boxes, which is totally antisymmetric with
respect to colour and which we identify with a baryon, has the lowest energy.
Further, the four quark diagram (rows of 2, 1 and 1 boxes) has the
same energy as the baryon column plus a single box, i.e. a single quark should be only weakly bound to a
proton. Similarly the six quark diagram (three rows of 2 boxes) has the same energy
as two baryon columns (in our approximation,
zero) and it is impossible to say, in this poor approximation, whether or not
such a bound state of two baryons will exist.
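Now let us try the same thing for the mesons. Consider

[Diagram (4.14): an $A$ quark and an $\bar{A}$ antiquark exchanging the gluon $\tfrac{1}{\sqrt6}(2\bar{A}A - \bar{B}B - \bar{C}C)$, colours unchanged.]

$$\text{interaction energy} = \left(+\frac{2}{\sqrt6}\right)\left(-\frac{2}{\sqrt6}\right) = -\frac23. \tag{4.14}$$

The minus sign appears at the antiparticle vertex because the theory is a
vector theory; this is just like electricity where the antiparticle has the opposite
charge to the particle. Another possible term in the $A$-$\bar{A}$ interaction is
that in which the quarks change their states

[Diagram (4.15): $A$ and $\bar{A}$ turning into $B$ and $\bar{B}$ by exchange of a $\bar{B}A$ gluon.]

$$\text{interaction energy} = (+1)(-1) = -1. \tag{4.15}$$

Similarly

[Diagram (4.16): $A$ and $\bar{A}$ turning into $C$ and $\bar{C}$ by exchange of a $\bar{C}A$ gluon.]

has the same energy as (4.15), $-1$. (4.16)

Now we can figure out the energy of the meson state $(1/\sqrt3)(|A\rangle|\bar{A}\rangle + |B\rangle|\bar{B}\rangle
+ |C\rangle|\bar{C}\rangle)$ (singlet), i.e. we need to consider the coupling of this state to itself.
Since this state is symmetric in $A$, $B$, $C$ we get $3 \times (1/\sqrt3)|A\rangle|\bar{A}\rangle \times (1/\sqrt3)(|A\rangle|\bar{A}\rangle
+ |B\rangle|\bar{B}\rangle + |C\rangle|\bar{C}\rangle)$, which is simply the sum of (4.14)-(4.16). Therefore the
interaction energy is $-\tfrac83$. Adding $+\tfrac43$ for each quark, we find the meson energy
to be 0. (The mesons have the same energy scale as baryons, with this $+\tfrac43$ choice
for each quark.)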
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Finally, we examine the state | A)|B), a coloured meson; there is only one 
possible diagram 


i 
— (2AaA-BB-TC 
, Ve - 
enn interaction energy = + cs fe = ee (4.17) 
"V6 V6) *3° : 
A B 


Adding the quark self-energy we get a coloured meson energy of +3, which 
is much higher than for a colourless meson. We could ask why 


A.B.C A.B.C 


does not contribute to the energy of the colourless meson. The answer is that 
the s channel gluon must be colourless; no such gluon exists — it was elimi- 
nated earlier. 
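
The meson energies quoted above can be checked with a small numerical sketch. It assumes eight gluons: the six colour-changing ones with unit coupling, the diagonal gluon $(1/\sqrt6)(2A\bar A - B\bar B - C\bar C)$ used in (4.14), and a second diagonal gluon $(1/\sqrt2)(B\bar B - C\bar C)$, which is an assumption of this sketch since it does not appear in the diagrams of this passage. The function names are hypothetical.

```python
import math
from itertools import product


def gluon_matrices():
    """Eight 3x3 colour matrices g[i][j]: amplitude for a quark of colour j
    to become colour i at a gluon vertex (colours A, B, C = 0, 1, 2)."""
    mats = []
    for i, j in product(range(3), repeat=2):
        if i != j:                       # six colour-changing gluons
            m = [[0.0] * 3 for _ in range(3)]
            m[i][j] = 1.0
            mats.append(m)
    s6, s2 = math.sqrt(6.0), math.sqrt(2.0)
    # Two diagonal (colour-preserving) gluons; the second one is assumed.
    mats.append([[2 / s6, 0, 0], [0, -1 / s6, 0], [0, 0, -1 / s6]])
    mats.append([[0, 0, 0], [0, 1 / s2, 0], [0, 0, -1 / s2]])
    return mats


def meson_interaction_energy(state):
    """Expectation value of one-gluon exchange between a quark and an antiquark.

    `state` maps (quark colour, antiquark colour) to a real amplitude.
    The antiquark vertex carries the opposite sign, as in a vector theory.
    """
    energy = 0.0
    for g in gluon_matrices():
        for (i, k), a in state.items():
            for (j, l), b in state.items():
                energy += a * b * g[i][j] * (-g[k][l])
    return energy


if __name__ == "__main__":
    singlet = {(c, c): 1 / math.sqrt(3) for c in range(3)}
    coloured = {(0, 1): 1.0}             # |A>|B-bar>
    self_energy = 2 * 4 / 3              # +4/3 for each of the two quarks
    print(meson_interaction_energy(singlet) + self_energy)
    print(meson_interaction_energy(coloured) + self_energy)
```

Under these assumptions the singlet interaction energy comes out as −8/3 and the coloured one as +1/3, so the state energies are 0 and +3.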


4.2. Infra-red behaviour 


We have indicated that a $r^{-1}$ potential exists between two quarks in this
theory. We know that this cannot really be correct, since the force between
two quarks is known not to be long range; it is not easy to knock the quarks
apart inside a proton (cf. the ease with which we can knock electrons out of
atoms). This problem cannot be argued away by saying that the charges in 
QCD are stronger than in QED — with enough energy, we should be able to 
pull them apart. The long distance (infra-red) forces have to be modified in 
order to agree with experiment. Those who believe in QCD believe that this 
will happen when the theory is worked out. I think that the central problem 
in QCD is to see if, qualitatively, the forces are so modified. In some models,
e.g. the lattice theory (see [12]), they are, but we are not sure if this is an
artifact of the models or not. It is true that when we go to higher order
perturbations, we find the force increasing with distance (there are logarithmic
corrections) relative to $1/r^2$, but it is unknown if this change is sufficient.


4.3. Dependence of masses on flavours


4.3.1. Isospin dependence


The proton and the Δ are both supposed to be made out of 3 quarks in a
totally antisymmetric colour state.


        Mass (MeV)   Isospin   Spin
P       938          1/2       1/2
Δ       1236         3/2       3/2


If QCD has forces depending only on colour, how could there be a difference
between these two masses depending on the isospin? In addition to the isospin
being different, the spins are also different and we know, for example, that
the forces are spin dependent and different for different spin states in QED; so
there is no reason why this cannot be the case in QCD. The difference in the
P and Δ masses, therefore, may simply be due to the fact that their spins are
different. Then we say: look through the Rosenfeld tables to find particles
with different isospins and masses, but the same spins and for which we expect
the same spatial state. Then QCD could be in trouble. However we cannot find any,
for the following reason.

The wave-function is antisymmetric with respect to interchange of two
quarks (Fermi statistics). This total exchange is equivalent to an exchange of
space, spin, flavour and colour. Since the state is antisymmetric with respect
to colour exchange (colour singlet), the flavour symmetry must equal the
space symmetry × the spin symmetry; hence the flavour symmetry properties
of a state are completely determined by its space and spin symmetry properties.
But colour forces depend on the space and spin configurations and therefore
can apparently depend on the isospin symmetry. In other words, we can
find no way to verify the proposition that the forces are independent of
flavour. This, by the way, should be noted because in the early days of hadron
theory the forces had explicit isospin dependence, e.g. there were interaction
terms like $\bar\psi(\boldsymbol\tau\cdot\boldsymbol\phi)\psi$, for a proton coupled to a pion. In QCD we cannot have
any such directly isospin dependent terms, but it doesn't matter as we have
seen that the masses can depend indirectly on isospin.

A very rough empirical formula exists [13] which summarises the mass
splitting between baryon multiplets. It says that there is a contribution of
−0.53 (GeV)² to the (mass)² for every pair of quarks which are both symmetric
in space and antisymmetric in spin. Mathematically this is

$$\Delta M^2 = -0.53\ (\mathrm{GeV})^2 \sum_{\text{pairs}} \left(\frac{1 - \boldsymbol\sigma_i\cdot\boldsymbol\sigma_j}{4}\right)\left(\frac{1 + P^{\text{space}}_{ij}}{2}\right) \tag{4.18}$$

where $P^{\text{space}}_{ij}$ exchanges the spatial coordinates of quarks $i$ and $j$.


Notice that the masses are lower if the spatial state is symmetric as opposed
to antisymmetric. In an antisymmetric state, it is impossible for the quarks to
be at the same point; but in the symmetric state this is allowed, so that the
qualitative features of (4.18) may be summarized by saying that the correction
force is short range. The spin part says that antiparallel spin states are
lower, which is the right sign for the spin interaction of attracting particles due
to a vector potential.


4.3.2. Dependence on the quark masses 

We know that the K mass does not equal the $\pi$ mass, although the K and
the $\pi$ both have the same spin-parity. How can we explain this? If all the forces
are independent of flavour, then the masses should be exactly the same. It is
therefore a failure of the simplest possible picture that SU(3) is broken. But
we do not need to destroy QCD; we need only complicate it by supposing
that the s quark has a higher mass than the u and d quarks (but keeping
the interaction independent of flavour). No-one knows where this extra mass
comes from; we have left for the people of the future the problem of why the
masses are different. Some people think that the masses of the u and d quarks
are equal and that all the mass differences in isospin multiplets are due to
electrodynamics (e.g. proton–neutron). There are some technical difficulties in
signs and magnitudes in this approach and it would help if we could say that
the d quark is slightly heavier than the u quark for the same intrinsic (unknown)
reason that the s quark is heavier than the u and d quarks. Another flavour
now seems to have been found and maybe there are more; this new quark
(usually called charm) must have a higher mass than the other three. We
mustn't forget that it is strange that we have to put in mass differences which
are of the same order of magnitude as the bound state masses we are trying to
explain. If we go to very high momentum transfers where masses are irrelevant,
the SU(3) (flavour) should become better. I don't know of any direct
demonstration that this is, or is not, the case; it would be nice to think of some
experiment in which this could be tested.


4.4. Zweig rule 


Consider the following groups of mesons 


0⁻           π       η       η′
Mass (GeV)   0.14    0.55    0.96
1⁻           ρ       ω       φ
Mass (GeV)   0.77    0.78    1.02
2⁺           A₂      f       f′
Mass (GeV)   1.31    1.27    1.52


These are the non-strange mesons from the $0^-$, $1^-$ and $2^+$ nonets and one
might expect some similarity between these groups. Now it is known that the
$\phi$ (e.g. in its decay, $K\bar K$ dominates) is an $|s\bar s\rangle$ state. Similarly the $\omega$ ($3\pi$ decay
mode dominates) is $(1/\sqrt2)(|u\bar u\rangle + |d\bar d\rangle)$ and the $\rho$ is $(1/\sqrt2)(|u\bar u\rangle - |d\bar d\rangle)$ and
they are very nearly degenerate. But the $\eta$ and the $\eta'$ do not have this pattern.
Approximately,

$$\eta = \tfrac12\left(|u\bar u\rangle + |d\bar d\rangle - \sqrt2\,|s\bar s\rangle\right) \quad\text{and}\quad \eta' = \tfrac12\left(|u\bar u\rangle + |d\bar d\rangle + \sqrt2\,|s\bar s\rangle\right)$$

agrees with experiment. Note that this combination is very different from the
$\omega$, $\phi$ case. Why? We would think that, in a state which contains $|s\bar s\rangle$, the $s$ and
$\bar s$ could annihilate and then turn back into $s$ and $\bar s$, but also into $u$ and $\bar u$ or $d$
and $\bar d$; therefore an $|s\bar s\rangle$ state should become a mixture of all 3 quark types,
and this is presumably what is happening in the $\eta$, $\eta'$ system. Why does this not
happen in the $\omega$, $\phi$ case (and also does not happen in the $f$, $f'$ case)? This is a
mystery. This mystery is summed up in the Zweig rule, an ad hoc rule which
says that this annihilation process is inhibited.
One possible attitude is as follows.

[Diagram: a quark-antiquark pair annihilates into gluons, which then create a new quark-antiquark pair.]


With a $0^-$ state, we need 2 gluons to connect both sides of this diagram.
At first sight, we would think that with a $1^-$ state, we could connect with 1
gluon since the gluon has quantum numbers $1^-$ (cf. photon); but this would
require a colour singlet gluon and we have no such object; so we need a
minimum of 3 gluons. If we could assume that the gluons were weakly coupled to
the quarks, we might try to argue that the mixing in the $\omega$, $\phi$ case is less than
in the $\eta$, $\eta'$ case, but it's a little hard to get $g^3$ less than $g^2$ unless $g$ (the
coupling constant) itself is very small. Furthermore, if we look at the $2^+$ mesons,
we could make the connection in the diagram with only 2 gluons, so it should
look rather like the $\eta$, $\eta'$ system; but it doesn't (here the suggestion was made that
the higher angular momentum $2^+$ makes it harder for the gluons to get
together to annihilate).

There is also another attitude, which is to suppose that, for some reason,
when momentum transfers and energies become large, the coupling becomes
small; then we can say the mixing is less in the $2^+$ case than in the $0^-$ case
because the energy is higher. In order for this to work, the rate of change of coupling
with mass scale must be large.

This is not a satisfactory situation; it looks suspicious, as if there is some
feature of QCD that we do not understand. If we have a way of calculating
with QCD (e.g. on a lattice) and if we wish to concentrate on something which 
will tell us if the theory is wrong, this seems to me to be an ideal place. 


4.5. Large transverse momentum in hadronic collisions


Consider the process

$$p + p \to \text{hadron (large } p_\perp) + X \tag{4.19}$$

The average $p_\perp$ of a produced hadron at high energies is of order 350 MeV.
In a field theory, it is not easy to understand why there isn't a reasonable
amount of larger $p_\perp$. Let us look specifically at particles produced at very
large $p_\perp$.

In our picture the proton is made of a bunch of quarks, so we might expect
that a large $p_\perp$ particle would be produced by quarks scattering via gluon
exchange.

[Diagram (4.20): proton-proton scattering equals scattering of a quark from each proton by gluon exchange.]

If the ratio $x_\perp = p_\perp/p$ is kept constant, the cross-section at large $p_\perp$ should
go as $p_\perp^{-4}$. (This follows merely by dimensional arguments and this result would
be obtained for any gluon diagram.) Experimentally, it is more like $p_\perp^{-8}$.
This is very disturbing. Is this process operating or not? A possibility is that
it really does occur, but when we put in the correct couplings and allow for
the fact that the coupling constant will fall as $p_\perp$ increases, the process (4.20)
is masked by some other process which falls like $p_\perp^{-8}$. We must then assume
that this other process hasn't yet fallen enough to let us see the mechanism
(4.20). However the cross-section is already pretty small at 400 GeV and still
seems to be falling like $p_\perp^{-8}$; it is therefore up to the people who propose this
explanation to give some energy above which they believe the process (4.20)
will take over, and to explain what mechanism is responsible for the present
trend. (One possible mechanism is described by the constituent interchange
model, which gives a cross-section falling like $p_\perp^{-8}$.) But in any event there is
a challenge to see what process QCD predicts that is so large as to dominate
over all the experimental range so far investigated and which behaves like $p_\perp^{-8}$.
Any theory of this process has to explain much data such as charge ratios,
correlations, etc. Details such as the fact that as $x_\perp$ rises (toward 0.6) the ratio
$\pi^+/\pi^-$ rises to more than 2, have to be explained.


4.6. Partially conserved axial current and vector meson dominance 


Certain hadrons have special properties and it is not clear where these
properties come from. One of these hadrons is the pion, which has the property of
PCAC associated with it; another is the $\rho$ meson which participates in VMD.

Both PCAC and VMD were discovered when it was thought that the particles
were themselves fundamental fields. Why bound states of quarks (as QCD
assumes the $\pi$ and $\rho$ are) should behave as if they were fundamental fields is
a puzzle. I don't know whether this is a serious problem for QCD or not.


5. Spontaneously broken symmetries and the Higgs mechanism 


We will try to show why it is necessary to use a Yang–Mills theory in weak
interactions by considering the difficulties which appear in a more
phenomenological approach. We will discuss some relevant points from weak interaction
theory although we do not propose to give a review of it. Consider the decay
of the $\mu^-$; one interpretation of this decay is the following diagram,

[Diagram (5.1): $\mu^- \to \nu_\mu + W^-$, followed by $W^- \to e^- + \bar\nu_e$.]

where the weak interaction is mediated by a massive charged spin one boson
$W^-$. (The $W^-$ must be massive as the weak interaction is short range.) The
couplings at the two vertices are assumed to be V−A and equal (coupling
constant $f$). The amplitude for this decay is

$$-\frac{f^2}{q^2 - M_W^2} \tag{5.2}$$

where $q$ is the momentum transfer between the $\mu^-$ and the $e^-$. For $q \ll M_W$,
this reduces to

$$\frac{f^2}{M_W^2} \tag{5.3}$$

In this limit the interaction looks like the four point Fermi interaction

[Diagram: a four-fermion vertex joining $\mu$, $\nu_\mu$, $e$ and $\bar\nu_e$ with coupling $G/\sqrt2$.]

$$\text{Amplitude} = G/\sqrt2 \tag{5.4}$$

i.e. comparing (5.3) with (5.4) we get

$$f^2/M_W^2 = G/\sqrt2 \tag{5.5}$$

Let us write a Lagrangian for the $W^-$ and its anti-particle, the $W^+$, with a
mass; we do this by analogy with QED for a massive photon:

$$L = -\tfrac14(\partial_\mu W_\nu - \partial_\nu W_\mu)^2 + \tfrac12 M^2 W_\mu W_\mu + \text{matter terms} \tag{5.6}$$

Varying the action with respect to $W_\mu$ to obtain the equations of motion, we
get

$$\partial_\nu(\partial_\nu W_\mu - \partial_\mu W_\nu) + M^2 W_\mu = S_\mu \tag{5.7}$$

where $S_\mu$ is the current generated by the matter fields in the Lagrangian. $S_\mu$ is
not conserved: one contribution to $S_\mu$ (the $\mu\nu W$ vertex in (5.1)) is

$$\bar\psi(\nu_\mu)\,\gamma_\mu(1 - \gamma_5)\,\psi(\mu) \tag{5.8}$$

This has non-zero divergence. It would have zero divergence if the $\gamma_5$ were
absent and the masses of $\mu^-$ and $\nu_\mu$ were equal, but this non-zero divergence
is all right since the divergence of (5.7) is

$$M^2\,\partial_\mu W_\mu = \partial_\mu S_\mu \tag{5.9}$$

So we see that the presence of the mass term saves us from a possible
inconsistency.


We derive the propagator for the $W_\mu$ as follows. Eq. (5.7) can be re-written
using (5.9) as

$$\partial_\nu\partial_\nu W_\mu + M^2 W_\mu = S_\mu + \frac{1}{M^2}\,\partial_\mu(\partial_\nu S_\nu) \tag{5.10}$$

In momentum space, this is

$$W_\mu = -\frac{1}{k^2 - M^2}\left(S_\mu - \frac{k_\mu k_\nu S_\nu}{M^2}\right) \tag{5.11}$$

Therefore, given the sources $S_\mu$, we can calculate the field $W_\mu$ using the
propagator

$$-\frac{\delta_{\mu\nu} - k_\mu k_\nu/M^2}{k^2 - M^2} \tag{5.12}$$


This tells us how a field $W_\mu$ propagates from one interaction to the next. Using
this, we can construct diagrams for the theory. There are indices on the
propagator because W is a vector and therefore has various polarisation states. We
can see that a free $W_\mu$ has three polarisation states, as required of a vector
particle, and not four as we might guess at first sight: take (5.7) with $S_\mu = 0$; this
describes the propagation of non interacting W's. Substitute a free particle
solution

$$W_\mu = e_\mu\, e^{ik\cdot x} \tag{5.13}$$

where $e_\mu$ is the W polarisation vector. This gives

$$-k_\nu(k_\nu e_\mu - k_\mu e_\nu) + M^2 e_\mu = 0 \tag{5.14}$$

Multiplying by $k_\mu$ we get

$$-k^2(k\cdot e) + k^2(k\cdot e) + M^2(k\cdot e) = 0 \tag{5.15}$$

Hence

$$k\cdot e = 0 \quad (M \ne 0) \tag{5.16}$$

i.e. the polarisation vector $e_\mu$ is orthogonal to $k_\mu$ and so has only three
degrees of freedom. Of course, when the W's are off mass shell, there are four
polarisation states. Similarly, in massless QED, the photon has two polarisation
states when it is on its mass shell, and three when it's off (the extra degree
of freedom represents the freedom to make gauge transformations in QED
which is lost if a mass term is added as in (5.6)).
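
The momentum-space steps above can be checked numerically with a minimal sketch. It assumes the metric $a\cdot b = a_0b_0 - \mathbf a\cdot\mathbf b$ and writes the field equation (5.7) in momentum space as $(M^2 - k^2)W_\mu + k_\mu(k\cdot W) = S_\mu$; the function names and the sample momenta are illustrative, not taken from the lecture.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # a.b = a0*b0 - a1*b1 - a2*b2 - a3*b3


def dot(a, b):
    """Lorentz-invariant product of two four-vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def field_from_source(k, source, mass):
    """W_mu = [S_mu - k_mu (k.S)/M^2] / (M^2 - k^2), the momentum-space solution."""
    k2, ks, m2 = dot(k, k), dot(k, source), mass ** 2
    return [(s - kc * ks / m2) / (m2 - k2) for s, kc in zip(source, k)]


def wave_operator(k, field, mass):
    """Left-hand side of the field equation: (M^2 - k^2) W_mu + k_mu (k.W)."""
    k2, kw, m2 = dot(k, k), dot(k, field), mass ** 2
    return [(m2 - k2) * w + kc * kw for w, kc in zip(field, k)]


if __name__ == "__main__":
    k = [2.0, 0.3, -0.5, 1.1]           # off-shell momentum (arbitrary)
    s = [1.0, -2.0, 0.5, 0.7]           # arbitrary source
    w = field_from_source(k, s, 1.5)
    print(wave_operator(k, w, 1.5))     # reproduces the source s

    # On shell (k.k = M^2 = 1): a polarisation with k.e = 0 is a free solution,
    # while e proportional to k is not.
    k_on = [1.25, 0.0, 0.0, 0.75]
    print(wave_operator(k_on, [0.75, 0.0, 0.0, 1.25], 1.0))
    print(wave_operator(k_on, k_on, 1.0))
```

The last two lines illustrate the counting in the text: of the four possible polarisation vectors, the one along $k_\mu$ fails to satisfy the free equation when $M \ne 0$.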

We attempt to calculate with this massive W theory. Everything works well
as long as we do not have any diagrams with closed loops, i.e. as long as we
only calculate tree diagrams:

[Diagrams (5.17): examples of tree diagrams.]

Another way to say this is that the theory works perfectly at the classical level.
At the quantum level, we run into difficulties with closed loops:

[Diagram (5.18): a diagram containing a closed loop.]

The reason is that we must integrate over the momentum $k$ running round the
closed loop. At large $k$, the propagator (5.12) $\sim k_\mu k_\nu/k^2$, so the propagator does
not assist the convergence of the integral; the diagram diverges. Unfortunately
we cannot remove these divergences as in QED, because as we go to more loops
the divergences become more and more severe (i.e. the theory is not renormalizable),
e.g.

[Diagrams (5.19): two diagrams with more loops, each diverging like a successively higher power of $k$.]


The theory is therefore a disaster quantum mechanically, and in order to
construct a workable renormalisable theory of weak interactions we go to a
Yang–Mills theory with broken symmetry. To do this, we first discuss symmetry
breaking in simpler cases. I shall take a rather physical view of symmetry
breaking and leave the more abstract mathematics to other people. I do this because
I think that it is useful to have more than one way of looking at a problem;
the way I shall present it may be unfamiliar, but people may benefit from this
physical approach.

We will start with a simple model and gradually increase its complexity.
First take a real scalar field only; the Lagrangian is

$$L = \tfrac12(\partial_\mu\phi)^2 - \frac{\mu^2}{2}\phi^2 - \frac{\lambda}{4}\phi^4 \tag{5.20}$$

where we have included a $\phi^4$ interaction term. The theory has the discrete
symmetry $\phi \to -\phi$. Normally the vacuum expectation value of $\phi$, $\langle\phi\rangle$, is zero; however
this is not always the case as we must choose $\langle\phi\rangle$ to minimise the energy.
In fig. 5.1, we have plotted the classical potential

$$V(\phi) = \frac{\mu^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4 \tag{5.21}$$

for the two cases (a) $\mu^2 > 0$ and (b) $\mu^2 < 0$.

[Fig. 5.1: the classical potential $V(\phi)$ for (a) $\mu^2 > 0$ and (b) $\mu^2 < 0$.]

We look for the state of minimum energy (the vacuum) which we get by minimising $V(\phi)$:

$$\partial V(\phi)/\partial\phi = 0 \tag{5.22}$$

i.e. $\phi = 0$ or $\phi = \pm\sqrt{-\mu^2/\lambda}$.

From fig. 5.1, we see that

for $\mu^2 > 0$, $\langle\phi\rangle = 0$;
for $\mu^2 < 0$, $\langle\phi\rangle = \pm\sqrt{-\mu^2/\lambda} \ne 0$.   (5.23)

Take $\langle\phi\rangle$ to be $+v$, although the choice of sign is arbitrary; however notice
that once we have chosen $\langle\phi\rangle = +v$, we have broken the symmetry $\phi \to -\phi$.
In case (b), we now perturb about $v$:


$$\phi(x) = v + \eta(x) \tag{5.24}$$

Substituting this in (5.20) we get

$$L = \tfrac12(\partial_\mu\eta)^2 - \lambda v^2\eta^2 - \lambda v\eta^3 - \frac{\lambda}{4}\eta^4 \tag{5.25}$$

Suppose $\eta$ is small*; to a first approximation only the first two terms matter;
these represent a scalar meson with a real mass of $\sqrt{2\lambda v^2}$. (Note that in (5.20)
the $\phi$ meson had an imaginary mass in the case $\mu^2 < 0$.) The other two $\eta$ terms
in the Lagrangian represent $\eta$ self couplings, viz.

[Diagrams (5.26): a three-point vertex with coupling $-\lambda v\eta^3$ and a four-point vertex with coupling $-\frac{\lambda}{4}\eta^4$.]

These couplings would appear when we do perturbation theory with (5.25). 
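
The coefficients in the shifted Lagrangian can be checked by expanding the potential about the minimum. The sketch below assumes the quartic potential $V(\phi) = \frac{\mu^2}{2}\phi^2 + \frac{\lambda}{4}\phi^4$ with $\mu^2 = -\lambda v^2$, and returns the coefficient of each power of $\eta$ in $V(v+\eta)$; the function name and the sample values of $\lambda$ and $v$ are illustrative.

```python
from fractions import Fraction
from math import comb


def shifted_potential_coefficients(lam, v):
    """Coefficients c[k] of eta**k in V(v + eta), where
    V(phi) = (mu2/2) phi**2 + (lam/4) phi**4 and mu2 = -lam * v**2."""
    lam, v = Fraction(lam), Fraction(v)
    mu2 = -lam * v ** 2
    coeffs = []
    for k in range(5):
        # Binomial expansion of (v + eta)**2 and (v + eta)**4.
        quad = (mu2 / 2) * comb(2, k) * v ** (2 - k) if k <= 2 else 0
        quart = (lam / 4) * comb(4, k) * v ** (4 - k)
        coeffs.append(quad + quart)
    return coeffs


if __name__ == "__main__":
    lam, v = Fraction(1, 2), Fraction(3)
    c = shifted_potential_coefficients(lam, v)
    print("linear term :", c[1])          # vanishes at the minimum
    print("eta^2 term  :", c[2], "=", lam * v ** 2)
    print("eta^3 term  :", c[3], "=", lam * v)
    print("eta^4 term  :", c[4], "=", lam / 4)
    print("mass squared:", 2 * c[2])      # 2 * lam * v**2
```

The vanishing linear term confirms that $v$ is a stationary point, and twice the quadratic coefficient gives the squared mass $2\lambda v^2$ of the $\eta$ meson.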

Why does the whole world have $\langle\phi\rangle = +v$? Why doesn't it have $\langle\phi\rangle = -v$
somewhere? Suppose that God created the universe in the state $\langle\phi\rangle = 0$ and
then the universe discovered that it could lower its energy; where it puts its
energy is none of my business, but it gets rid of it (gives it back to God or
something); then under some disturbance the vacuum tries to fall down with
some parts going to $+v$ and other parts to $-v$. But what happens in between?
It just changes suddenly, but not too suddenly, because to get low energy the
$(\partial_\mu\phi)^2$ term must not be too large. This extra $(\partial_\mu\phi)^2$ energy, plus the energy
due to the fact that $\phi$ is above its minimum potential energy, is stored in the
boundary between the two regions, so that if we have some region in which
$\langle\phi\rangle = -v$ surrounded by a region where $\langle\phi\rangle = +v$, we can lower the energy by
shrinking the region where $\langle\phi\rangle = -v$ to zero, decreasing the surface area and
hence the energy, so that the whole universe is in the same state.

Secondly, we look at a complex scalar field, i.e. $\phi$ is a two component field
with the components Re $\phi$ and Im $\phi$:

$$L = \tfrac12(\partial_\mu\phi)^*(\partial_\mu\phi) - \frac{\mu^2}{2}\phi^*\phi - \frac{\lambda}{4}(\phi^*\phi)^2 \tag{5.27}$$

with $\mu^2 < 0$.
We minimise the potential

$$V(\phi,\phi^*) = \frac{\mu^2}{2}\phi^*\phi + \frac{\lambda}{4}(\phi^*\phi)^2,$$

$$\frac{\partial V}{\partial\phi^*} = \frac{\mu^2}{2}\phi + \frac{\lambda}{2}\phi\,(\phi^*\phi) = 0 \tag{5.28}$$

i.e. $\phi = 0$ (a local maximum) or $|\phi|^2 = -\mu^2/\lambda = v^2$.


* Large perturbations about the minimum will not interest us. They can only occur in
systems at very high energy density, e.g. in very dense material such as neutron stars.
Otherwise there are only minor effects due to barrier penetration.

[Fig. 5.2: the potential $V$ over the complex $\phi$ plane; the minimum is the circle $|\phi| = v$.]


The minimum fixes $|\phi|^2$ to be $v^2$, i.e. the minimum is a circle in the $\phi$
plane.

Since $\theta$ (the phase angle of the complex field) is undetermined, we can fix
the vacuum to be at any $\theta$ we wish. We will take $\theta = 0$. In order to make
perturbations about this vacuum we now make the substitution

$$\phi = e^{i\xi/v}(v + \eta) \tag{5.29}$$

where $\xi$ represents perturbations in the $\theta$ direction and $\eta$ represents
perturbations in the radial direction. For small $\xi$, $\eta$ this reduces to

$$\phi = v + \eta + i\xi \tag{5.30}$$

which is equivalent to doing perturbations about Re $\phi = v$ and Im $\phi = 0$.
Substituting this into (5.27) we get

$$L = \tfrac12\left[(\partial_\mu\xi)^2 + (\partial_\mu\eta)^2\right] - \lambda v^2\eta^2 + \text{cubic and higher order terms} \tag{5.31}$$

This tells us that we have a particle $\eta$ with mass $\sqrt{2\lambda v^2}$. This mass is a
consequence of trying to displace the $\eta$ against the restoring forces of the potential
(the potential is like fig. 5.1(b) in the $\eta$ direction). The $\xi$ particle has no mass;
it is known as a Goldstone boson; it corresponds to displacements around
the minimum surface (i.e. around the circle in fig. 5.2) where there is no
restoring force, since the potential is flat.
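
The statement that the potential curves upward in the radial direction but is flat along the circle can be illustrated numerically. The sketch below assumes the potential $V = \frac{\mu^2}{2}|\phi|^2 + \frac{\lambda}{4}|\phi|^4$ with $\mu^2 = -\lambda v^2$ and estimates the two second derivatives at $\phi = v$ by central differences; the function names, the step size `h` and the sample parameters are illustrative choices.

```python
def potential(re_phi, im_phi, lam, v):
    """V = (mu2/2)|phi|^2 + (lam/4)|phi|^4 with mu2 = -lam * v**2."""
    mu2 = -lam * v ** 2
    r2 = re_phi ** 2 + im_phi ** 2
    return 0.5 * mu2 * r2 + 0.25 * lam * r2 ** 2


def curvatures_at_minimum(lam, v, h=1e-3):
    """Central-difference second derivatives of V at phi = v,
    along eta (radial, Re phi) and along xi (angular, Im phi)."""
    v0 = potential(v, 0.0, lam, v)
    radial = (potential(v + h, 0.0, lam, v)
              + potential(v - h, 0.0, lam, v) - 2 * v0) / h ** 2
    angular = (potential(v, h, lam, v)
               + potential(v, -h, lam, v) - 2 * v0) / h ** 2
    return radial, angular


if __name__ == "__main__":
    lam, v = 0.5, 2.0
    radial, angular = curvatures_at_minimum(lam, v)
    print("radial curvature :", radial, "(compare 2*lam*v**2 =", 2 * lam * v ** 2, ")")
    print("angular curvature:", angular, "(flat direction)")
```

The radial curvature plays the role of the squared mass of the $\eta$, while the vanishing angular curvature corresponds to the massless $\xi$.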

We can ask in this case what happens if $\langle\phi\rangle$ takes different phase values in
different places A and B. In between, we would like to keep the gradient of
$\phi$ as small as possible in order to keep the energy as small as possible; if
A and B are far apart we can manage this because $\langle\phi\rangle$ can vary continuously
as we go from A to B. As A and B get infinitely far apart, the gradients tend
to zero and so the energy stored is zero. We can consider this changing of $\phi$ in
the vacuum as a long wavelength excitation; as the wavelength tends to
infinity (A and B tend to infinite separation) the energy tends to zero, so this
excitation in the vacuum will correspond to a zero mass particle.

The need for such a massless particle appears to be general, and not
dependent on our specific example. If the original Lagrangian has symmetry, this
symmetry may be broken by the solution of minimum energy in the real world
(the physical vacuum). But if the symmetry is represented by a continuous
variable (like a rotation of phase) the "direction" of breaking can be slightly
different in different places and waves due to perturbations in this direction
must always be possible. In general, little energy is associated with long wave
disturbances and we have (after quantising these perturbation waves) particles
of necessarily zero mass, called Goldstone bosons. There are, however, cases
where the variation of direction generates a current or charge density of some
kind with which there are long range forces associated (i.e. $r^{-1}$ potentials).
Then the large contribution of large volumes to the energy of interaction in
the long wavelength waves increases the energy $\omega$ of the long waves to a
finite value; the quantised excitations are now of finite energy as the
wavenumber $k \to 0$ and hence of finite mass. (This "mass" generation by long range
force is familiar in solid state physics where density variations of neutral
molecules give rise to phonons with a dispersion $\omega = c_s k$, but compressional
oscillations of charges like electrons give rise to plasma waves with a dispersion
$\omega = \sqrt{\omega_p^2 + k^2}$ ($\omega_p$ = a constant) due to the long range Coulomb interaction
between the charge densities.)

Since later we shall want to use symmetry breaking to explain how mass
terms arise and since zero mass Goldstone bosons are not found, we shall have
to add long range interactions (of zero mass Yang–Mills fields) to give them
mass by this mechanism, called the Higgs–Kibble mechanism [4,5,14]. If there
is more than one way to vary $\phi$ while keeping the energy a minimum, then there
will be more than one Goldstone boson. The following example illustrates
this. In SU(2), take an isovector $\phi$; the minimum of the potential $V(\phi^*\phi)$ will
occur for some non zero magnitude of $\phi$, but its direction is undetermined and
we can take $\langle\phi\rangle = (0, 0, v)^T$. In order to look at small perturbations about the
minimum we consider (by analogy with the previous example)

$$\phi = \exp\left(\frac{i\xi_1 L_1}{v}\right)\exp\left(\frac{i\xi_2 L_2}{v}\right)\begin{pmatrix}0\\0\\v+\eta\end{pmatrix} \tag{5.32}$$

where $L_1$ and $L_2$ are two of the three SU(2) generators. Now the minimum is
on the surface of a sphere (cf. the circle in fig. 5.2); there are now two
independent directions in which the potential is flat, viz. the $\xi_1$ and $\xi_2$ directions. The $\eta$ meson
has a mass corresponding to radial displacements off the sphere, in which
direction $V(\phi^*\phi)$ increases.

In general, if the Lagrangian is invariant under a group $G$ but the vacuum
has a lower symmetry, i.e. it is invariant under a group $G'$ where $G' \subset G$, then
there will be massless bosons whose number is $n = \dim G - \dim G'$. In the SU(2)
example above, the vacuum is invariant under U(1) so that the number of
massless mesons is 3 − 1 = 2.

We give another example of this phenomenon which has more relevance to
the physical world. Suppose the u and d quarks have zero mass. Then the
Lagrangian

$$L = \bar\psi_u\, i\gamma_\mu\partial_\mu\,\psi_u + \bar\psi_d\, i\gamma_\mu\partial_\mu\,\psi_d + \text{chirally invariant interaction terms} \tag{5.33}$$

is invariant under the transformation

$$\psi \to e^{ib\gamma_5}\psi, \qquad \bar\psi \to \bar\psi\, e^{ib\gamma_5} \tag{5.34}$$

which is called a "chiral transformation", for arbitrary constant $b$. We can see
this easily, as $\gamma_5$ anticommutes with the other $\gamma$ matrices. Note however that
a mass term would not be invariant,

$$\bar\psi m\psi \to \bar\psi\, m\, e^{2ib\gamma_5}\,\psi \tag{5.35}$$

but a coupling to a vector potential via $\gamma_\mu$ would be invariant. Thus unless the
symmetry is broken all the solutions of the Lagrangian must be chirally
invariant and so, for example, the proton would have to be massless or parity
doubled (because the chiral transformation changes the parity of a wavefunction).
Experimentally the physical world does not have this symmetry. Therefore
this symmetry must be broken and a Goldstone boson must exist if (5.33)
is valid. No such massless particle exists but it would have to be a pseudoscalar
meson, and therefore there is a temptation to associate it with the pion.
Unfortunately the pion is not massless, but it is very light and people have
therefore concluded the following. It might be that the u and d quarks have a small
mass and therefore the chiral symmetry is only approximate; this would then
possibly give a small mass to the pion. If this is the case, we can deduce a
number of things about the pion couplings; these relations, such as the
Goldberger–Treiman relation [e.g. 15], known as PCAC, are approximately verified
experimentally. (We could expect another low mass boson coupled to $u\bar u + d\bar d$,
i.e. a pseudoscalar but with isospin equal to zero. This is not found; people
feel that the $\eta'$ is too heavy, and how this difficulty can be resolved has always
been a fascinating problem (the problem of the ninth pseudoscalar boson in
SU(3)). I am sorry to find that the length of this course is too short to permit
me to discuss it along with many other interesting things I had hoped to discuss.)

Let us return to vector fields. As we have so far discussed them, only the
massless ones seem to make sense quantum mechanically. These have long
range forces and we might therefore expect that the presence of these fields
would eliminate the Goldstone boson when we break the symmetry. This is
indeed the case as we will now show. Take a Lagrangian describing charged
scalar particles interacting with photons

$$L = \tfrac12(\partial_\mu + iA_\mu)\phi^*(\partial_\mu - iA_\mu)\phi - \frac{\mu^2}{2}\phi^*\phi - \frac{\lambda}{4}(\phi^*\phi)^2 - \tfrac14 F_{\mu\nu}F_{\mu\nu} \tag{5.36}$$

with $\mu^2 < 0$.

This is just the same as the Lagrangian (5.27) with electromagnetism added.
We make the same substitution as before, viz. $\phi = e^{i\xi/v}(v + \eta)$. Again we take
the minimum of the potential to be $v$, giving

$$L = \tfrac12(\partial_\mu\xi)^2 + \tfrac12(\partial_\mu\eta)^2 - \lambda v^2\eta^2 + \tfrac12 v^2A_\mu^2 - vA_\mu\partial_\mu\xi - \tfrac14F_{\mu\nu}F_{\mu\nu} \tag{5.37}$$

It looks as if the $A_\mu$ field has acquired a mass (see the term $\tfrac12 v^2A^2$) but there
is a peculiar term where $A_\mu$ is coupled to $\partial_\mu\xi$; the coupling is really
$\tfrac12 v^2\left(A_\mu - (\partial_\mu\xi/v)\right)^2$, so we can remove this term by doing a gauge transformation

$$\phi' = e^{-i\xi/v}\phi, \qquad A'_\mu = A_\mu - (\partial_\mu\xi/v) \tag{5.38}$$

The Lagrangian then becomes

$$L = \tfrac12(\partial_\mu\eta)^2 - \lambda v^2\eta^2 - \tfrac14F'_{\mu\nu}F'_{\mu\nu} + \tfrac12 v^2A'^2_\mu - \lambda v\eta^3 - \frac{\lambda}{4}\eta^4 + \tfrac12A'^2_\mu\eta^2 + vA'^2_\mu\eta \tag{5.39}$$

($F_{\mu\nu}$ is invariant under this gauge transformation). In this Lagrangian we again
have a massive field $A'_\mu$ but now the field $\xi$ has disappeared altogether and
all the interactions are cubic or higher order. This phenomenon is called the
Higgs mechanism. (There is no term like $A_\mu\partial_\mu\xi$.) These interactions may be
represented by the following diagrams

[Diagrams (5.40): the cubic and quartic vertices of (5.39), in which the wavy line is an $A'_\mu$ propagator.]

One could say: haven't we lost a degree of freedom ($\xi$) between Lagrangians
(5.36) and (5.39)? No, because the theory is no longer gauge invariant
electrodynamics: the free vector particle $A'_\mu$ now has three polarisation states since
it has a mass, whereas the massless $A_\mu$ only had two polarisation states. There
are four dynamical degrees of freedom in each case. We lost the gauge
invariance when we made the explicit gauge transformation to eliminate $\xi$, and a
gauge transformation on (5.39) will not leave it invariant: in particular, it will
bring back a term of the form $A_\mu\partial_\mu\xi$ where $\xi$ is the gauge parameter.

We will now give a physical example of this phenomenon. The example ts 
superconductivity and it is discussed in detail in [16]; here we shall merely 
give an outline. Superconductivity has the same properties as described above*™ 
except that it is non-relativistic. In the case of a metal at low temperatures, the 
electrons form (Cooper) pairs of opposite spin in such a way that these pairs 
act as bosons. Let y be the wave function for one of these bosons: it satisfies 
the Schroedinger equation. We consider the case of y interacting with an elec- 
tromagnetic field 


s(iV eA y= Ey, (5.41) 


where £ is a constant and e is the charge of a pair of electrons. Since the y is 
a boson, many pairs of electrons can be in the same state. The electromagnetic 
current is 


i= po lIGY eA wl*y + (v"GY— eI]. S97) 


Suppose that in the absence of A there is no current but that all particles are 
in the same state because of the Bose condensation. This / represents not only 
the probability current of one particle, but when multiplied by e and N/2 (the 
number of particles) represents the physical electric current of the bosons. 
When very many bosons are in the same state, the wave function acquires a 
real physical significance, just as the photon wave function becomes the physi- 
cally real (gauge transformations excepted) A (x, £) of classical mechanics and 
electromagnetism (Maxwell theory) when there are sufficient numbers of 
photons in the same state to make a “real” light wave. Turn on a very small 
field A. To a first approximation, in many cases (those which yield supercon- 
ductivity) the wavefunction is unchanged because so many interacting bosons 
are in the same state; so that the current we get is 


* The Higgs Lagrangian (5.36) in the static case is identical to the Ginzburg--Landau tree 
energy in the theory of type II superconductors [17]. 
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p= Ay ylm. (5.43) 
But y* yp is simply the density of bosons; call this V/2 (N is the density of the 
electrons). The current is 


; N e2A 
Jsuper 5 ~ Gy? (5.44) 
ie. the current in a superconductor is proportional to the vector potential. 
The Maxwell equation 


VeA=#, (5.45) 
becomes 

V7 Ami sscevual ti capereonauctine 4 (5.46) 
ie. 

(v? ~A)A = Jexternal > (5.47) 


where A = Ne2/2m and is called the London constant. This is the equation 
which describes the behaviour of a vector particle with a mass. The theory has 
spontaneously acquired a mass exactly as in the Higgs case. Notice that we have 
also lost the gauge invariance as we did in the Higgs case; we have chosen the 
gauge where the equations describing the physical properties of the theory are 
in simplest form. 
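
As a brief illustration of what the mass term does, consider a sketch that is not part of the lecture: assume a superconductor filling the half-space $x > 0$, no external current inside it, and a vector potential that depends only on the depth $x$. The source-free form of the massive equation then has a decaying exponential as its bounded solution:

```latex
% Sketch under the stated assumptions: A depends only on the depth x,
% and there is no external current inside the superconductor.
\frac{d^2 \mathbf{A}}{dx^2} = \Lambda\, \mathbf{A}
\quad\Longrightarrow\quad
\mathbf{A}(x) = \mathbf{A}(0)\, e^{-\sqrt{\Lambda}\, x},
\qquad \Lambda = \frac{N e^2}{2m}.
```

The length $1/\sqrt{\Lambda}$ is the distance over which the field dies away, which is how the "mass" of the vector field shows up in a superconductor.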

Recall that when we considered the geometrical significance of gauge in- 
variance in sect. 3 we saw how the gauge information could be carried from 
place to place by a particle. In this problem we have merely chosen the gauge 
which is carried by the electrons and it is clear that this is the natural gauge 
for this problem; this is why the equations appear simpler in this gauge. 

We shall now look at spontaneous symmetry breakdown in Yang-Mills 
theories. We look at two cases 

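(i) Isospinor scalar particles. We construct the same $V(\phi)$ as we had earlier

$$V(\phi) = \frac{\mu^2}{2}\,\phi_j^*\phi_j + \frac{\lambda}{4}(\phi_j^*\phi_j)^2, \qquad (5.48)$$

where $j = 1, 2$. Let us assume that at some point in space

$$\phi = v\begin{pmatrix}1\\0\end{pmatrix}. \qquad (5.49)$$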


If $\phi$ is pointing in the 3 direction of the three-dimensional space of SU(2),
there will no longer be gauge independence in the 1 and 2 directions since if
we rotate about the 1 or 2 directions $\phi$ changes, so we expect that two
components of the $A_\mu$ field pick up mass. But what about the 3 direction? When
we rotate about the 3 direction, a spinor gets multiplied by a phase, i.e.

$$\begin{pmatrix}1\\0\end{pmatrix} \to e^{i\theta/2}\begin{pmatrix}1\\0\end{pmatrix} \qquad (5.50)$$

for a rotation through $\theta$. So we do not have any gauge freedom left and
therefore all three components of $A_\mu$ pick up a mass.
Mathematically 


1 : 
L= gga Fa Ene + 3(d, tit: A)o'(, —it- A)o— Vid). (5.51) 
Put ¢= ee v and look at the mass term for the A’s. This is 
2 
£y21 One Ay(e-a)(t) = Eureat rady, (5.52) 


So we see that all three components of A have acquired the same mass. 
(ii) Isovector scalar particles

$$\phi = \begin{pmatrix}\phi_1\\ \phi_2\\ \phi_3\end{pmatrix}.$$

The potential has the same form as (5.48) with $j = 1, 2, 3$. Let the vacuum
expectation value of $\phi$,

$$\langle\phi\rangle = \begin{pmatrix}0\\0\\1\end{pmatrix} v,$$

define the 3 direction in isospace. Carrying this
$\phi$ particle around tells us where the 3 direction is everywhere. Hence, we lose
the freedom to rotate the $\phi$ field about the 1 and 2 directions and therefore
$A^1$ and $A^2$ will become massive; we have broken the gauge invariance in the
1 and 2 directions. But if we rotate by $\theta$ about the 3 direction the 1
component is multiplied by $e^{i\theta}$, the 2 component is multiplied by $e^{-i\theta}$, and the 3
component is left unchanged, i.e. we still have a freedom of gauge rotation
about the 3 direction. $A^3$ will still have zero mass.

Mathematically, 

1 


2o FEE + 1G, +4 OT FQ, +4, 08] ~V(6) (5.53) 


Substituting $\phi = \begin{pmatrix}0\\0\\1\end{pmatrix} v$ we get the mass term for $A$:

$$\tfrac{1}{2}v^2\left[(A^1_\mu)^2 + (A^2_\mu)^2\right], \qquad (5.54)$$

i.e. $A^1$ and $A^2$ have the same mass and $A^3$ is massless, as expected. One might
ask: why do we use scalar particles to break the symmetry? They are of
course the simplest to write down. But this "superconductivity" effect can be
generated in many ways (as indeed in real electrodynamics where pairs of
fermions do it), and it is possible that the symmetry breaking, if it occurs in, say,
weak interactions or other places in physics, may have a more complex
mechanism than the Higgs scalar method. But today we have a severe restriction on
the meaningful theories we can write down, i.e. that they are relativistic,
quantum mechanical, and renormalisable. If all these restrictions are imposed it
looks as if only the Higgs method can be formulated at present.
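
The counting of massive components in cases (i) and (ii) can be checked numerically. The following is a minimal sketch, not part of the lecture: it assumes the vector field couples to the scalar through the Pauli matrices in the isospinor case and through the generators $(T_a)_{bc} = -i\epsilon_{abc}$ in the isovector case (the matrix form of the cross product, up to a factor of $i$), and it sets $v = 1$. The helper name `mass_matrix` is hypothetical.

```python
import numpy as np

# Pauli matrices: isospinor generators as they enter the coupling tau . A
TAU = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Isovector generators (T_a)_{bc} = -i eps_{abc}; A^a T_a phi is proportional
# to the cross product A x phi used in the text.
EPS = np.zeros((3, 3, 3))
EPS[0, 1, 2] = EPS[1, 2, 0] = EPS[2, 0, 1] = 1.0
EPS[0, 2, 1] = EPS[2, 1, 0] = EPS[1, 0, 2] = -1.0
ADJ = [-1j * EPS[a] for a in range(3)]


def mass_matrix(generators, vev):
    """Return M_ab = Re[(T_a vev)^dagger (T_b vev)].

    A^a A^b M_ab is the term quadratic in A produced by |T.A vev|^2,
    so the eigenvalues of M are the (unnormalised) squared masses.
    """
    vev = np.asarray(vev, dtype=complex)
    rotated = [g @ vev for g in generators]
    return np.array([[np.vdot(x, y).real for y in rotated] for x in rotated])


isospinor = mass_matrix(TAU, [1, 0])      # case (i): phi = v (1, 0)
isovector = mass_matrix(ADJ, [0, 0, 1])   # case (ii): phi = v (0, 0, 1)

print(np.round(isospinor, 12))  # equal entries for all three components
print(np.round(isovector, 12))  # A^1 and A^2 massive, A^3 massless
```

The isospinor matrix is a multiple of the identity, while the isovector matrix has a zero entry for the 3 direction, in line with the argument that a rotation about the 3 axis leaves the isovector vacuum unchanged.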


6. Quantisation 
6.1. Philosophy


Before we commence a detailed study of the quantisation of Yang-Mills
fields, we shall describe the various approaches to quantising a classical theory.
Fig. 6.1 illustrates a general schema for quantising particle mechanics.

We begin with a Lagrangian from which we can deduce a classical
Hamiltonian, written in terms of $q(t)$ and its conjugate momentum $p(t)$, defined by
$p(t) = \partial L/\partial\dot{q}(t)$. There are two routes by which we can quantise the theory.
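
For a concrete instance of the step from Lagrangian to Hamiltonian, consider a single particle of mass $m$ in a potential $V(q)$. This system is an illustrative assumption, not one treated in the text:

```latex
% Illustrative example (assumed system): one particle in a potential V(q).
L = \tfrac{1}{2} m \dot{q}^2 - V(q), \qquad
p = \frac{\partial L}{\partial \dot{q}} = m\dot{q}, \qquad
H = p\,\dot{q} - L = \frac{p^2}{2m} + V(q).
```

Here $H$ contains no products of $p$ and $q$, so no ordering ambiguity arises when they become operators; a term such as $p^2 q^2$ would raise the operator-ordering question.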


[Fig. 6.1. Schema for quantising particle mechanics. Boxes: Lagrangian/action, q(t); Hamiltonian, q(t), p(t) (classical); path integral (phase space), q(t), p(t); path integral (space), q(t); Heisenberg operators q(t), p(t); Schroedinger equation (quantum); phenomena.]

From the Hamiltonian we can define operators associated with $p(t)$ and $q(t)$
and hence obtain either the Heisenberg or Schroedinger pictures; from these
we can deduce results which predict the behaviour of phenomena in the
physical world. There may be a problem with operator ordering when we go from
the classical to the quantum theory. The alternative approach is to start with
the Lagrangian and introduce a path integral which associates a certain
amplitude with each trajectory in space [18]; this enables us to proceed directly to
calculate the consequences of the quantum theory. (There exists an alternative
path integral method, not much used these days, devised by DeWitt-Morette
[19] and Garrod [20] in which one can construct a path integral in phase space
directly from the Hamiltonian. This resolves many questions of the order
of operators in the Hamiltonian. From it the ordinary path integral is easily
obtained.) All methods lead to the same consequence physically, but each
method has its advantages and disadvantages and we choose whichever is the
more convenient for the problem at hand, e.g. spin ½ is very awkward to handle
in the path integral approach.

Consider now the situation when we try to quantise a field theory (fig. 6.2). 


[Fig. 6.2. Schema for quantising a field theory. Boxes: Lagrangian/action, φ(x,t); Hamiltonian, φ(x,t), π(x,t) (classical); path integral (phase space), φ(x,t), π(x,t); problem of operator ordering; path integral (space-time), φ(x,t); Heisenberg operators φ(x,t), π(x,t); Schroedinger equation (quantum); perturbation theory; diagrams and rules; phenomena.]

The classical variables $q(t)$ and $p(t)$ are replaced by $\phi(x, t)$ and its
conjugate momentum $\pi(x, t)$ defined by $\pi(x, t) = \partial L/\partial\dot\phi(x, t)$. The situation is
similar to that in quantising a particle theory and historically people like
Heisenberg and Dirac [e.g. 21] followed the Hamiltonian approach. This leads to
$\phi(x, t)$ and $\pi(x, t)$ being operators and again we have a problem with operator
ordering in the more complicated field theories. There is also a Schroedinger
approach, which people don't use much these days, in which the vacuum has a
wave function in terms of the co-ordinates, i.e. an explicit functional of the
field function $\phi(x)$. Alternatively we can proceed by the path integral method;
the path is now a function $\phi(x, t)$. Because of difficulties in calculating with
the theory we usually proceed by perturbative methods. To do this we
normally use diagrams and rules, all of which can easily be deduced from the path
integral formulation. We can also get these rules from a Heisenberg approach,
but it is more difficult. Nowadays we know of an obvious and simple-minded
short cut to get straight from the Lagrangian to the diagram rules. Some
people who are not sufficiently acquainted with the theory think that the rules
are all there is, and then say that the theory is only defined by a perturbation
expansion. It is true that we can only calculate things by perturbation theory,
but this may be only a limitation of the era and in any case there are certain
things that we can deduce from the Lagrangian without using perturbation
theory. (For non-relativistic field theories, such as those that arise in solid state
physics, we are not at all limited to perturbation theory, and many methods
and solutions for large or intermediate couplings are known.)

In a field theory there is a problem of renormalisability, because when we
calculate diagrams with closed loops we get infinite answers. When we go
through the Heisenberg approach, the theory is not manifestly Lorentz
invariant; but in the path integral approach it is, so for this reason the latter
approach is to be preferred when we attempt to renormalise the theory. The
difficulty is to keep the renormalisation process Lorentz invariant when the form
of the equations is not manifestly invariant. There is a difficulty in the path
integral approach, however, and that is concerned with the inclusion of fermions.
The path integral method doesn't work in this case; but when people went
through the operator approach they found that there were only differences in
signs when fermions were introduced, and so minor were these differences
that they forced the path integral formalism to work by introducing Grassmann
algebras.
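
The sign difference can be seen in the simplest Gaussian integrals. The following is a standard illustration rather than something taken from the text; it assumes $a > 0$ and the usual Berezin rules $\int d\eta\,\eta = 1$, $\int d\eta\,1 = 0$ for the anticommuting variables $\eta$, $\bar\eta$:

```latex
% Standard illustration (assumed conventions: a > 0, Berezin integration).
\int \frac{d^2 z}{\pi}\, e^{-a\,\bar{z} z} = \frac{1}{a}
\quad \text{(commuting } z\text{)},
\qquad
\int d\bar{\eta}\, d\eta\; e^{-a\,\bar{\eta}\eta} = a
\quad \text{(anticommuting } \eta\text{)}.
```

The factor $a$ moves from the denominator to the numerator, which is the origin of the extra minus sign carried by each closed fermion loop.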

In attempting to quantise Yang-Mills theory, Schwinger [22], after a lot
of hard work, found the Hamiltonian from the Lagrangian, but an attempt to
proceed with the Hamiltonian approach ran into serious difficulties with the
ordering of operators and progress in this direction ceased. I took the short cut
(in fig. 6.2) [23] and found that there are certain complications in the diagrams
at the one loop level; but I got round these by introducing a contribution from
a fictitious particle. The correct rules for this particle were first worked out by
DeWitt [24] and subsequently understood in a more general way by Fadde'ev
and Popov [26]. However I could only do it for one closed loop: if there were
two or more closed loops I didn't know what to do; I sort of half understood
it. Fadde'ev and Popov straightened this problem out by going via the path
integral method and discovered that to all orders we have to add a closed loop
of a scalar particle with Fermi statistics for every closed loop with a Yang-Mills
vector meson; they also went via the Morette path integral approach using the
Hamiltonian derived by Schwinger. The problem of renormalisability was solved
by 't Hooft [25] using his method of dimensional regularisation.

Yang-Mills theory is often presented as being complicated, but now that
all this work is done and proofs proved, it is really not much more complicated
than QED apart from the more complex algebra which is involved, and a bit
more care is needed with the gauge invariance. We just add the contribution of
the ghost and regulate by the dimensional regularisation scheme.

It would be sensible now to give the correct and complete theory, say as 
outlined in Abers and Lee [10] and many students might prefer this. But to 
get a clear feeling for the need for the Fadde’ev—Popov ghost, we can contrast 
the theory with QED by asking what happens if we just plough along and make 
diagram rules by direct analogy with QED. This is a sort of “damn the torpe- 
does, full speed ahead” approach. We will discover that we are hit by a torpedo, 
but let’s try it anyhow. 

We will attempt to calculate with the theory by first deriving the diagram
rules directly from the Lagrangian.


6.2. Derivation of the rules for diagrams 


We shall assume that the reader is familiar with the methods of deriving the
rules from the Lagrangian and so we will merely give an outline of the
derivation as we go along. Consider the following model Lagrangian for spin ½
isospinors $\psi$ and spin 0 isovectors $\phi$ interacting with Yang-Mills fields $A_\mu$:


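$$L = -\tfrac14 E_{\mu\nu}\cdot E_{\mu\nu} + \bar\psi\, i\gamma_\mu\partial_\mu\psi + g\bar\psi\gamma_\mu A_\mu\cdot\tau\,\psi - m\bar\psi\psi + \tfrac12[(\partial_\mu + gA_\mu\times)\phi^*]\cdot[(\partial_\mu + gA_\mu\times)\phi] - \tfrac12 M^2\phi^*\cdot\phi, \qquad (6.1)$$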


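where we have rescaled $A_\mu$ to $gA_\mu$, and so now

$$E_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + gA_\mu\times A_\nu. \qquad (6.2)$$

We separate the Lagrangian into terms of second order and terms of third and
higher orders in the fields. The second order terms give the propagators and
the higher order terms give the interactions:

$$\begin{aligned}
L = {} & -\tfrac14 F_{\mu\nu}\cdot F_{\mu\nu} + \bar\psi(i\gamma_\mu\partial_\mu - m)\psi + \tfrac12[\partial_\mu\phi^*\cdot\partial_\mu\phi - M^2\phi^*\cdot\phi] && \text{(2nd order)}\\
& + g\bar\psi\gamma_\mu A_\mu\cdot\tau\,\psi - g(A_\mu\times A_\nu)\cdot\partial_\mu A_\nu + \tfrac12 g[\partial_\mu\phi^*\cdot(A_\mu\times\phi) + \partial_\mu\phi\cdot(A_\mu\times\phi^*)] && \text{(3rd order)}\\
& + \tfrac12 g^2 (A_\mu\times\phi^*)\cdot(A_\mu\times\phi) - \tfrac14 g^2 (A_\mu\times A_\nu)\cdot(A_\mu\times A_\nu) && \text{(4th order)}
\end{aligned} \qquad (6.3)$$

where we have retained $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ to pull out second order terms in
$A_\mu$ which will give the vector field propagator.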
As there is some problem in deriving the propagator for the $A_\mu$ field, we will
first derive the diagram rules for the interaction terms. Consider the term

$$g\bar\psi\gamma_\mu A_\mu\cdot\tau\,\psi; \qquad (6.4)$$

this corresponds to the following interaction: a $\psi$ coming in, absorbing or
emitting an $A_\mu$ and going out as a $\psi$, viz.

[Diagram (6.5): a $\psi$ line with an $A_\mu$ line attached at a single vertex.]

In co-ordinate space we take

$$\psi = u_1 e^{-ip_1\cdot x}, \qquad \bar\psi = \bar{u}_2 e^{+ip_2\cdot x}, \qquad A_\mu = a_\mu e^{-iq\cdot x}, \qquad (6.6)$$

i.e. free waves. Going to momentum space the diagram becomes


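[Diagram (6.7): the same vertex in momentum space, with incoming spinor $u_1$ (momentum $p_1$), outgoing spinor $\bar{u}_2$ (momentum $p_2$) and gluon $a_\mu$ (momentum $q$).]

where the momenta carried by the particles are indicated on the diagram. We
can now read off the vertex; it is

$$g\,\bar{u}_2\,\gamma_\mu\,(\tau\cdot a_\mu)\,u_1. \qquad (6.8)$$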


This, except for the presence of the $\tau$ matrix, is identical to QED. In fact, as
in QED, we do not need actually to write the $u_1$, $\bar{u}_2$ spinors for virtual
fermions, but just string the Dirac (and $\tau$) matrices together in a product in order,
as we come to them following a fermion line.
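We now consider, in a similar fashion, the third order (in the fields) term
for the scalar field $\phi$ interacting with the Yang-Mills field $A_\mu$:

$$+\tfrac12 g[\partial_\mu\phi^*\cdot(A_\mu\times\phi) + \partial_\mu\phi\cdot(A_\mu\times\phi^*)]. \qquad (6.9)$$

Putting in, as before, the free fields

$$\phi = S_1 e^{-ip_1\cdot x}, \qquad \phi^* = S_2 e^{+ip_2\cdot x}, \qquad A_\mu = a_\mu e^{-iq\cdot x}, \qquad (6.10)$$

we get

$$+\tfrac12 g\{ip_{2\mu}\,S_2\cdot(a_\mu\times S_1) - ip_{1\mu}\,S_1\cdot(a_\mu\times S_2)\}\exp\{-i(p_1 - p_2 + q)\cdot x\}; \qquad (6.11)$$

then going to momentum space we have

[Diagram (6.12): scalon $S_1$ (momentum $p_1$) going to scalon $S_2$ (momentum $p_2$) with a gluon $a_\mu$ attached.]

$$g(p_1 + p_2)_\mu\,[S_2\cdot(a_\mu\times S_1)]. \qquad (6.12)$$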

sco 
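Similarly, the term $\tfrac12 g^2(A_\mu\times\phi^*)\cdot(A_\mu\times\phi)$ gives

[Diagram (6.13): scalon $S_1$ going to scalon $S_2$ with two gluons $a_\mu$ and $b_\mu$ attached at one point.]

$$g^2[(b_\mu\times S_2)\cdot(a_\mu\times S_1) + (a_\mu\times S_2)\cdot(b_\mu\times S_1)]. \qquad (6.13)$$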


So far, these interactions are similar (apart from the isospin labels) to those in
QED. But in Yang-Mills theory there are two extra vertices which have no
QED analogue (they arise because the $A_\mu$ field itself carries "charge"). The
term $-g(A_\mu\times A_\nu)\cdot\partial_\mu A_\nu$ gives, in momentum space, $-g(a_\mu\times b_\nu)\cdot c_\nu\,q_{c\mu}$
when we substitute $a_\mu e^{iq_a\cdot x}$ for the first, $b_\nu e^{iq_b\cdot x}$ for the second, and
$c_\nu e^{iq_c\cdot x}$ for the third, with momenta $q_a$, $q_b$, $q_c$. Hence altogether it gives,
taking account of permutations and with a bit of re-arranging of the dots and
crosses:

$$-g\{(q_a - q_c)_\mu\, b_\mu\cdot(c_\nu\times a_\nu) + (q_c - q_b)_\mu\, a_\mu\cdot(b_\nu\times c_\nu) + (q_b - q_a)_\mu\, c_\mu\cdot(a_\nu\times b_\nu)\}. \qquad (6.14)$$

There are six terms altogether since each of the three $A_\mu$'s can be either $a_\mu$, $b_\mu$
or $c_\mu$ in every possible way.


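The term $-\tfrac14 g^2(A_\mu\times A_\nu)\cdot(A_\mu\times A_\nu)$ gives

[Diagram (6.15): four gluons $a$, $b$, $c$, $d$ meeting at one point.]

$$-\frac{g^2}{4}\{(a_\mu\times b_\nu)\cdot(c_\mu\times d_\nu) + \text{three other symmetric combinations}\}, \qquad (6.15)$$

which comes to a total of $-g^2(a_\mu\times b_\nu)\cdot(c_\mu\times d_\nu)$ by a bit of rearranging.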
We must finally derive the propagators for the theory from the quadratic
terms in (6.3). The propagators for $\psi$ and $\phi$ are easy to get. The equation of
motion for $\psi$ is (obtained from the Lagrangian by varying with respect to $\bar\psi$)

$$(i\gamma_\mu\partial_\mu - m)\psi = J, \qquad (6.16)$$

where $J$ is some source for $\psi$ (whose exact form is irrelevant, e.g. some
non-linear combination of other fields like $A\cdot\tau\,\psi$, that in the diagrams generates a
source away from which a particle $\psi$ is to propagate).

[Diagram: a $\psi$ line propagating out from a source.]

So symbolically

$$\psi = \frac{1}{i\gamma_\mu\partial_\mu - m}\,J. \qquad (6.17)$$

In momentum space this is

$$\psi = \frac{1}{\gamma_\mu p_\mu - m}\,J. \qquad (6.18)$$

This gives the $\psi$ which enters into a source term for another interaction, and
so each virtual $\psi$ line brings in a factor $(\gamma_\mu p_\mu - m)^{-1}$. This then defines the
propagator

$$\frac{\delta_{ij}}{\gamma_\mu p_\mu - m}. \qquad (6.19)$$


The $i$ and $j$ are isospin indices. The $\delta_{ij}$ is present since $\psi$ must couple to the
same isospin at both ends.

For the $\phi$ field, the equation of motion is

$$(\Box + M^2)\phi = -J'. \qquad (6.20)$$

Therefore

$$\phi = \frac{1}{p^2 - M^2}\,J', \qquad (6.21)$$

where we have again transformed to momentum space. This gives the $\phi$
propagator

$$\frac{\delta_{ab}}{p^2 - M^2}, \qquad (6.22)$$

where the $a$ and $b$ are isospin indices.
Finally consider the $A_\mu$ propagator; the equation of motion is

$$\partial_\mu(\partial_\mu A_\nu - \partial_\nu A_\mu) = -J_\nu, \qquad (6.23)$$

i.e.

$$\Box A_\nu - \partial_\nu(\partial_\mu A_\mu) = -J_\nu, \qquad (6.24)$$

where $J_\nu$ is the total current: matter plus contributions from the $A_\mu$ itself
due to the third and higher order terms in the Lagrangian. We need to solve
this equation to get the propagator. However in general we cannot solve it
because by itself it is meaningless unless the divergence of $J_\nu$ is zero, for the
divergence of the left side is identically zero. We conclude therefore that

$$\partial_\nu J_\nu = 0,$$

i.e.

$$\partial_\nu\{j_\nu + \partial_\mu(A_\mu\times A_\nu) + A_\mu\times E_{\mu\nu}\} = 0. \qquad (6.25)$$

This equation can indeed be verified using the relation (2.42).
When we attempt to obtain the propagator in QED, we do so by choosing

$$\partial_\mu A_\mu = 0, \qquad (6.26)$$

as we may do due to gauge invariance. The equation of motion then becomes
simply

$$\Box A_\mu = -J_\mu, \qquad (6.28)$$

and we can solve this to obtain the propagator, viz.

$$A_\mu = -\frac{1}{\Box}\,J_\mu, \qquad (6.29)$$

so the propagator is

$$\delta_{\mu\nu}/p^2. \qquad (6.30)$$


It turns out that when we use this and calculate diagrams in QED, the current
we get from the diagrams is automatically conserved so that everything is all
right, as it must be since

$$\partial_\mu\Box A_\mu = -\partial_\mu J_\mu, \qquad (6.31)$$

from (6.28), and both sides are separately zero: so the gauge choice is self
consistent. This is equivalent to replacing the action $-\tfrac14\int F_{\mu\nu}F_{\mu\nu}\,d^4x$ by
$-\tfrac12\int(\partial_\mu A_\nu)(\partial_\mu A_\nu)\,d^4x$, or in other words to adding an extra term
$-\tfrac12\int(\partial_\mu A_\mu)^2\,d^4x$ to the action: so the Lagrangian becomes simply
$-\tfrac14 F_{\mu\nu}F_{\mu\nu} - \tfrac12(\partial_\mu A_\mu)(\partial_\nu A_\nu)$.
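
The relation between the two actions can be checked by an integration by parts. The following short derivation is a sketch that assumes the fields fall off fast enough for surface terms to be dropped:

```latex
% Sketch: integrate the cross term by parts twice, dropping surface terms.
-\tfrac14 \int F_{\mu\nu}F_{\mu\nu}\, d^4x
  = -\tfrac12 \int \left[(\partial_\mu A_\nu)(\partial_\mu A_\nu)
      - (\partial_\mu A_\nu)(\partial_\nu A_\mu)\right] d^4x
  = -\tfrac12 \int (\partial_\mu A_\nu)(\partial_\mu A_\nu)\, d^4x
    + \tfrac12 \int (\partial_\mu A_\mu)^2\, d^4x .
```

The two actions therefore differ exactly by a term quadratic in $\partial_\mu A_\mu$, which vanishes in the gauge (6.26).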

Take a flying guess and try to do the same thing for Yang-Mills: an
excellent method of doing physics. The purpose of physics is to find out what's
true, not to find out what you can prove. If you allow yourself the liberty of
knowing things with different degrees of certainty, then you can know a lot
more physics than if you have to prove everything. So in (6.24) we suppose
that we can choose the gauge

$$\partial_\mu A_\mu = 0, \qquad (6.32)$$

and we then get the following for the equation of motion

$$\Box A_\mu = -J_\mu. \qquad (6.33)$$


We can invert this equation to get

$$A_\mu = -\frac{1}{\Box}\,J_\mu, \qquad (6.34)$$

which in momentum space is

$$(A_\mu)_a = \frac{(J_\mu)_a}{p^2}. \qquad (6.35)$$

This defines the propagator

$$\frac{\delta_{\mu\nu}\,\delta_{ab}}{p^2}. \qquad (6.36)$$

Notice that if we take the divergence of both sides of (6.33), both sides are
separately zero, the left side by the gauge condition (6.32) and the right side
by (6.25), so we appear to be self consistent. We now have all the rules and can
calculate processes and see whether we run up against any problems, such as gauge
non-invariance of a physical process. We will, indeed, run into problems when
we consider Yang-Mills loops and shall see that we need to introduce
something extra (the ghost particle) to cure them.


6.3. Explicit calculations of physical processes in Yang—Mills theory 


As an application of the rules that we have derived, we shall now calculate
several physical processes. We begin with Compton scattering for an $I = 1$ scalon.
The process under consideration here is Compton scattering of a gluon $A_\mu$
and an isovector scalar particle $\phi$ (scalon) using the Lagrangian (6.1):

$$\text{gluon} + \text{scalon} \to \text{gluon} + \text{scalon}. \qquad (6.37)$$

We define the momenta, polarisation vectors and isospin labels as follows:

[Diagram (6.38): incoming scalon $S_1$ (momentum $p_1$) and gluon $a_{1\mu}$ (momentum $q_1$); outgoing scalon $S_2$ (momentum $p_2$) and gluon $a_{2\nu}$ (momentum $q_2$).]

The momenta are $p_1$, $p_2$ and $q_1$, $q_2$, the isospin labels of the scalons are $S_1$
and $S_2$, and the polarisation vectors and isospin labels of the gluons are $a_{1\mu}$ and
$a_{2\nu}$. To second order in the coupling constant there are four diagrams which
contribute to this process, and we will calculate them in turn.


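(i)

[Diagram (6.39): the scalon absorbs gluon 1, propagates as an intermediate scalon $\sigma$ with momentum $p_1 + q_1$, then emits gluon 2.]

where we have temporarily put an isospin index $\sigma$ on the intermediate scalon
to make it easier to apply our rules. The amplitude for this process is (we must
sum over all isospin directions of $\sigma$ as all will contribute to the diagram)

$$\sum_{\substack{\text{isospin}\\ \text{directions}}} g^2 (2p_2 + q_2)_\nu\, S_2\cdot(a_{2\nu}\times\sigma)\,\frac{1}{(p_1 + q_1)^2 - m^2}\,\sigma\cdot(a_{1\mu}\times S_1)\,(2p_1 + q_1)_\mu. \qquad (6.40)$$

Re-writing $S_2\cdot(a_{2\nu}\times\sigma)$ as $(S_2\times a_{2\nu})\cdot\sigma$ and using the formula

$$\sum_\sigma (A\cdot\sigma)(\sigma\cdot B) = A\cdot B, \qquad (6.41)$$

we get for the contribution from this diagram

$$g^2\,\frac{(2p_2 + q_2)_\nu\,(S_2\times a_{2\nu})\cdot(a_{1\mu}\times S_1)\,(2p_1 + q_1)_\mu}{(p_1 + q_1)^2 - m^2}. \qquad (6.42)$$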
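The second diagram is the same as (i) with the two gluons interchanged.

(ii)

[Diagram (6.43): the scalon emits gluon 2 first, propagates with momentum $p_1 - q_2$, then absorbs gluon 1.]

The amplitude for this diagram can be obtained immediately from (i) if we
notice that making the substitutions

$$q_1 \leftrightarrow -q_2, \qquad a_{1\mu} \leftrightarrow a_{2\nu}, \qquad (6.44)$$

in (i) gives (ii). Therefore the contribution from (ii) is

$$g^2\,\frac{(2p_2 - q_1)_\mu\,(S_2\times a_{1\mu})\cdot(a_{2\nu}\times S_1)\,(2p_1 - q_2)_\nu}{(p_1 - q_2)^2 - m^2}. \qquad (6.45)$$

The third diagram is

(iii)

[Diagram (6.46): the scalon line emits an intermediate gluon $c_\alpha$ of momentum $p_1 - p_2$, which joins gluons 1 and 2 at a three-gluon vertex.]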
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; (Sy XS1)°¢,(P1 +P2)a 
amplitude = ~ —————_—_,__ {4 — py + P2),4ay° @iy X ey) 
(Pp, — P2) 
+ (—42 - Gye, Gry X 41) + (Py — P2tI2)@1y° Cy, X42,)I » 


where $c_\alpha$ is the polarisation vector of the intermediate gluon, and therefore we
must sum over its directions in space and isospace using the formula

$$\sum_c (A\cdot c_\alpha)(c_\beta\cdot B) = A\cdot B\,\delta_{\alpha\beta}. \qquad (6.47)$$


The amplitude for this diagram is then 
; (S, XS,) 
Gua {(@z,, X 41,) [(P) + P2),(291 - 92), 


(p1 —P2 


+ (py +P2)y(242 - ayy) — @ay X a1, )(P1 + P2da(a + I2)a}- (6-48) 
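Finally we have the diagram

[Diagram (6.49): scalon $S_1$ going to scalon $S_2$ with both gluons attached at a single point.]

The amplitude is

$$g^2\{(S_2\times a_{2\mu})\cdot(S_1\times a_{1\mu}) + (S_2\times a_{1\mu})\cdot(S_1\times a_{2\mu})\}. \qquad (6.50)$$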


Adding the contributions from each of the diagrams, rearranging dot and cross 
vector products, putting the scalons on mass shell and using momentum con- 
servation (but keeping the gluons off mass shell for the moment) we get the 
total amplitude for the process to be 


(2p2 +4), (2P; +41), 
2P 19, +45 


(S> Xx @,) “(S| X op 


mee —4)(P1 + Pody + (Py + P2) {242 - Wy 
(41 - 92)? 
, 1 *P2)°@ +92) +1 - | 
(41-9) fecha 


equation continued on next page. 
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(2p — 4), (2p2 - Wy 


#15, X43,)-(SX0ip| 7) 
—2p297 + 94 


. (241 ~ 92)(Py + Pr)y + (Py + P2242 - aD) 
(4, - 42)" 


eeiten saa | 
bvy- 


(41 - qa) 
So what? What do we do with the answer? There are two possibilities — we 
can use it to calculate the Compton scattering — maybe we are interested in 
this; or we can use it to calculate the annihilation of a pair of scalons into a 
pair of gluons (see below). 

In order to get the Compton scattering of real gluons we must put them on
their mass shell, i.e. $q_1^2 = q_2^2 = 0$, but this alone is not sufficient, as we can see as
follows. Consider the equation of motion for the gluons in the absence of any
sources (free gluons):


(6.51, cont’d) 


$$\partial_\mu(\partial_\mu A_\nu - \partial_\nu A_\mu) = 0. \qquad (6.52)$$

In momentum space, this becomes

$$q_\mu(q_\mu a_\nu - q_\nu a_\mu) = 0, \qquad (6.53)$$

i.e.

$$q^2 a_\nu - (q_\mu a_\mu)\,q_\nu = 0. \qquad (6.54)$$

There are two solutions to this equation: either (a) $q^2 = 0$ and we must have
$q_\mu a_\mu = 0$, i.e. for a Yang-Mills particle on its mass shell, the polarisation
vector is perpendicular to the momentum; or (b) $q^2 \neq 0$; then $a_\nu = (q_\mu a_\mu/q^2)\,q_\nu$,
i.e.

$$a_\nu = \alpha q_\nu. \qquad (6.55)$$

This corresponds in co-ordinate space to $A_\mu = \partial_\mu\chi$ which is a pure gradient. So
we would perhaps expect that this field could be removed by an infinitesimal
gauge transformation. Let us try this; to gauge it away we would have to have

$$0 = \partial_\mu\chi + \partial_\mu\alpha + (\partial_\mu\chi\times\alpha).$$

Since $\partial_\mu\chi\times\alpha$ is higher order in the field, it can be neglected since we are
working with a free wave and hence $a_\mu$ must be small (otherwise there would
be self coupling terms on the right hand side of (6.52) which we neglected in
order to get the free wave solution, since they are of second and higher order
in the fields). We can therefore take $\alpha$ to be $-\chi$ and gauge the field away.

Case (b) cannot produce any physics. If we calculate an amplitude for a
physical process first with $a_\mu$, and then with $a'_\mu$, where $a'_\mu$ is related to $a_\mu$ by
an infinitesimal gauge transformation $a'_\mu = a_\mu + \partial_\mu\alpha$, there should be no
difference in the answers since the theory is gauge invariant. The effect of $a_\mu$ and
$a'_\mu$ is linear and this means that if we were to calculate with $\partial_\mu\alpha$ the amplitude
must be zero.

So we can test for the correctness of the solution to any physical process
by putting the polarisation vector $a_\mu$ of an external gluon equal to $\alpha q_\mu$ where
$q_\mu$ is its momentum, and checking that this gives zero. Note that we must carry
out this test for a whole physical process; we should not expect that a single
diagram by itself is gauge invariant as it is only part of an answer and by itself
represents no physics.
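
The test can be tried numerically on a simpler case. The sketch below is not the Yang-Mills amplitude of this section: it swaps in the abelian analogue, tree-level Compton scattering of a charged scalar in scalar QED, with the overall coupling dropped and the metric signature $(+,-,-,-)$ assumed. The function names are hypothetical. It replaces the polarisation of photon 1 by its momentum and shows that each diagram is non-zero while the sum vanishes.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])  # signature (+, -, -, -)


def dot(a, b):
    """Minkowski product a_mu b_mu."""
    return a @ METRIC @ b


def on_shell(three_momentum, mass):
    """Four-momentum with p^2 = mass^2 built from a three-momentum."""
    p = np.asarray(three_momentum, dtype=float)
    return np.concatenate(([np.sqrt(mass**2 + p @ p)], p))


def compton_terms(p1, q1, p2, q2, e1, e2, mass):
    """Direct, crossed and contact pieces of the scalar-QED tree amplitude
    (overall coupling factor dropped)."""
    direct = (dot(e1, 2 * p1 + q1) * dot(e2, 2 * p2 + q2)
              / (dot(p1 + q1, p1 + q1) - mass**2))
    crossed = (dot(e1, 2 * p2 - q1) * dot(e2, 2 * p1 - q2)
               / (dot(p1 - q2, p1 - q2) - mass**2))
    contact = -2.0 * dot(e1, e2)
    return np.array([direct, crossed, contact])


rng = np.random.default_rng(0)
mass = 1.0
p1 = on_shell(rng.normal(size=3), mass)   # incoming scalar, on shell
p2 = on_shell(rng.normal(size=3), mass)   # outgoing scalar, on shell
q1 = rng.normal(size=4)                   # incoming photon, left off shell
q2 = p1 + q1 - p2                         # fixed by momentum conservation
e2 = rng.normal(size=4)                   # arbitrary polarisation of photon 2

# Gauge test: replace the polarisation of photon 1 by its own momentum.
terms = compton_terms(p1, q1, p2, q2, e1=q1, e2=e2, mass=mass)
print("separate diagrams:", terms)
print("sum:", terms.sum())
```

In this abelian case the cancellation is exact even with photon 2 off shell; the text finds that in the Yang-Mills case a remainder proportional to the current of the second gluon is left over until the complete physical process is included.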

In the Compton effect above, we gauge the incoming gluon, putting

$$a_{1\mu} = \alpha q_{1\mu}. \qquad (6.56)$$

Do we get zero? We make the substitution, and of course we don't get zero,
because we made a couple of mistakes; however fiddling around a bit more and
correcting the mistakes, the Compton amplitude (6.51) becomes


; (2p, +41)y 
—8%(S, Xa) (Sp Xin, (a (ree 
ee EIEN G = 


; (—2p) +41) 
—g?(S, Xa) -(S; Kase) a 
41 - 92) 
where

$$j_{2\nu}(a_2) = q_2^2\,a_{2\nu} - q_{2\nu}(q_{2\mu}a_{2\mu}). \qquad (6.57)$$

Is this zero? Yes, because $a_{2\nu}$ is supposed to represent a free sourceless gluon,
so the term $j_{2\nu}(a_2)$ must vanish by (6.54). If $a_{2\nu}$ is not sourceless,

[Diagram (6.58): the Compton scattering diagrams with the gluon $a_{2\nu}$ attached to a source.]

$j_{2\nu}(a_2)$ is not zero, so the process we have calculated does not seem to be
gauge invariant. Is there a fault in the theory? Need the theory be gauge
invariant if the $a_{2\nu}$ has come from a source? Yes, the theory must be; the fault
lies in the fact that we are not now calculating a complete physical process.
Consider as an example that the source of $a_{2\nu}$ is another scalon, i.e.


[Diagram (6.59): the Compton scattering process with the gluon $a_{2\nu}$ emitted by a second scalon line.]


Then in addition to the diagrams

[Diagrams (6.60).]

which contribute to (6.58), there are two additional diagrams which we must
consider for the whole physical process, viz.

[Diagrams (6.61): the two diagrams in which the gauged gluon $a_{1\mu}$ acts on the source of $a_{2\nu}$.]


When the sum of all diagrams in (6.60) and (6.61) is considered and we put
$a_{1\mu} = \alpha q_{1\mu}$, we will get zero. It is not surprising that we do not get zero for
the diagrams (6.60) as they only constitute part of an answer to a physical
process, and there is no reason why this partial answer should be gauge
invariant. The diagrams (6.61) correspond to a modification of the source in
(6.58). Therefore, if we always only apply a gauge transformation to a
physical process we will find that the amplitude is invariant (provided we stick with
tree diagrams). Hence the diagram rules which we have constructed will work
as long as we do not calculate any diagrams with closed loops. Before
considering the difficulties with closed loops, we will consider the Compton scattering
process further.

The formula for the real Compton effect is obtained from (6.51) by putting
both gluons on their mass shells, i.e.

$$q_1^2 = q_2^2 = 0; \qquad q_{1\mu}a_{1\mu} = q_{2\nu}a_{2\nu} = 0, \qquad (6.62)$$

in accordance with the situation (a) earlier. We get
2g2(S> Xay,)° (Sy su (PipPov) (P1920) 


3 ey) (Py, *4) 
y yf af O72 6.63 
+ (P2,Py,) — uv(P1adiad + P| > —P> - ( ‘ ) 


By taking the crossed diagrams of the Compton scattering process we obtain 
the process for scalon annihilation into gluons 


y 
Lx a 


\ 8 


po <zo7 in the center of mass system (6.64) 


The amplitude for this process can be deduced immediately from (6.63); it is 


2 lpg * poet 
Ja" WLP1°da =P "AW 


| c “ap Aye 1 40)» , (6.66)

where

$$A_{\mu\nu} = -(S_1\times a_\mu)\cdot(S_2\times b_\nu), \qquad B_{\mu\nu} = -(S_1\times b_\nu)\cdot(S_2\times a_\mu). \qquad (6.67)$$

In the centre of mass system, the incident scalons have three-momenta $p_i$ and
$-p_i$ respectively and velocities $v_i$ and $-v_i$ where

$$v_i = p_i/\omega = p_i/\sqrt{p^2 + M^2}. \qquad (6.68)$$


The amplitude for this process is then 


1 — v2 cos26 


We consider the possible polarisation states for the gluon: state (a) in the plane 
of the reaction viz., 


2u;v; 
rg oe Oe HEL PCOS 0)A;;+ (1 — vcos O)Bi, (6.69) 


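$$a_\mu = (0,\, -\sin\theta,\, \cos\theta,\, 0)\,a \quad \text{in components } (t, x, y, z), \qquad (6.70)$$

state (b) perpendicular to the plane of the reaction, viz.,

$$a_\mu = (0,\, 0,\, 0,\, 1)\,a \quad \text{in components } (t, x, y, z). \qquad (6.71)$$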


Apply (6.69) to the following polarisation combinations: 
(i) both gluons in state (b) 


Amplitude = [(1 +u cos 0)A;, + (1 —u cos @)B;;] =8, (6.72) 
(ii) both gluons in state (a) 

Amplitude = 6r(cos @) , (6.73) 

where 
2sin2 6 v2 

t(cos 0)=1 — eae ey ; (6.74) 
(iii) one gluon in state (a) and one in state (b) 

Amplitude =0. (6.75) 


We can calculate $\beta$ given the isospin states of the scalons. The results are
tabulated below, where $I$ is the total isospin of the scalons in the initial state:

    I    |  0  |  1           |  2
    β    |  4  |  2v cos θ    |  2          (6.76)

Finally we outline the calculation of the Compton scattering of gluons
from isospin ½ spinors using the Lagrangian (6.1) and the rules which we
derived from it. The diagrams and the amplitudes are


(i) 


O2.P2 Way Gay 


] 
ory - rs aees . 
& ux(4,, We +g,—m (141, T)Uy , (6.77) 
47P, Soy 
(ii) 
O.\.P2 Q2 Cr 
_ ] 
2 
us W9(41,°tYy) F— "t)uy (6.78) 
uy AP, qd, Pim 
(iii) 
Br uyy, te Uy 
_ — 5 Ia, X 1.) (291 ~ 92), 
Yap, q 3% = (Gy — q2) 
We HG + Cy X04,)242- 41), (6.79) 


— (49, X 41 ,)(9, + 92)yl - 


We do not have the fourth diagram in this case. As in the scalar case, we can
gauge one vector boson, $a_{1\mu} = \alpha q_{1\mu}$, whereupon we get, adding the results
(6.77)-(6.79),


Sly t yy 


5 Voyulan) Xe] (6.80) 
(4; — 92) 
As before (6.57), $j_{2\nu}(a_2) = q_2^2\,a_{2\nu} - q_{2\nu}(q_{2\mu}a_{2\mu})$ and again this is zero if $a_{2\nu}$
is a free wave, or if it is a pure gradient (cf. (6.54)). However $j_{2\nu}(a_2)$ is not zero
if $a_{2\nu}$ has a source ($j_{2\nu}(a_2) = \text{source}_\nu$), but in this case, viz.,

[Diagram (6.81): the Compton scattering process with the gluon $a_{2\nu}$ attached to a source.]

we have again another type of diagram where the $a_{1\mu}$ acts on the source:

[Diagram (6.82).]

When all the diagrams are taken into account, we again find that the physical
process is gauge invariant. Hence there is no difficulty here, as we expect, since
these are tree diagrams.




7. Quantisation continued — Loops 


So far we have been successful in our attempt to quantise Yang—Mills theo- 
ry. Now we want to try the procedure that we have adopted for higher order 
diagrams; this will entail diagrams with closed loops (the higher order tree dia- 
grams do not present any difficulties). Why not therefore try to calculate the 
polarisation of the vacuum? But wait, if you want to discover a difficulty with 
a theory, you’ve got to look at a physical problem because some of your diffi- 
culties might come from not asking for a complete physical process (see the 
previous section). So the only way to discover whether something is right or 
wrong is not to pick up some arbitrary thing like the vacuum polarisation or 
the vacuum expectation value of a pair of operators, but to ask a physical 
question. But what should we ask? Correction to the Compton effect. This 
would be a fine problem, as good as any, but I decided to look at the follow- 


ing example — the scattering of two scalons by gluon exchange. In lowest order 
this is 


[Diagram (7.1): the lowest-order scattering of two scalons by the exchange of a single gluon.]

[Diagrams (7.2): higher-order diagrams for the same process, containing closed loops.]


We are not actually going to evaluate these diagrams, but only indicate what 
happens. When we calculate the diagrams (7.2) we, of course, have problems 
with renormalisation; but when we’ve straightened these out we would still 
like to check the result in some way. We can do this by checking whether or 
not we satisfy unitarity. 

Consider the following diagram, where a pair of scalons annihilate into two
gluons

[Diagram (7.3): two scalons annihilating into two gluons $a_{1\mu}$, $a_{2\nu}$.]

The amplitude for this process is

$$M_{\mu\nu}\,a_{1\mu}a_{2\nu}, \qquad (7.4)$$

where $M_{\mu\nu}$ is some tensor function of the momenta and isospin labels of the
scalons. (We calculated this process to first order in the previous chapter.)

If we assume that the gluons are physical particles, there are constraints
which their momenta and polarisation vectors must satisfy:

$$q_1^2 = q_2^2 = 0, \qquad q_1\cdot a_1 = q_2\cdot a_2 = 0, \qquad (7.5)$$

i.e. there are only two polarisation states for each physical gluon. The
probability for process (7.3) to take place is $\propto |\text{Amplitude}|^2$, i.e.

$$\text{Probability} \propto (M_{\mu\nu}\,a_{1\mu}a_{2\nu})(M^*_{\mu'\nu'}\,a_{1\mu'}a_{2\nu'}), \qquad (7.6)$$

where we have taken the a's to be real. In calculating this we see that it is
closely related to the amplitude for the following process:

[Diagram: two scalons annihilating into gluons A and B, which then recombine into two scalons.] (7.7)

The amplitude for this process is

$\sum_{\text{all } \mu,\nu} M_{\mu\nu} M^{*}_{\mu\nu} \times$ (propagator terms for gluons A and B). (7.8)

If we take the imaginary part of (7.8) we get

$\sum_{\text{all } \mu,\nu} M_{\mu\nu} M^{*}_{\mu\nu}$, (7.9)

where the propagators have been removed by this process and the corresponding
gluons are now on mass shell, and irrelevant constants have been ignored.
Unitarity tells us that this must be the same as (7.6) summed over physical
polarisation states:

$\sum_{\text{polarisation states}} (M_{\mu\nu} a_{1\mu} a_{2\nu})(M_{\mu'\nu'} a_{1\mu'} a_{2\nu'})^{*}$. (7.10)


However note that in evaluating (7.9) we have summed over more polarisation 
states than we would have done had the intermediate gluons been physically 
real particles, so it is not obvious that (7.9) and (7.10) are equal. 

In making the comparison we will only consider the $\mu$ index. We lose no
generality by doing this since if we can sort this out then we can also sort out
the $\nu$ index. In making this assumption, we really compare

$\sum M_{\mu\nu} a_{1\mu}\, M^{*}_{\mu'\nu} a_{1\mu'}$ (summed over the 2 polarisation states of $a_1$ and all $\nu$ states) (7.11)

with (7.9). We can think of (7.11) as follows:

[Diagram: the loop of (7.7) with gluon $a_1$ restricted to physical polarisations.] (7.12)

The gluons $a_1$ and $a_2$ are on mass shell since they are physical gluons. This
diagram, when the gluon line is not connected, is similar to, but more
complicated than, the Compton scattering process; it is

[Diagram: the corresponding process with the gluon line cut open.] (7.13)


When we calculated the Compton effect we noticed that when $a_{1\mu} = q_{1\mu}$,
then the amplitude was of the form

$g\,(\mathbf{a}_1 \times \mathbf{c}_\mu)\cdot \mathbf{J}_\mu(a_2)$, (7.14)

where

$\mathbf{J}_\mu(a_2) = q_2^2\,\mathbf{a}_{2\mu} - (q_2\cdot \mathbf{a}_2)\, q_{2\mu}$. (7.15)

In the case of the Compton scattering proper

$\mathbf{c}_\mu = \dfrac{(\mathbf{S}_1 \times \mathbf{S}_2)\,(p_1 + p_2)_\mu}{(p_1 - p_2)^2}$, (7.16)

but in general we may think of $\mathbf{c}_\mu$ as being the gluon field produced by some
source:

[Diagram: gluon field $\mathbf{c}_\mu$ produced by a general source.] (7.17)


Since we are really only interested in the indices, define

$\mathcal{M}_{\mu\mu'} = \sum_{\nu} M_{\mu\nu} M^{*}_{\mu'\nu}$, (7.18)

so that (7.11) becomes

$\sum_{\text{2 poln. of } a_1} \mathcal{M}_{\mu\mu'}\, a_{1\mu} a_{1\mu'}$, (7.19)

and (7.9) becomes

$\sum_{\text{all } \mu} \mathcal{M}_{\mu\mu}$. (7.20)

Since $q_1^2 = 0$ we can choose the axes so that, with components ordered $(t, z, x, y)$,

$q_1 = \omega(1, 1, 0, 0)$. (7.21)

$a_1$ then has components only in the x and y directions and (7.19) becomes

$-\mathcal{M}_{xx} - \mathcal{M}_{yy}$. (7.22)

Similarly (7.20) becomes

$\mathcal{M}_{tt} - \mathcal{M}_{xx} - \mathcal{M}_{yy} - \mathcal{M}_{zz}$. (7.23)

Eqs. (7.22) and (7.23) differ by

$\mathcal{M}_{zz} - \mathcal{M}_{tt}$. (7.24)

So when we took the imaginary part of (7.8) we obtained the extra piece
(7.24), which we do not want; the theory will be unitary if we can show that
this extra piece is zero. Rotating the axes we can re-write (7.24) as

$\tfrac{1}{2}\{\mathcal{M}_{(z-t),(z+t)} + \mathcal{M}_{(z+t),(z-t)}\}$. (7.25)
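As a check on the rotation of axes, the expansion can be written out explicitly. This is only a sketch, under two assumptions that the text does not spell out: that $\mathcal{M}$ is linear in each of its two indices, and that an index such as $(z-t)$ means contraction with the unnormalised direction $\hat z - \hat t$.

```latex
\begin{align*}
\mathcal{M}_{(z-t),(z+t)} &= \mathcal{M}_{zz} + \mathcal{M}_{zt} - \mathcal{M}_{tz} - \mathcal{M}_{tt},\\
\mathcal{M}_{(z+t),(z-t)} &= \mathcal{M}_{zz} - \mathcal{M}_{zt} + \mathcal{M}_{tz} - \mathcal{M}_{tt},\\
\tfrac12\bigl\{\mathcal{M}_{(z-t),(z+t)} + \mathcal{M}_{(z+t),(z-t)}\bigr\}
  &= \mathcal{M}_{zz} - \mathcal{M}_{tt}.
\end{align*}
```

The mixed terms cancel, so the combination involves only the longitudinal and time-like components, which is exactly the unwanted extra piece.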


Consider the first term in (7.25): the second term is similar. We would get
such a term if we considered a process with two gluons which have
polarisations in the directions
] f' zx yp 
Vz: 1,0, 0) Viz- yp =Qip > (7.26) 
and 
Le GO) dee aye (7.27) 
“fo »71,Y,; : Qu ~ Ga*N > : 


We get zero for the first term of (7.25) only if $\mathbf{J}_\mu(a_2)$ is zero (from (7.14)).
However it is not zero although $q_2^2$ is, since $q_2\cdot a_2$ is not zero (the
particle is not free).

In QED, this problem does not occur, because when we have a process with
two external photons, if we gauge one photon (i.e. put the polarisation vector
parallel to the momentum) the amplitude vanishes irrespective of whether or
not the other photon is physical. That is, the quantity analogous to $\mathbf{J}_\mu(a_2)$ is
identically zero independently of the state of the photon.

There is nothing we can do about this problem in Yang-Mills theory. If
we wish to make the theory unitary we must subtract something to get rid of
these extra pieces. By taking the simple Compton case, we can get some idea
of the form of the thing we have to subtract. The Compton amplitude (7.14)
becomes

$g\,(\mathbf{a}_1 \times \mathbf{c}_\mu)\cdot \mathbf{a}_2\, q_\mu$, (7.28)

when we use (7.27) and the fact that $q^2 = 0$. The polarisation a is summed
over when we make a closed loop. This is the extra piece that we have to get
rid of. We can see that the most direct way to do this is to add an isovector
scalar particle which is self coupled and also coupled to the vector according
to (7.28). (This is the ghost particle.) In co-ordinate space this coupling
becomes

$g\,(\mathbf{P} \times \mathbf{A}_\mu)\cdot \partial_\mu \mathbf{P}$. (7.29)

Clearly since the particle has to cancel a piece coming from the $\mathbf{A}_\mu$, its
propagator must have the same form as that of the $\mathbf{A}_\mu$, viz. $1/k^2$. We can see from
the form of its coupling that this ghost appears in closed loops in diagrams in
the same way as A's. It is not allowed to appear in the initial or final states
since it is not a physical particle. It was not obvious to me, from the type of
analysis presented above, what to do in a diagram with more than one closed
loop. (Unitarity is not a sufficient constraint in this case.) In fact the solution
to the problem for larger numbers of loops is to add the ghost as if it were a
Fermi particle. (At the one loop level, adding a Fermi particle is equivalent to
subtracting the contribution from a particle coupled in the same way but
having Bose statistics, which is what we did above.) The contribution from
this ghost could come from a Lagrangian of the form

$\mathcal{L} = \tfrac12\,\partial_\mu \mathbf{P}^{*}\cdot\partial_\mu \mathbf{P} + \tfrac12\, g\,\partial_\mu \mathbf{P}^{*}\cdot(\mathbf{A}_\mu \times \mathbf{P})$. (7.30)
This Lagrangian leads to the following diagram rules. 

ghost propagator:

[dashed line from a to b] $\delta_{ab}/k^2$ (7.31)


ghost vertex: 


k igq7,07°(a, Xo), (7.32) 


a 
where the ghost only enters in closed loops. Topologically, for every diagram
with an $\mathbf{A}_\mu$ closed loop there is one with a ghost loop in the same place. Note
that (7.32) is not symmetric looking and this asymmetry is fundamental, so
we add a check mark (√) to one of the ghost legs. We must keep track of
these √'s as we go round a closed loop so as to ensure that they are always on
the same side of a vertex. (It does not matter on which leg we put the √
provided that we always put it on the same one, since it can be shown that both
choices lead to the same results.) In order to get the right factors it is
necessary to assume that the ghost is a complex field.

All these details were proved by Fadde'ev and Popov [26]. We haven't proved
them; we have merely indicated how, if we proceed in a straightforward and
naive way, we come into difficulties; and we got a pretty good smell of what
to do. But for the full glory of the theory we must follow the approach of
Fadde'ev. It requires more machinery and we will discuss it later.

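Finally we write the total Lagrangian as

$\mathcal{L} = -\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} - \tfrac12(\partial_\mu \mathbf{A}_\mu)^2 + \tfrac12\,\partial_\mu \mathbf{P}^{*}\cdot\partial_\mu \mathbf{P} + \tfrac12\, g\,\partial_\mu \mathbf{P}^{*}\cdot(\mathbf{A}_\mu \times \mathbf{P}) + \text{Matter}(\bar\psi, \psi, \mathbf{A})$. (7.33)

Why is the second term in (7.33) present? If we expand the quadratic parts of
$-\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu}$, which will give us the propagator, we get:

$-\tfrac14(\partial_\mu \mathbf{A}_\nu - \partial_\nu \mathbf{A}_\mu)\cdot(\partial_\mu \mathbf{A}_\nu - \partial_\nu \mathbf{A}_\mu) = -\tfrac12(\partial_\mu \mathbf{A}_\nu)^2 + \tfrac12(\partial_\mu \mathbf{A}_\mu)^2$. (7.34)

We have changed the form of the Lagrangian in this by integrating the action
by parts, throwing away (as usual) the surface terms at infinity. These changes
can produce no physics as they do not change the action. We now see that the
$-\tfrac12(\partial_\mu \mathbf{A}_\mu)^2$ term in (7.33) cancels the corresponding term in (7.34) to leave the
quadratic terms in the Lagrangian as

$-\tfrac12(\partial_\mu \mathbf{A}_\nu)^2$. (7.35)

This is exactly what we want to give the propagator $\delta_{ab}\delta_{\mu\nu}/k^2$ which we have
been using. (This is equivalent to choosing the gauge $\partial_\mu \mathbf{A}_\mu = 0$.) That the
Lagrangian (7.33) is equivalent to the Lagrangian

$\mathcal{L} = -\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + \text{Matter}(\bar\psi, \psi, \mathbf{A})$ (7.36)

(i.e. that the ghost Lagrangian (7.30) compensates for the extra term) is a
more difficult problem which was solved by Fadde'ev and Popov [26], who
also showed what form the extra terms in the Lagrangian would take in
different gauges.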
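The integration by parts behind (7.34) and (7.35) can be spelled out as follows. This is a sketch for the quadratic (free) part only, with the isospin dot products suppressed and surface terms dropped as in the text; the symbol $\simeq$ (an assumption of this note) means "equal up to surface terms".

```latex
\begin{align*}
-\tfrac14(\partial_\mu A_\nu - \partial_\nu A_\mu)^2
  &= -\tfrac12(\partial_\mu A_\nu)(\partial_\mu A_\nu)
     + \tfrac12(\partial_\mu A_\nu)(\partial_\nu A_\mu),\\
\int(\partial_\mu A_\nu)(\partial_\nu A_\mu)\,d^4x
  &= -\int A_\nu\,\partial_\mu\partial_\nu A_\mu\,d^4x
   = \int(\partial_\nu A_\nu)(\partial_\mu A_\mu)\,d^4x,\\
-\tfrac14(\partial_\mu A_\nu - \partial_\nu A_\mu)^2 - \tfrac12(\partial_\mu A_\mu)^2
  &\simeq -\tfrac12(\partial_\mu A_\nu)^2 .
\end{align*}
```

The second line is where the two derivatives are interchanged, which is what turns the cross term into the square of the divergence.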


8. Quantisation via the Hamiltonian formalism. A ghostless gauge 


We will now quantise Yang-Mills theory by going via the Hamiltonian
formalism, rather than by path integrals; a similar procedure has been carried
out in QED and one disadvantage is that we lose manifest Lorentz invariance,
although we know that the final results must be relativistically invariant. We
must choose a gauge before constructing the Hamiltonian since, as we know
from our experience with QED, otherwise there are too many variables. The
Hamiltonian is a machine for telling us how operators change with time; thus
the important operator is

$\partial/\partial t$. (8.1)

But we have seen that in Yang-Mills theory the physically interesting operator
is $D_\mu$, which has a time component

$(\partial/\partial t + \mathbf{A}_0 \times)$. (8.2)

Hence to recover the operator (8.1), which is appropriate for the Hamiltonian
formalism, we want to put

$\mathbf{A}_0 = 0$. (8.3)

Since we have the freedom of gauge transformations we can make this choice.
If we do not do this the Hamiltonian formalism becomes extremely cumbersome
and very difficult. By the way, why don't we make this choice in QED?
If we do, then a static charge has a vector potential associated with it which
rises linearly with time. This is an inconvenience. However, in QCD, perhaps
there are no static charges, i.e. we imagine that hadrons cannot separate into
quarks or, when doing a problem, we start initially with no matter present and
then create the matter and anti-matter (so that by doing this we ensure that
the total colour charge of the world stays zero) which exists over a finite time.
There seems to be a danger in this non-linear theory, since if the vector
potential rises linearly with time, certain cross terms may produce physical effects
(e.g. by quantum fluctuations); of course that's a dream, since if it were true,
we could say that to avoid these difficulties, the vector potential is not
allowed to rise linearly with time for ever, and hence we have proved colour
confinement.

We shall write the classical equations for the Yang—Mills field, construct 
the Hamiltonian, having defined the conjugate momenta, and then quantise 
the theory by using the canonical commutation relations. The classical equa- 
tion of motion is (2.28) 


z matter _ 
Oy op r An hE ao 
where 
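$\mathbf{E}_{\mu\nu} = \partial_\mu \mathbf{A}_\nu - \partial_\nu \mathbf{A}_\mu + \mathbf{A}_\mu \times \mathbf{A}_\nu$. (8.4)

The $\mathbf{E}_{\mu\nu}$ tensor has six (Lorentz) components. Since we have singled out the
time axis by our gauge choice, consider the $ti$ component of $\mathbf{E}_{\mu\nu}$ ($i = x, y, z$).
From (8.4)

$\mathbf{E}_{ti} = \partial \mathbf{A}_i/\partial t = \boldsymbol{\mathcal{E}}_i$, (8.5)

where by definition $\boldsymbol{\mathcal{E}}_i$ is the colour electric field (named using the analogy
with QED). Similarly the $ij$ components are (from (8.4))


where by definition $\mathbf{B}$ is the colour magnetic field. Using this definition, (8.6)
gives

$\mathbf{B} = \operatorname{curl}\mathbf{A} + \tfrac12\,\mathbf{A}\overset{\times}{\times}\mathbf{A}$. (8.7)

Look at the $i$ component of (8.4). This now reads

$\dfrac{\partial^2 \mathbf{A}_i}{\partial t^2} + (\operatorname{curl}\mathbf{B})_i = -\mathbf{J}_i - (\mathbf{A}\overset{\times}{\times}\mathbf{B})_i$, (8.8)

where we have introduced the notation that the upper vector symbols refer to
isospace and the lower symbols to real space.
These three equations can be written generally as

$\dfrac{\partial^2 \mathbf{A}}{\partial t^2} + \operatorname{curl}\mathbf{B} = -\mathbf{J} - \mathbf{A}\overset{\times}{\times}\mathbf{B}$. (8.9)

The fourth equation comes from taking the time component of (8.4); this is

$\dfrac{\partial}{\partial t}(\nabla\cdot\mathbf{A}) = -\rho - \mathbf{A}\overset{\times}{\cdot}\boldsymbol{\mathcal{E}}$, (8.10)

where we have put

$\rho = J_0$. (8.11)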


If we solve (8.9) we do not have to solve (8.10), because (8.10) is practically 
a consequence of (8.9); we can see this by taking the divergence of (8.9) and 
re-arranging the terms: 


Cine) te) : xX, 

aa 4 |" a4 ‘e) -Bu-4%s. oe) 
This is the same as the time derivative of 8.10 provided 

Sr tA cd. (8.13) 


The matter will be such that we have current conservation (in the covariant 
sense) and hence this equation will be satisfied. In principle, integrating (8.12) 
with respect to time to get (8.10) (using (8.13)), we could generate a function 
of x,y, and z. We assume that the initial conditions are such that this func- 
tion is zero; it will then stay zero for all time. We can therefore forget about 
(8.10) in solving the theory. 

We can solve (8.9) for $\mathbf{A}(x, t)$: (8.9) is just a complicated non-linear
differential equation for $\mathbf{A}_i$. The dynamical features of the theory are simple
since the only derivative of $\mathbf{A}_i$ which appears is the second time one, and this
appears linearly. Can we find a Lagrangian from which (8.9) comes? It's easy,
because we know all about Newton's laws and to get a term like $\partial^2 \mathbf{A}_i/\partial t^2$ in
the equation of motion, we know that we must have a term like $(\partial \mathbf{A}_i/\partial t)^2$ in
the Lagrangian. This is just like having the square of the velocity in the
Lagrangian; the equation of motion then has a term linear in the acceleration. In
order to generate the rest of the terms in the equation of motion we proceed
as follows: the term $\mathbf{J}_i$ is generated by a Lagrangian term of the form $\mathbf{J}_i\cdot\mathbf{A}_i$,
and it turns out that the remaining field pieces, i.e. $\operatorname{curl}\mathbf{B} + \mathbf{A}\overset{\times}{\times}\mathbf{B}$, can be
derived from a term of the form $\mathbf{B}\cdot\mathbf{B}$. So the Lagrangian has the form

$\mathcal{L} = \tfrac12\,\dfrac{\partial \mathbf{A}_i}{\partial t}\cdot\dfrac{\partial \mathbf{A}_i}{\partial t} - \tfrac12\,\mathbf{B}_i\cdot\mathbf{B}_i - \mathbf{J}_i\cdot\mathbf{A}_i$. (8.14)

This is clearly the same as the Lagrangian

$-\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + \text{Matter}$, (8.15)

when we put $\mathbf{A}_0 = 0$ and use the definition of $\mathbf{B}$ given earlier. What about the
Hamiltonian? Nothing could be simpler; define the momentum conjugate to
$\mathbf{A}_i$ by

$\boldsymbol{\Pi}_i = \partial \mathbf{A}_i/\partial t = \boldsymbol{\mathcal{E}}_i$. (8.16)

To get the Hamiltonian take the square of the momentum and add the
potential energy, viz.

$H = \tfrac12\,\boldsymbol{\Pi}_i\cdot\boldsymbol{\Pi}_i + \tfrac12\,\mathbf{B}_i\cdot\mathbf{B}_i + \mathbf{J}_i\cdot\mathbf{A}_i$. (8.17)

We now quantise the theory by imposing the canonical quantisation
conditions, and treating $\boldsymbol{\Pi}$ and $\mathbf{A}$ as operators:

$[\Pi_i^a(x, t), A_j^b(y, t)] = i\,\delta_{ab}\,\delta_{ij}\,\delta^3(x - y)$. (8.18)


What would the diagrams look like for such a theory? We can get the
propagator from (8.9) by looking at the linear part. This is, substituting for $\mathbf{B}$ using
(8.7),

$\dfrac{\partial^2 \mathbf{A}}{\partial t^2} + \operatorname{curl}(\operatorname{curl}\mathbf{A}) = -\mathbf{J} - \mathbf{A}\overset{\times}{\times}\mathbf{B} - \tfrac12\operatorname{curl}(\mathbf{A}\overset{\times}{\times}\mathbf{A})$. (8.19)

All the terms on the left side are linear in $\mathbf{A}$ and all the terms on the right are
higher order. We can solve this to obtain the propagator, which is

$\dfrac{\delta_{ij} - k_i k_j/\omega^2}{\omega^2 - k_l k_l}$. (8.20)

This proof is left as an exercise; it is simple to complete using the method we
described in sect. 6.
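The exercise can at least be checked numerically. The sketch below assumes plane-wave fields proportional to exp(i k·x − i ωt), so that the left side of (8.19) becomes the 3×3 matrix (k² − ω²)δ_ij − k_i k_j acting on A, and it assumes the propagator has the form (δ_ij − k_i k_j/ω²)/(ω² − k²), ignoring overall factors of i and isospin labels. The function names are illustrative only.

```python
def propagator(omega, k):
    """Candidate A_0 = 0 propagator: (delta_ij - k_i k_j / omega^2) / (omega^2 - k.k)."""
    k2 = sum(c * c for c in k)
    return [[((1.0 if i == j else 0.0) - k[i] * k[j] / omega**2) / (omega**2 - k2)
             for j in range(3)] for i in range(3)]


def wave_operator(omega, k):
    """Fourier form of -(d^2/dt^2 + curl curl): (omega^2 - k.k) delta_ij + k_i k_j."""
    k2 = sum(c * c for c in k)
    return [[(omega**2 - k2) * (1.0 if i == j else 0.0) + k[i] * k[j]
             for j in range(3)] for i in range(3)]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][l] * b[l][j] for l in range(3)) for j in range(3)]
            for i in range(3)]


if __name__ == "__main__":
    omega, k = 2.0, (0.3, -0.5, 0.7)
    # If the propagator inverts the linear operator, this prints the unit matrix.
    for row in matmul(wave_operator(omega, k), propagator(omega, k)):
        print(["%.12f" % x for x in row])
```

The product being the unit matrix for arbitrary ω and k (away from ω² = k² and ω = 0) is the statement that the candidate propagator solves the linear part of the equation of motion.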

What about the coupling terms? They are the same as before except that
there are no time components. This set of rules then gives the correct answers
for Yang-Mills problems.

No fooling around! No ghosts! No nothing! Isn't that wonderful? Yes it's
wonderful, but it's also mysterious because all the rules depend on choosing a time
axis and it's not clear that the answer to a physical problem will be
relativistically invariant. So what? We know that it must be relativistically invariant
because the theory we started with was. True, but the theory diverges and we
encounter the delicate problem of imposing renormalisation without
destroying Lorentz invariance, a problem which took 20 years to solve in the case
of QED. It shouldn't take us this long to solve the problem in Yang-Mills as
we're much smarter now.

Despite these problems these are perfectly good rules and we can go away
and calculate with them.

Suppose we try to compare closed loops with this propagator and with the
propagator we had earlier, viz.

$\delta_{\mu\nu}/k^2$. (8.21)

In using (8.21) we are counting the contribution from $\mathbf{A}_0$; but there are
superfluous degrees of freedom; therefore we must subtract something to get rid of
them, and this something is the ghost. In some sense therefore, the ghost is the
difference between the result one gets by using (8.20) and (8.21). What we
subtract doesn't look relativistically invariant. The subtraction will be
different depending on the orientation of the axes; in this case we have singled out
the time axis and therefore if we rotate the axes things will change. The
problem of showing that the results are independent of the method of subtraction
was solved by Fadde'ev and Popov [26].


9. The equivalence of different gauges 


The purpose of this chapter is to make clear the ideas involved in Fadde'ev
and Popov's demonstration of the equivalence of different gauges. I shall not
prove anything in great detail as this would take too long. We will start with
the formalism of sect. 8, the $\mathbf{A}_0 = 0$ gauge, and attempt to transform it so as
to obtain the formalism in different gauges; in particular we will try to obtain
the system in which the propagator is $\delta_{\mu\nu}/k^2$ ($\partial_\mu \mathbf{A}_\mu = 0$ gauge). The simplest
thing to do would be to write down the rules in the $\mathbf{A}_0 = 0$ gauge and then
transform them to the rules for the $\partial_\mu \mathbf{A}_\mu = 0$ gauge. In order to get more
generality and check that the formalism works in all orders we will use a more
central approach via path integrals. What we shall do is to construct a path
integral for the $\mathbf{A}_0 = 0$ gauge, transform the path integral to the other gauge,
and then get the rules for the other gauge from the path integral. We shall see
that these rules are the same as those obtained in sects. 6 and 7.

As everybody knows, if we have a Lagrangian of the form

$L = \tfrac12 m\dot{q}^2 - V(q)$, (9.1)

then the amplitude for the particle to go along a particular trajectory $q(t)$ is
[18]

$\exp(iS/\hbar)$, (9.2)

where $S$ is the classical action obtained from (9.1). The total amplitude for
the particle to go from a to b is obtained by summing over all trajectories
connecting a and b, viz.

Amplitude $\propto \int \exp\left\{\dfrac{i}{\hbar}\int\left[\dfrac{m\,\dot{q}(t)^2}{2} - V(q(t))\right]dt\right\} Dq(t)$. (9.3)

If we have a system of several variables $q_i(t)$, (9.3) becomes

Amplitude $\propto \int \exp\left\{\dfrac{i}{\hbar}\int\left[\sum_i \dfrac{m_i\,\dot{q}_i(t)^2}{2} - V(q_i(t))\right]dt\right\} Dq_i(t)$. (9.4)

In a field theory the variable $i$ becomes continuous and is denoted by $x$. Eq.
(9.4) then becomes

Amplitude $\propto \int \exp\left\{\dfrac{i}{\hbar}\int \mathcal{L}(A)\,d^3x\,dt\right\} DA(x, t)$, (9.5)

where $\mathcal{L}(A)$ is the Lagrangian density for the theory and the integral over $x$
replaces the sum over $i$ in the limit of $i$ continuous:

$\mathcal{L}(A) = \tfrac12\left(\dfrac{\partial A}{\partial t}\right)^2 - V(A)$. (9.6)


Eq. (9.5) means this: if A is distributed in some way in spacetime, there is a
certain action for this distribution which we can calculate. The classical theory
corresponds to the distribution which makes the action a minimum. Quantum
mechanics corresponds to adding the amplitudes corresponding to the actions
for all possible distributions. We can deduce the classical limit by noticing that
in the integration over field distributions, the amplitudes corresponding to
near-stationary values of the action add coherently whereas those far from the
minimum are changing so rapidly that the amplitudes from neighbouring
distributions tend to cancel. In the limit of $S/\hbar$ being very large, only paths
near the minimum contribute to the integral over distributions and so we
obtain the classical theory.

We now return to the Lagrangian density which we derived for the $\mathbf{A}_0 = 0$
gauge in the previous section (8.14) and substitute it in (9.5). This gives

$\int \exp\left\{i\int\left[\tfrac12\,\dfrac{\partial \mathbf{A}_i}{\partial t}\cdot\dfrac{\partial \mathbf{A}_i}{\partial t} - \tfrac12\,\mathbf{B}_i\cdot\mathbf{B}_i - \mathbf{J}_i\cdot\mathbf{A}_i\right]d^3x\,dt\right\} D\mathbf{A}(x, t)\, D_{\text{matter}}$. (9.7)

$D\mathbf{A}(x, t)$ is defined to mean $\prod_{i,a,x,t} dA_i^a(x, t)$, and in $D_{\text{matter}}$ we have to
integrate over the spin $\tfrac12$ $\psi$ field. This means that we must use the path
integral using Grassmann algebra, which we mentioned earlier.

Unfortunately, from the path integral viewpoint, the expression is not
manifestly Lorentz invariant. However we can write (9.7) as follows:

$\int \exp\left\{i\int\left[-\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + \mathbf{J}_\mu\cdot\mathbf{A}_\mu\right]d^4x\right\} \delta(\mathbf{A}_0(x, t))\, D\mathbf{A}_\mu(x, t)\, D_{\text{matter}}$, (9.8)

where $\delta(\mathbf{A}_0(x, t))$ is defined to mean $\prod_{a,x,t} \delta(A_0^a(x, t))$ and in this
expression the functional integral runs over all four space-time components of
$\mathbf{A}_\mu(x, t)$. Although (9.8) and (9.7) are equivalent (by construction) we note
that in (9.8):
(i) $\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu}$ is invariant under all gauge transformations,
(ii) $\mathbf{J}_\mu\cdot\mathbf{A}_\mu$ is also invariant under gauge transformations provided that the
matter fields transform in the correct way,
(iii) $D\mathbf{A}_\mu$ is also invariant under an infinitesimal gauge transformation

$\mathbf{A}_\mu \to \mathbf{A}_\mu + \partial_\mu \boldsymbol{\alpha} + \boldsymbol{\alpha} \times \mathbf{A}_\mu$. (9.9)

We can see that (iii) is true by the following argument. Since we are
considering $D\mathbf{A}_\mu$, i.e. an infinitesimal, we can drop the $\partial_\mu \boldsymbol{\alpha}$ term since it is higher order
in infinitesimals. $D\mathbf{A}_\mu$ is a volume element in isospace:

$D\mathbf{A}_\mu = \prod_{\mu,x,t} dA_\mu^1(x, t)\, dA_\mu^2(x, t)\, dA_\mu^3(x, t)$. (9.10)

The remaining gauge transformation in (9.9), $\boldsymbol{\alpha} \times \mathbf{A}_\mu$, is a rotation in isospace
and volume elements are invariant under such transformations (cf. dx dy dz
which is invariant under a rotation in three dimensional Euclidean space).
Hence $D\mathbf{A}_\mu$ is invariant.

We have therefore shown that (9.8) is gauge invariant apart from the
$\delta$-function. What happens to it under a gauge transformation? Under a gauge
transformation (not necessarily infinitesimal),

$\mathbf{A}_0 \to G\mathbf{A}_0$, (9.11)

which in general contains the space as well as time components of $\mathbf{A}_\mu$. Since
any $\mathbf{A}_\mu$ distribution can be gauge transformed so that $\mathbf{A}_0 = 0$, the $\delta$-function
is really more universal than it looks. Consequently we can get to any other
gauge by suitably choosing G.

As a specific example, let us choose to transform to the gauge

$\partial_\mu \mathbf{A}_\mu = 0$. (9.12)

We will find that when we transform to this gauge,

$\delta(\mathbf{A}_0) \to \delta(\partial_\mu \mathbf{A}_\mu)\, F(\mathbf{A})$ (9.13)

and we will also find that the factor $F(\mathbf{A})$ can be written in such a way that it
exactly reproduces the ghost term (Lagrangian (7.30)) which we required in
the $\partial_\mu \mathbf{A}_\mu = 0$ gauge.

There are several mathematical tricks which we will need in what follows. 
Because of the vast number of indices flying around, we will often use a sym- 
bolic notation. 

Mathematical trick 1. Since almost all the questions about path integrals
which are important are concerned with changes in the fields, the overall
normalisation of the path integrals is irrelevant. We can, therefore, ignore any
constants (including infinite ones) which appear as multiplicative factors
outside the path integral provided that they do not depend on the fields.

Mathematical trick 2. Suppose that we have a number of variables $x_j$ and
a number of functions $y_i$, which depend on the $x_j$. If we wish to transform
from the $x_j$ to the $y_i$ then the $\delta$-functions transform as follows:

$\delta(y_i - y_i(0)) = \left[\operatorname{Det}\dfrac{\partial y_i}{\partial x_j}\right]^{-1}_{x_j = 0} \delta(x_j)$, (9.14)

or

$\delta(x_j) = \operatorname{Det}\left(\dfrac{\partial y_i}{\partial x_j}\right)_{x_j = 0} \delta(y_i - y_i(0))$,

where as before the $\delta$-functions are the generalised functions, i.e. $\delta(y_i)$ means
$\prod_i \delta(y_i)$. This can be seen by extending the result for one variable $y = f(x)$, viz.

$\delta(y) = \dfrac{\delta(x)}{dy/dx}$, (9.15)

to several variables.
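The one-variable rule can be checked numerically by replacing the δ-function with a narrow Gaussian. The sketch below is only an illustration: the function f(x) = x³ + x − 2 (with its single root at x = 1), the test function g(x) = cos x, the width eps and the integration grid are all arbitrary choices made here, not taken from the text.

```python
import math


def smeared_delta(u, eps):
    """Gaussian of width eps, standing in for the delta function delta(u)."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))


def f(x):
    return x**3 + x - 2.0  # single real root at x = 1


def f_prime(x):
    return 3.0 * x**2 + 1.0


def g(x):
    return math.cos(x)  # smooth test function


def integrate_delta_of_f(eps=1e-3, lo=0.0, hi=2.0, n=200_000):
    """Midpoint-rule estimate of the integral of delta(f(x)) g(x) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += smeared_delta(f(x), eps) * g(x)
    return total * h


if __name__ == "__main__":
    numeric = integrate_delta_of_f()
    predicted = g(1.0) / abs(f_prime(1.0))  # delta(f(x)) -> delta(x - 1) / |f'(1)|
    print(numeric, predicted)
```

The two printed numbers should agree closely, which is the statement that changing the argument of a δ-function costs one inverse factor of the derivative; in several variables that factor becomes the inverse determinant in (9.14).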
Mathematical trick 3. Consider the integral 


$\int \exp(-\tfrac12 c\,p^2)\,dp = \sqrt{2\pi/c}$. (9.16)

We drop the $\sqrt{2\pi}$ factors in what follows since they will be irrelevant because
of mathematical trick 1. Let us now have several variables and look at the
integral

$\int \exp\left\{-\tfrac12 \sum_i c_i p_i^2\right\} \prod_i dp_i$. (9.17)

This clearly has the value $1/\sqrt{\prod_i c_i}$.
If we do a linear transformation on these $p$'s,

$p_i = \sum_j a_{ij}\, p'_j$, (9.18)

then (9.17) becomes

$\int \exp\left\{-\tfrac12 \sum_{i,j} p'_i C_{ij} p'_j\right\} \prod_i dp'_i = \dfrac{1}{\sqrt{\operatorname{Det} C}}$, (9.19)

where the $c$'s and the $C$'s are related by (9.18). Eq. (9.19) is like a path
integral in quantum mechanics for a Bose particle interacting in a way
described by the matrix $C$ (provided that we have made a Wick rotation of the
time axis). If $p_i$ is complex then we are really integrating over
twice as many variables and the square root disappears:

$\int \exp\left\{-\tfrac12 \sum_{i,j} p^{*}_i C_{ij} p_j\right\} \prod_i dp_i\,dp^{*}_i = \dfrac{1}{\operatorname{Det} C}$. (9.20)

If we try to do the same integral for an $a_i$ with Fermi statistics (i.e. we
introduce a Grassmann algebra), we find that, with a suitable definition of what
we mean by an integral in this case,

$\int \exp\left\{\sum_{i,j} a^{*}_i C_{ij} a_j\right\} \prod_i da_i = \operatorname{Det} C$. (9.21)

It will turn out that (9.21) is just the reason that the ghost has Fermi statistics.

When we transform from the $\delta(\mathbf{A}_0)$ to the $\delta(\partial_\mu \mathbf{A}_\mu)$ in the functional
integral, we will pick up a Jacobian similar to that in (9.14). Symbolically we can
think of the variable $x_j$ as being $\mathbf{A}_0(x, t)$ and $y_i$ as being $\partial_\mu \mathbf{A}_\mu(x, t)$ (in this case
$y_i(0)$ is zero). We will then use (9.21) to turn the determinant into a path
integral. Another equation is useful when it comes to finding the coefficients $c_{ij}$:
consider

$y_i(x_j + p_j) - y_i(x_j) = c_{ij}\, p_j$, (9.22)

where we have defined $c_{ij} = \partial y_i/\partial x_j$. On multiplying by $p_i$ we get

$p_i\,[\,y_i(x_j + p_j) - y_i(x_j)\,] = p_i\, c_{ij}\, p_j$. (9.23)

The term on the r.h.s. has the same form as the expression in (9.19). We are
trying to transform from the gauge $\mathbf{A}_0 = 0$ to the gauge $\partial_\mu \mathbf{A}_\mu = 0$ and
therefore we need the matrix

$\Delta(\partial_\mu \mathbf{A}_\mu)/\Delta(\mathbf{A}_0)$, (9.24)


which is analogous to the matrix $\partial y_i/\partial x_j$. This matrix $\Delta(\partial_\mu \mathbf{A}_\mu(y))/\Delta \mathbf{A}_0(x)$
has infinite dimension corresponding to the continuous variables x and y, and
in addition has isospin labels. If we try to evaluate (9.24) directly, we will run
into difficulties since the two gauges are linked by a finite gauge
transformation which depends on the original fields. (We know that finite gauge
transformations are difficult to handle.) The derivative $dy_i/dx_j$ can fortunately be
obtained in an indirect way. Suppose $y_i$ and $x_j$ are changed by changing a variable
$t$. We may be able to evaluate $dy_i/dx_j$ by first evaluating $dy_i/dt$ and $dx_j/dt$
and then using

$\dfrac{dy_i}{dx_j} = \dfrac{dy_i}{dt}\,\dfrac{dt}{dx_j}$. (9.25)

It was the very clever idea of Fadde'ev to extend this idea to evaluate (9.24):
in this case he took $t$ to be an infinitesimal gauge transformation. If we do an
infinitesimal gauge transformation on $\mathbf{A}_0$ (gauge parameter $\boldsymbol{\alpha}$) we get:

$\mathbf{A}'_0 = \mathbf{A}_0 + \boldsymbol{\alpha} \times \mathbf{A}_0 + \partial_0 \boldsymbol{\alpha}$. (9.26)

Transform $\partial_\mu \mathbf{A}_\mu$ by an infinitesimal gauge transformation with parameter $\boldsymbol{\beta}$:

$\partial_\mu \mathbf{A}'_\mu = \partial_\mu \mathbf{A}_\mu + \boldsymbol{\beta} \times \partial_\mu \mathbf{A}_\mu + \partial_\mu \boldsymbol{\beta} \times \mathbf{A}_\mu + \partial_\mu \partial_\mu \boldsymbol{\beta}$. (9.27)

As we are trying to use an equation of the form (9.25) to evaluate the matrix
(9.24), we would naively think that we should take $\boldsymbol{\alpha} = \boldsymbol{\beta}$. But wait a minute.
What we really need is for the gauge parameter $\boldsymbol{\beta}$ to be $\boldsymbol{\alpha}$ gauge transformed
by the finite gauge transformation G, defined by (9.11). (All the fields are
rotated by the gauge transformation G and so what we really want is to know
how the fields $\mathbf{A}_0$ and $\partial_\mu \mathbf{A}_\mu$ change in terms of the variables $\boldsymbol{\alpha}$ and $G\boldsymbol{\alpha}$
respectively.) However we will use $\boldsymbol{\alpha}$ for $\boldsymbol{\alpha}$ gauge transformed, for ease of notation in
what follows. (This subtlety is not vital to the argument and must not be
allowed to confuse us.) Since $\mathbf{A}_0 = 0$ in the original gauge, the change in the
original $\mathbf{A}_0$ produced by $\boldsymbol{\alpha}$ (from (9.26)) is

$\partial_0 \boldsymbol{\alpha}$. (9.28)

The matrix (9.24) is to be considered as the product of the two factors

$\dfrac{\Delta(\partial_\mu \mathbf{A}_\mu)}{\Delta(\boldsymbol{\alpha})}$ and $\dfrac{\Delta(\boldsymbol{\alpha})}{\Delta(\mathbf{A}_0)}$. (9.29)

So we have really factored the matrix which described how one gauge varies
with respect to the other gauge into two parts, each of which tells us how one
of the gauges varies with respect to the infinitesimal gauge transformation $\boldsymbol{\alpha}$.
The second of these two factors is independent of the fields since $\Delta(\mathbf{A}_0)$ is
independent of the fields (9.28). Using maths trick 1 we can drop this factor.
The change in $\partial_\mu \mathbf{A}_\mu$ is, from (9.27):

$\partial_\mu \boldsymbol{\alpha} \times \mathbf{A}_\mu + \partial_\mu \partial_\mu \boldsymbol{\alpha}$, (9.30)

where we have used the fact that $\partial_\mu \mathbf{A}_\mu = 0$ and we have put $\boldsymbol{\beta} = \boldsymbol{\alpha}$. This
depends on the field $\mathbf{A}_\mu$ and so we cannot get rid of the $\Delta(\partial_\mu \mathbf{A}_\mu)/\Delta(\boldsymbol{\alpha})$ term in
the same way as we got rid of $\Delta(\mathbf{A}_0)/\Delta(\boldsymbol{\alpha})$.

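We can now write (9.8) as

$\int \exp\left\{i\int\left[-\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + \mathbf{J}_\mu\cdot\mathbf{A}_\mu\right]d^4x\right\} \delta(\partial_\mu \mathbf{A}_\mu)\, F(\mathbf{A}_\mu)\, D\mathbf{A}_\mu\, D_{\text{matter}}$, (9.31)

where $F(\mathbf{A}_\mu)$ is the Jacobian for the transformation from $\delta(\mathbf{A}_0)$ to $\delta(\partial_\mu \mathbf{A}_\mu)$
which we have identified as

$\operatorname{Det}\dfrac{\Delta(\partial_\mu \mathbf{A}_\mu)}{\Delta(\boldsymbol{\alpha})}$. (9.32)

This can be written in the form

$\int \exp\left\{\dfrac{i}{2}\int\left[\mathbf{P}^{*}\cdot(\partial_\mu \partial_\mu - \mathbf{A}_\mu \times \partial_\mu)\mathbf{P}\right]d^4x\right\} D\mathbf{P}$, (9.33)

where $\mathbf{P}$ is a Fermi particle. We can show this as follows. Let (9.32) be
symbolically $\operatorname{Det} c_{ij}$; this can be written, using maths trick 3 (9.21), as

$\int \exp(P^{*}_i c_{ij} P_j) \prod_i dP_i$. (9.34)

Use (9.23) to re-write the exponent of this, giving

$\int \exp\{-\tfrac12 P^{*}_i\,[\,y_i(x + P) - y_i(x)\,]\} \prod_i dP_i$. (9.35)

We now convert back from the symbolism to the $\mathbf{A}$'s, putting as before $y_i \to \partial_\mu \mathbf{A}_\mu$
and $x_j \to \mathbf{A}_0$:

$\int \exp\{-\tfrac12\,\mathbf{P}^{*}\cdot\Delta(\partial_\mu \mathbf{A}_\mu)\}\, D\mathbf{P}$. (9.36)

Substituting for $\Delta(\partial_\mu \mathbf{A}_\mu)$ from (9.30) and carrying out a Wick rotation, we
get eq. (9.33). Integrating by parts in (9.33) and noting that $\partial_\mu \mathbf{A}_\mu = 0$ by the
choice of gauge, we get

$\int \exp\left\{\dfrac{i}{2}\int\left(\partial_\mu \mathbf{P}^{*}\cdot\partial_\mu \mathbf{P} + \partial_\mu \mathbf{P}^{*}\cdot\mathbf{A}_\mu \times \mathbf{P}\right)d^4x\right\} D\mathbf{P}$, (9.37)

and thus we have obtained the same propagator and coupling as we did in
sect. 7.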
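The Gaussian identity (9.19), on which the determinant manipulations above rest, can be checked by brute force in two dimensions. The sketch below is illustrative only: the matrix C, the integration box and the grid spacing are arbitrary choices made here, and the factor 2π that the text drops under mathematical trick 1 is kept so that the comparison is exact. The Grassmann result (9.21) is not reproduced by this code.

```python
import math


def gaussian_integral_2d(C, half_width=8.0, step=0.05):
    """Midpoint-rule estimate of the integral of exp(-p.C.p / 2) over the plane."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n):
        p1 = -half_width + (i + 0.5) * step
        for j in range(n):
            p2 = -half_width + (j + 0.5) * step
            quad = (C[0][0] * p1 * p1
                    + (C[0][1] + C[1][0]) * p1 * p2
                    + C[1][1] * p2 * p2)
            total += math.exp(-0.5 * quad)
    return total * step * step


def det2(C):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return C[0][0] * C[1][1] - C[0][1] * C[1][0]


if __name__ == "__main__":
    C = [[2.0, 0.5], [0.5, 1.0]]  # symmetric, positive definite
    numeric = gaussian_integral_2d(C)
    predicted = 2.0 * math.pi / math.sqrt(det2(C))
    print(numeric, predicted)
```

The integral scales as one over the square root of Det C for commuting (Bose) variables. The Fermi case quoted in (9.21) gives a positive power of the determinant instead, which is why a Jacobian such as (9.32) can be represented by a loop of anticommuting ghost fields.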


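In general, if we wish to transform to the gauge $f(\mathbf{A}_\mu) = 0$, we first calculate

$f(\mathbf{A}_\mu + D_\mu \boldsymbol{\alpha}) = f(\mathbf{A}_\mu) + f'(\mathbf{A}_\mu)\, D_\mu \boldsymbol{\alpha}$, (9.38)

cf. (9.27) for the case $f(\mathbf{A}_\mu) = \partial_\mu \mathbf{A}_\mu$. Going through the above procedure,

$\delta(\mathbf{A}_0) \to \delta[f(\mathbf{A}_\mu)] \int \exp\left\{\dfrac{i}{2}\int \mathbf{P}^{*}\cdot[f'(\mathbf{A}_\mu)\, D_\mu]\,\mathbf{P}\, d^4x\right\} D\mathbf{P}$. (9.39)

To summarise, we have transformed the path integral (9.7) to

$\int \exp\left\{i\int\left[-\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + \mathbf{J}_\mu\cdot\mathbf{A}_\mu + \tfrac12\,\partial_\mu \mathbf{P}^{*}\cdot\partial_\mu \mathbf{P} + \tfrac12\,\partial_\mu \mathbf{P}^{*}\cdot(\mathbf{A}_\mu \times \mathbf{P})\right]d^4x\right\}$
$\times\ \delta(\partial_\mu \mathbf{A}_\mu)\, D\mathbf{A}_\mu\, D\mathbf{P}\, D_{\text{matter}}$. (9.40)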


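People have tried to make rules from this, but there is an easier way. From
(9.14), we note that the determinant is independent of the value of $y_i(0)$ in
the $\delta$-function. Hence if instead of choosing the gauge $\partial_\mu \mathbf{A}_\mu = 0$ (i.e. $y_i(0) = 0$)
we chose the gauge

$\partial_\mu \mathbf{A}_\mu = \mathbf{B}(x)$ (i.e. $y_i(0) = \mathbf{B}$), (9.41)

the only changes this will produce will be to replace $\delta(\partial_\mu \mathbf{A}_\mu)$ in (9.40) by
$\delta(\partial_\mu \mathbf{A}_\mu - \mathbf{B})$, and add an additional piece to (9.30). The last piece produces an
additional term

$\tfrac12\,\mathbf{P}^{*}\cdot(\mathbf{P} \times \mathbf{B})$ (9.42)

which will cancel the extra term appearing in (9.37) when we integrate by parts,
since now $\partial_\mu \mathbf{A}_\mu = \mathbf{B}$. We note that we can multiply (9.40) by the constant

$\exp\left\{-\dfrac{i}{2}\int \mathbf{B}\cdot\mathbf{B}\, d^4x\right\}$ (9.43)

as this does not produce any difference in the theory.
Integrating over $\mathbf{B}$ using $\delta(\partial_\mu \mathbf{A}_\mu - \mathbf{B})$ we obtain the following path integral:

$\int \exp\left\{i\int\left[-\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + \mathbf{J}_\mu\cdot\mathbf{A}_\mu - \tfrac12(\partial_\mu \mathbf{A}_\mu)^2 + \tfrac12\,\partial_\mu \mathbf{P}^{*}\cdot\partial_\mu \mathbf{P} + \tfrac12\,\partial_\mu \mathbf{P}^{*}\cdot(\mathbf{A}_\mu \times \mathbf{P})\right]d^4x\right\} D\mathbf{A}_\mu\, D\mathbf{P}\, D_{\text{matter}}$. (9.44)

We have recovered the Lagrangian (7.33) which led to the rules in the $\partial_\mu \mathbf{A}_\mu = 0$
gauge, and we have therefore shown that the gauges $\mathbf{A}_0 = 0$ and $\partial_\mu \mathbf{A}_\mu = 0$ are
equivalent.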

It is possible to choose other gauges by replacing the term $\tfrac12(\partial_\mu \mathbf{A}_\mu)\cdot(\partial_\nu \mathbf{A}_\nu)$
by $(\lambda/2)(\partial_\mu \mathbf{A}_\mu)\cdot(\partial_\nu \mathbf{A}_\nu)$, where $\lambda$ takes any value between 0 and 1. This of
course will produce a different effective Lagrangian. (What people really do
when they choose a gauge is to construct an effective Lagrangian and then
calculate with it.) This produces a propagator for the $\mathbf{A}_\mu$ of the form

$\dfrac{\delta_{\mu\nu} - (1 - 1/\lambda)(k_\mu k_\nu/k^2)}{k^2}$, (9.45)

which of course must produce the same physics. The modified propagator is
useful when it comes to proving the renormalisability of gauge theories (about
which I would have said more if I had more time) since if we take the $\lambda = 1$
gauge we can then use naive power counting (i.e. the smart thing to do). Naive
power counting tells us whether something has a danger of diverging; it
doesn't prove anything: theories which look divergent by naive power
counting may not be, due to cancellations among the divergences. In some gauges
naive power counting makes the theory look horribly divergent but we know
that this cannot be the case. In the $\lambda = 1$ gauge, there are no powers of
momenta in the numerator of the propagator and so naively the power counting
is better.

Fadde'ev and Popov, in fact, proceeded in the following way, without
choosing a specific gauge. If we were blindly to write the simplest possible
path integral for the theory, it would be

$\int \exp\left\{i\int\left[-\tfrac14\,\mathbf{E}_{\mu\nu}\cdot\mathbf{E}_{\mu\nu} + \mathbf{J}_\mu\cdot\mathbf{A}_\mu\right]d^4x\right\} D\mathbf{A}_\mu\, D_{\text{matter}}$. (9.46)

However, if we make a gauge transformation on the exponent, it does not
change; so when we integrate over the $\mathbf{A}_\mu$'s which are connected by this gauge
transformation, we will get infinity. An analogy is when we evaluate the integral

$\int \exp\{-a(x^2 + 2xy + y^2)\}\,dx\,dy$. (9.47)

Changing variables to $x - y$ and $x + y$, this becomes, up to a constant,

$\int \exp\{-a(x + y)^2\}\,d(x + y)\,d(x - y)$. (9.48)

Since the exponent is independent of $(x - y)$, the integral over $(x - y)$ diverges.
A similar thing happens in the path integral except that the situation is more
complicated: the Jacobian in the change of variables depends on $\mathbf{A}_\mu$ and this
is what introduces the ghosts we have been discussing.
It is shown that if, in a calculation of high-transverse-momentum (p_⊥) meson production in hadron-hadron
collisions, one includes not only the scale-breaking effects that might be expected from asymptotically free
theories but also the effects due to the transverse momentum of quarks in hadrons and further adds
contributions from quark-gluon and gluon-gluon scattering to those of quark-quark scattering, then the results
are not inconsistent with the data. The approach yields the correct magnitude and an apparent approximate
1/p_⊥⁸ behavior in accord with single-particle data for the energy range currently observed. Two-particle
correlations are examined. Because of scale-breaking effects and the presence of gluons, the theory does not
have the problem of predicting too many away-side hadrons at large p_⊥ as did an earlier quark-quark
scattering “black-box” approach. We conclude that the quantum-chromodynamics approach is in reasonable
accord with the data although theoretical uncertainties (especially at low p_⊥) make incontrovertible
conclusions impossible at present. Crucial tests of the theory require higher p_⊥ than are now available;
estimates for this region are made.


I. INTRODUCTION


We investigate whether the present experimental behavior of mesons with
large transverse momentum in hadron-hadron collisions is consistent
with the theory of quantum-chromodynamics (QCD) with asymptotic freedom,
at least as the theory is now partially understood. It is shown that if things
behave more or less according to current theoretical ideas, the experimental
data at high p_⊥ would be explicable with reasonable choices for
currently unknown quantities (such as the distribution of gluons in the
proton and the fragmentation functions describing gluon jets). The theory
of QCD might provide an adequate explanation of all the experimental
results that we have discussed in previous papers (hereafter referred to
as FF1¹ and FFF²).

We and others³⁻⁹ investigated this large-p_⊥ experimental behavior
phenomenologically. In particular, the view that large-p_⊥ mesons are
generated by hard large-angle collisions between quarks present in the
initial hadrons has been found to be very fruitful. The outgoing quarks are
assumed not to come out freely, but to distribute their momentum over a
series of hadrons that form a “jet” in the general direction of the
“original quark.”

This view when compared in detail with all available experiments is found
to be successful in many regards. In particular, the distributions
of particles in single-arm experiments and the large ratio of jet to
particle cross sections are successfully interpreted or predicted. To do this,
the differential cross section for quark-quark collisions, dσ̂/dt̂
(“black-box” cross sections), was taken to vary as f(t̂/ŝ) ŝ⁻⁴, in
disagreement with field theory which expects ŝ⁻² behavior.
The size and angular dependence, 2300 mb/(−ŝ t̂³), was chosen empirically.
In addition, in FF1 and FFF, the effects of collisions of gluons
(as constituents within hadrons) were omitted. At that time, there was no
experimental evidence to require their existence.

The f(t̂/ŝ) ŝ⁻⁴ behavior was chosen as a direct result of assumptions that
parton distributions scaled with energy and the observation that experiments
done at two or more different center-of-mass energies W but at a fixed
x_⊥ = 2p_⊥/W and fixed angle showed the invariant cross section
varying as p_⊥⁻⁸.

It is necessary to include a transverse momentum, ⟨k_⊥⟩_{h→q}, of the quarks
within the initial hadrons to account for much of the data²,¹⁰⁻¹⁴; for
example, observations in two-arm experiments of P_out (Ref. 15) (i.e., lack
of coplanarity of the beam, target, and two outgoing hadrons).¹⁶ Some
of the apparent large p_⊥ of a hadron can be due to the initial transverse
momentum of the incoming quarks. This vitiates the direct scaling connection
between a dσ̂/dt̂ = f(t̂/ŝ) ŝ⁻ⁿ and the invariant cross section behaving
like p_⊥⁻²ⁿ at fixed x_⊥ and θ_cm. However, as long as ⟨k_⊥⟩_{h→q} is
500 MeV or less, the effects on the single-particle invariant cross
section are not great and in FFF we limited ourselves to values not higher
than this.

There are, however, two serious discrepancies with experiment which
indicate that, in spite of the successes, something is wrong with the
black-box model. First, recently measured values of P_out seem to be higher
than expected so that ⟨k_⊥⟩_{h→q} must exceed 500 MeV. That would mean
that the scaling argument leading to ŝ⁻⁴ for the quark-quark cross section
was wrong in the range of data used. That is, the p_⊥⁻⁸ behavior of
experiment would be accidental and not fundamental. For example, a
quark-quark cross section varying only as ŝ⁻³ could (with a ⟨k_⊥⟩_{h→q}
big enough to explain the large P_out’s observed) produce an
apparent p_⊥⁻⁸ hadron scaling.

Second, in two-arm experiments with events triggered on one side by a
high-p_⊥ hadron (the “toward” side), the number of particles with large
p_⊥ on the opposite side (the “away” side) was experimentally only about
1/3 of the number predicted. Some of this is accounted for by increasing
⟨k_⊥⟩_{h→q} but not all, by far. The only explanation available in the
approach is that the outgoing momentum on the away side is distributed more
softly (distributed among more hadrons of lower momentum) than is typical
of a quark; that is, there is more than one component contributing to
the high-transverse-momentum jets. This is evidence for the need to include
gluons as well as quarks in the description of high-p_⊥ phenomena.
It requires both gluons and the assumption that gluons fragment into a
distribution of hadrons of lower average momentum than does a quark.

These discrepancies lead us to include gluons in an analysis of high-p_⊥
hadron-hadron collisions, and to the further suggestion that QCD field
theory might not be inconsistent with what is observed.
Although with large values of ⟨k_⊥⟩_{h→q}, the scattering cross section
behaving like f(t̂/ŝ) ŝ⁻³ will yield a p_⊥⁻⁸ behavior over the range of
data, this still differs from the naive field-theory expectation of
ŝ⁻². (Including gluons does not help produce a p_⊥⁻⁸ behavior.) But in the
theory of QCD, there are a number of small scale-breaking effects to
notice. The effective coupling constant falls logarithmically with energy.
The incoming parton distributions should not scale, but at high x should
fall and at small x rise as Q² increases. Effects in this direction are
already seen in ep and μp scattering and have been analyzed in Refs. 17-20.
An analogous modification of the fragmentation functions D_q^h(z) is also
expected theoretically. None of these effects alone changes the effective
apparent p_⊥ power index N_eff in p_⊥^(−N_eff) very much
and yet they all work in the same direction and together, as we show,
they can change N_eff from the naive 4 to about 6 in the energy range of
present experiments. (The scale breaking due to the large value of
⟨k_⊥⟩_{h→q} then brings N_eff to about 8 over this range.)

Thus the possibility exists that QCD can provide the full explanation of
all the high-energy high-p_⊥ experimental results. We analyze this
possibility in this paper. Some of our findings have been
presented in Refs. 21 and 22. We wish to explain our approach in detail
here. The net result is to demonstrate that this possibility is very real.²³

The main problem in such an analysis is that no complete calculation of a
prediction for QCD for any phenomenon, even qualitative ones such
as the confinement of quarks, has yet been made.
At present, the mathematical complexities are still too great. However,
at very high energy or high momentum transfer Q, the theory is
asymptotically free; the effective coupling constant falls
with increasing Q². As emphasized by Politzer,”®
this permits calculation of those parts of a collision involving high Q².
Yet every real process involves high and low Q² together and the precise
separation of these parts and hence exact definition
of the theory for hadron-hadron collisions is a
problem for the (we hope near) future. We shall
proceed here in a preliminary way.

What we do is include all the ingredients thought to be present from
existing ideas on QCD. We assume that the effective coupling behaves as
α_s(Q²) = 12π/[25 ln(Q²/Λ²)] with Λ determined from the scale breaking
observed in ep and μp collisions. The distribution of constituents i
(quarks and gluons) in the proton G_{p→i}(x, Q²) and their
fragmentation functions D_i^h(z, Q²) are given a Q² dependence in accord
with QCD analyses of ep and μp collisions. The theory gives formulas for
scaling violations (Q² dependences), but the functions must be known at
some nominal reference momenta, say Q₀². The distributions G_{p→q}(x, Q₀²)
are determined from fits to the ep and μp data.
For the quark fragmentation functions D_q^h(z, Q₀²) we use the
distributions of our previous quark-jet paper²⁷ and take Q₀² = 4 GeV².
We hope that data on the quark fragmentation functions from e⁺e⁻, ep, μp,
or νp experiments will soon be available to test the Q² dependence
expected from QCD and to allow for a more precise determination of them.

For the fundamental constituent cross sections for quark-quark,
quark-gluon, and gluon-gluon scattering, we have taken the first-order
perturbation scattering expected from QCD²⁸,²⁹ (see Table I) and normalized
absolutely by the effective coupling α_s(Q²). This replaces the arbitrary
size, energy dependence, and angular dependence of the
black box in our previous papers.
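
As an illustration of how these first-order constituent cross sections enter a calculation, the following sketch encodes the standard lowest-order, spin- and color-averaged |A|² for massless two-to-two parton scattering and the relation dσ̂/dt̂ = π α_s² |A|²/ŝ² quoted with Table I. The process labels, function names, and the sample value of α_s are hypothetical choices made for this example; the expressions are the textbook ones and are assumed here to coincide with Table I.

```python
import math


def amp2(process: str, s: float, t: float, u: float) -> float:
    """Spin- and color-averaged |A|^2 for a massless 2 -> 2 subprocess.

    s, t, u are the Mandelstam invariants of the constituent subprocess.
    The string labels are illustrative names, not notation from the paper.
    """
    if process == "qq'->qq'":  # unlike flavors (also q qbar' -> q qbar')
        return 4 / 9 * (s * s + u * u) / (t * t)
    if process == "qq->qq":  # identical flavors
        return (4 / 9 * ((s * s + u * u) / (t * t) + (s * s + t * t) / (u * u))
                - 8 / 27 * s * s / (u * t))
    if process == "qqbar->qqbar":  # same flavor quark-antiquark
        return (4 / 9 * ((s * s + u * u) / (t * t) + (t * t + u * u) / (s * s))
                - 8 / 27 * u * u / (s * t))
    if process == "qqbar->gg":
        return 32 / 27 * (u * u + t * t) / (u * t) - 8 / 3 * (u * u + t * t) / (s * s)
    if process == "gg->qqbar":
        return 1 / 6 * (u * u + t * t) / (u * t) - 3 / 8 * (u * u + t * t) / (s * s)
    if process == "qg->qg":
        return -4 / 9 * (u * u + s * s) / (u * s) + (u * u + s * s) / (t * t)
    if process == "gg->gg":
        return 9 / 2 * (3 - u * t / (s * s) - u * s / (t * t) - s * t / (u * u))
    raise ValueError(f"unknown process: {process}")


def dsigma_dt(process: str, s: float, t: float, u: float, alpha_s: float) -> float:
    """d(sigma)/dt = pi * alpha_s^2 * |A|^2 / s^2, in GeV^-4 when s is in GeV^2."""
    return math.pi * alpha_s ** 2 * amp2(process, s, t, u) / s ** 2


if __name__ == "__main__":
    s, t, u = 4.0, -2.0, -2.0  # 90-degree scattering: t = u = -s/2
    for name in ("qq'->qq'", "qg->qg", "gg->gg"):
        print(name, amp2(name, s, t, u), dsigma_dt(name, s, t, u, alpha_s=0.3))
```

At 90° the gluon-gluon entry is more than ten times the unlike-flavor quark-quark entry, which is one way to see why the assumed gluon content of the proton matters so much for the predictions.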

Thus it would seem that we have little freedom of arbitrary choice. This
would be true were it not for the distributions of gluons in the proton
G_{p→g}(x, Q₀²), and the distribution of hadrons in a
gluon jet D_g^h(z, Q₀²), at the reference momenta.
These two functions are not constrained much by other experiments and are
thus essentially arbitrary. We have chosen these with an eye to
experiment. In particular, we have chosen D_g^h(z, Q₀²)
TABLE I. Cross sections for the various constituent quark-quark,
quark-gluon, and gluon-gluon subprocesses.ᵃ The differential cross section
is given by dσ̂/dt̂ = π α_s²(Q²) |A|²/ŝ², where α_s(Q²) is the effective
coupling given by Eq. (3.1).

Subprocess                       |A|²

1. q_i q_j → q_i q_j,
   q_i q̄_j → q_i q̄_j  (i ≠ j)    (4/9) (ŝ² + û²)/t̂²
2. q_i q_i → q_i q_i             (4/9) [(ŝ² + û²)/t̂² + (ŝ² + t̂²)/û²] − (8/27) ŝ²/(û t̂)
3. q_i q̄_i → q_i q̄_i             (4/9) [(ŝ² + û²)/t̂² + (t̂² + û²)/ŝ²] − (8/27) û²/(ŝ t̂)
4. q_i q̄_i → g g                 (32/27) (û² + t̂²)/(û t̂) − (8/3) (û² + t̂²)/ŝ²
5. g g → q_i q̄_i                 (1/6) (û² + t̂²)/(û t̂) − (3/8) (û² + t̂²)/ŝ²
6. q_i g → q_i g                 −(4/9) (û² + ŝ²)/(û ŝ) + (û² + ŝ²)/t̂²
7. g g → g g                     (9/2) [3 − û t̂/ŝ² − û ŝ/t̂² − ŝ t̂/û²]


ᵃ This table is identical to that in Ref. 29.


“softer” than D_q^h(z, Q₀²) because of experimental features of high-p_⊥
processes, and our success depends on this choice in several of the
comparisons.

The theory of QCD also may explain how P_out can be so large.²⁸,³⁰⁻³³ For
example, sometimes two quarks hit and scatter to two quarks plus a
relatively hard radiated gluon, so that the two outgoing quark jets are out
of momentum coplanarity by the momentum of the gluon. These effects can
be calculated, and we are engaged in such calculations. What we have done
here is a temporary expedient. We have simply taken the k_⊥ distribution
measured for μ⁺μ⁻ pairs in pp collisions (where similar gluon emissions are
possible) as a measure of an “effective” k_⊥ of quarks in the
initial hadrons to mimic the effect of such 2→3 constituent processes.
This is not precisely correct and the Q² and x dependence of the high-k_⊥
tail to the effective transverse momentum of quarks in the hadrons is not
handled properly in this manner. We hope to improve on this at a
later date.

We begin in Sec. II by reviewing the successes of the quark-quark black-box
approach and examining closely its failures. The ingredients
used in our QCD approach to high-p_⊥ processes are explained in Sec. III
and the results presented in Sec. IV. We reserve Sec. V for summary and
conclusions. The agreement with experiment is very satisfactory. Quantum
chromodynamics might well be the correct theory behind these
phenomena.


II. THE QUARK-QUARK SCATTERING BLACK-BOX
APPROACH¹⁻¹⁰


A. Successes 


In spite of the rather ad hoc way in which we adjusted the quark-quark
scattering cross section dσ̂/dt̂, many predictions of the black-box approach
did not depend sensitively on it and were in agreement with experiment.
As discussed in the summary of FFF, the conclusions that did not depend
strongly on the precise value of the internal transverse momenta of the
quarks within hadrons were the most successful. They include:

(1) Predictions for the large-p_⊥ single-particle ratios were quite
successful. The π⁺/π⁻ (and K⁺/K⁻) ratios were predicted to become larger
at high x_⊥ since in this region the constituents that collide are
predominantly u quarks that fragment more often into π⁺ (K⁺) than
π⁻ (K⁻). Similarly, the mixing of the η meson implied that η/π⁰
should be about ½ at high x_⊥ also in agreement with
data. The high-x_⊥ K⁺/π⁺ ratio was used to deduce
that D_u^{K⁺}(z)/D_u^{π⁺}(z) must be about ½ at large z.
Recent lepton data are consistent with this deduction.³⁴

(2) The number of π⁰’s³⁵ and jets³⁶⁻³⁸ produced with a pion or proton beam
is consistent with the expectation that the quarks in a pion carry, on the
average, more momentum than they do in a proton.

(3) The quark-scattering approach predicted that the cross section for
producing a jet of hadrons is considerably larger than that for producing
a single meson at the same p_⊥.³⁹ For example,
the model predicted a jet (quark) to single particle
(π⁰) ratio of about 370 at x_⊥ = 0.4 and θ_cm = 90°.
The expectations were in good agreement with
subsequent jet trigger experiments.³⁶⁻³⁸

(4) A distinctive feature of the model was that high-p_⊥ particles are not
isolated but members of a cluster (jet) of particles representing the
fragmentation of the quark. In single-particle
triggers, one expected to see the remainder of
the jet as an enhancement of associated particles
in roughly the same direction as the trigger. With
one additional assumption?! about the character
of fragmentation, the number of particles accompanying a large-p_⊥ trigger
could be successfully understood.

(5) There is now considerable experimental support for the overall four
jet structure shown in Fig. 1 for a large-p_⊥ event.

These agreements between the quark scattering approach and experiment
indicate strongly that quarks play an important role in the production of
high-p_⊥ mesons. In particular, they show that the mesons responsible
for high-p_⊥ triggers probably arise from quarks
that have fragmented in a manner similar to that
observed in lepton-initiated processes.


FIG. 1. (a) Illustration of the four-jet structure resulting from a beam
hadron (entering at left along dotted line) colliding with a target hadron
(entering at right along dotted line) in the c.m. frame: two jets with large
p_⊥ (collection of particles moving roughly in the same
direction), one called the “toward” (trigger) side and
one on the “away” side; and two jets with small p_⊥ that
result from the breakup of the beam and target hadrons
(usually referred to as the “soft hadronic” background).
(b) Illustration of the underlying structure of the large-p_⊥
process A + B → h₁ + h₂ + X. The large-p_⊥ trigger hadron h₁ occurs as
the result of a large-angle scattering of constituents
(q_a + q_b → q_c + q_d), followed by the decay
or fragmentation of constituent c into a towards-side
jet of hadrons (one being the trigger h₁) and constituent
d into an away-side jet of hadrons (one being h₂). The
quantities x_a, x_b, k_⊥a, k_⊥b are the longitudinal fraction
of the incoming hadrons A, B momentum and perpendicular
momentum of constituents a, b and z_c, z_d, k_⊥c, k_⊥d
are the fraction of the outgoing constituents’ longitudinal
momentum and perpendicular momentum carried by the
detected hadrons h₁ and h₂.


B. Failures 


As noted in the Introduction, the quark scattering black-box model does
not agree in detail with all the results of high-p_⊥ experiments. It
disagrees with data in the following ways.⁴⁰

(1) Large-p_⊥ events are far less coplanar than first expected from a
two-to-two scattering subprocess as shown in Fig. 1. Our first guess***
that ⟨k_⊥⟩_{h→q} = ⟨k_⊥⟩_{q→h} = 330 MeV resulted in a too
narrow away-side P_out distribution. Even our
final choice in FFF of ⟨k_⊥⟩_{h→q} = 500 MeV yields
more coplanarity than seen in recent experiments.

(2) The quark-quark scattering model predicts too many high-p_⊥ particles
on the away side of a large-p_⊥ trigger. For example, the number of
away hadrons with z_p > 0.5 (Ref. 42) for a p_⊥ = 4.5 GeV/c trigger at
θ = 45° and W = 53 GeV is predicted in FFF to be 3-4 times larger than
seen experimentally by the CCHK group.¹⁶

(3) Not only does the model predict too many away-side particles, it
predicts many more positives than negatives on the away side. For a
trigger p_⊥ of 3.0 GeV/c, θ_cm = 90° and W = 53 GeV, the model predicts
about 50% more positives than negatives on the away side with p_⊥(away)
> 1.5 GeV/c. Recent data from the CERN ISR
(Ref. 43) show about equal away-side positives
and negatives under these circumstances.

These are serious problems for the model. The last two
imply that (at the small-x_⊥ values probed by ISR
experiments) the recoiling away-side parton does
not fragment in a manner similar to that observed
by lepton experiments. Furthermore, as discussed in FFF Table 4, we cannot
simply increase ⟨k_⊥⟩_{h→q} to improve our agreement with (1) and (2).
The value of 500 MeV was as large as we could
take in FFF without spoiling our agreement with
the energy dependence of the single-particle cross
section. We feel that it is not useful at present
to try to “fiddle” the quark scattering model to
make it agree with recent experiments, particularly since there is an
emerging candidate theory of strong interactions that apparently has the
features necessary to repair the failures of the black-box model.


III. THE INGREDIENTS TO THE QCD APPROACH


A. Effective coupling α_s(Q²)


The effective strong-interaction coupling constant falls logarithmically
with increasing Q², where Q is some characteristic momentum in a collision.
In general, the effective quark-quark-gluon coupling is expected to have
the form

$$ \alpha_s(Q^2) = \frac{12\pi}{(33 - 2 n_f)\,\ln(Q^2/\Lambda^2) + C} , \qquad (3.1) $$

where n_f is the number of quark flavors (we use n_f = 4). The constant C
represents corrections that, in general, differ from process to process
but can, in principle, be calculated (although this might be quite
difficult in practice).⁴⁴ The quantity Λ is an unknown scale factor that
can be determined from the amount of “scale breaking” observed in a given
experiment. Analysis of the scale breaking in ep and μp collisions
indicates that Λ is in the range 0.3 to 0.7 GeV/c (with C = 0).¹⁷⁻²⁰

For ep collisions, Q is the four-momentum transfer from the electron to the
quark. On the other hand, the correct kinematic quantity to use
for Q² in the constituent subprocess shown in Fig. 1 is not known. This
problem is, of course, related to the unknown C in (3.1). For definiteness,
we will take C = 0 and choose

$$ Q^2 = \frac{2\hat{s}\hat{t}\hat{u}}{\hat{s}^2 + \hat{t}^2 + \hat{u}^2} , \qquad (3.2) $$

where ŝ, t̂, and û are the usual Mandelstam s, t, and u invariants but for
the constituent subprocess. This form for Q² is symmetric in ŝ, t̂, and û
and approaches −t̂ in the case t̂ ≪ ŝ. This uncertainty
in the form for Q² and, correspondingly, the lack
of knowledge of C makes predictions at low Q²
(i.e., low p_⊥) in hadron-hadron collisions a bit
uncertain.
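
A short sketch of the two ingredients just defined may help fix the numbers. It assumes the running coupling has the one-loop form 12π/[(33 − 2n_f) ln(Q²/Λ²) + C] with the constant C placed in the denominator (the placement of C is an assumption; with C = 0 and n_f = 4 it reduces to the 12π/[25 ln(Q²/Λ²)] form quoted in the Introduction). Function and parameter names are illustrative.

```python
import math


def q2_scale(s_hat: float, t_hat: float, u_hat: float) -> float:
    """Q^2 = 2 s t u / (s^2 + t^2 + u^2) for the constituent subprocess, Eq. (3.2)."""
    return 2.0 * s_hat * t_hat * u_hat / (s_hat ** 2 + t_hat ** 2 + u_hat ** 2)


def alpha_s(q2: float, lam: float = 0.4, n_f: int = 4, c: float = 0.0) -> float:
    """Effective coupling at scale q2 (GeV^2) for scale parameter lam (GeV/c).

    The position of the process-dependent constant c is an assumption of this sketch.
    """
    return 12.0 * math.pi / ((33 - 2 * n_f) * math.log(q2 / lam ** 2) + c)


if __name__ == "__main__":
    # 90-degree massless scattering: t = u = -s/2, so Q^2 = s/3.
    s_hat = 48.0
    q2 = q2_scale(s_hat, -s_hat / 2, -s_hat / 2)
    print("Q^2 =", q2, "alpha_s =", alpha_s(q2))
    # Small-angle limit: Q^2 approaches -t when |t| << s.
    print(q2_scale(100.0, -1.0, -99.0))
```

Because the coupling depends only logarithmically on Q², a different but equally reasonable choice of the scale changes α_s modestly at large p_⊥ and more noticeably at low p_⊥, which is the uncertainty the text refers to.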

If the distribution of quarks within the proton, G_{p→q}(x), and the
fragmentation of quarks into hadrons, D_q^h(z), both scale, then the
invariant cross section E dσ/d³p for producing a large-p_⊥ meson
reflects directly the energy dependence of the
quark-quark cross section dσ̂/dt̂. Thus if the
latter behaves as h(t̂/ŝ)/ŝⁿ, then the former behaves as
f(x_⊥, θ_cm)/p_⊥²ⁿ. The cross sections for
the scattering of partons in field theory (see Table
I) with α_s = constant yield 2n = N_eff = 4 where we
define

$$ N_{\rm eff} = -\ln(\sigma_1/\sigma_2)/\ln(p_{\perp 1}/p_{\perp 2}) , \qquad (3.3) $$

where σ_i is the invariant cross section (at fixed x_⊥) at p_⊥i.
Including an α_s that depends on Q² according to (3.1) and (3.2) produces
an E dσ/d³p that decreases faster than 1/p_⊥⁴ at small p_⊥
(N_eff ≈ 4.8 for 2 < p_⊥ < 10 GeV/c at x_⊥ = 0.2) but
approaches the 1/p_⊥⁴ behavior at large p_⊥. This
can be seen by the dot-dash curve in Fig. 2 where
we plot p_⊥⁸ times the predicted E dσ/d³p arising
from quark-quark scattering at x_⊥ = 0.2 versus
p_⊥ using Λ = 0.4 GeV/c. One sees that including
the α_s(Q²) dependence brings one a small way
toward the flat (1/p_⊥⁸) dependence seen experimentally at
p_⊥ ≲ 6 GeV/c.
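
The effective power defined in (3.3) is easy to evaluate for a toy cross section. The sketch below is not the calculation of the paper: it assumes scaling distributions at fixed x_⊥, so that the only departure from 1/p_⊥⁴ comes from a running coupling, and it takes Q² = (4/3) p_⊥², the value Eq. (3.2) gives for 90° scattering of massless constituents carrying the full p_⊥. Both simplifications and all names are assumptions of the example.

```python
import math


def n_eff(sigma1: float, sigma2: float, p1: float, p2: float) -> float:
    """Effective power from two cross sections at fixed x_perp, as in Eq. (3.3)."""
    return -math.log(sigma1 / sigma2) / math.log(p1 / p2)


def running_coupling(q2: float, lam: float = 0.4) -> float:
    """One-loop form 12 pi / (25 ln(Q^2 / lam^2)), i.e. n_f = 4 and C = 0."""
    return 12.0 * math.pi / (25.0 * math.log(q2 / lam ** 2))


def toy_cross_section(p_perp: float, lam: float = 0.4) -> float:
    """Toy invariant cross section ~ alpha_s(Q^2)^2 / p_perp^4 (arbitrary units).

    Assumes Q^2 = (4/3) p_perp^2; parton distributions and fragmentation
    functions are taken to scale and are absorbed into the normalization.
    """
    q2 = 4.0 / 3.0 * p_perp ** 2
    return running_coupling(q2, lam) ** 2 / p_perp ** 4


if __name__ == "__main__":
    print(n_eff(toy_cross_section(2.0), toy_cross_section(10.0), 2.0, 10.0))
```

With a constant coupling the same function returns exactly 4; the running coupling alone raises the effective power between 2 and 10 GeV/c to somewhat below 5, the same direction and rough size as the shift described in the text.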


B. The quark and gluon distributions G(x, Q²)


In the QCD approach, the “effective” quark distributions in a proton,
G_{p→q}(x, Q²), do not scale.
The influence of this on νW₂(x, Q²) for ep and μp
FIG. 2. The behavior of p_⊥⁸ times the invariant cross
section E dσ/d³p for pp → π⁰ + X at θ_cm = 90° and x_⊥ = 0.2
arising from the QCD subprocess q + q → q + q calculated
with Λ = 0.4 GeV/c. If the quark distributions within protons,
G_{p→q}(x), and the quark fragmentation functions,
D_q^{π⁰}(z), scale and if the strong interaction coupling α_s
is constant, then p_⊥⁸ E dσ/d³p behaves like p_⊥⁴ at fixed
x_⊥ and θ_cm (dotted line). Allowing α_s to depend on Q²
according to (3.1) yields the dot-dashed curves. Including
the expected Q² dependence of α_s(Q²) and G_{p→q}(x, Q²)
results in the dot-dot-dashed curves. Finally, allowing
α_s(Q²), G_{p→q}(x, Q²) and D_q^{π⁰}(z, Q²) all to vary with Q² in a
manner expected from QCD results in the dashed curve and
the solid curve is the result after smearing with ⟨k_⊥⟩_{h→q}
= 848 MeV and ⟨k_⊥⟩_{q→h} = 439 MeV. The shaded area
represents uncertainties due to the way in which one cuts
off the low ŝ and t̂ singularities in dσ̂/dt̂. Above p_⊥ ≳ 3.5
GeV/c at θ_cm = 90°, the results are insensitive to the
details of this cutoff procedure.


scattering has been studied¹⁷⁻²⁰ and may account for the lack of scaling
seen in these experiments over the range 4.0 ≲ Q² ≲ 20.0 GeV². We use the
formulation of Ref. 18 to extrapolate these functions to
the higher-Q² region needed in analyzing high-p_⊥
hadron data (see Table II for the range of Q² sampled).

In an asymptotically free field theory, scale violations are generated by
gluon corrections typified by gluon bremsstrahlung from a quark and
by quark-antiquark pair creation by a gluon. One
can predict the behavior of the constituent distributions G_i(x, Q²) given
that they are known at some reference momenta Q₀ [large enough so that
α_s(Q₀²) is small enough to make perturbation theory
applicable]. Following Ref. 18, the moments of
the distributions
TABLE II. The mean values of z_c, Q_x, and Q² resulting for pp → π⁰ + X at
90° in the QCD approach with Λ = 0.4 GeV/c, and where z_c is the fraction
of the constituent momenta carried by the trigger hadron (see Fig. 1); Q_x
is the component of momentum of the constituent scattering toward the
trigger (Q,=k +m, see Table I and Fig. 2 of FFF); Q² is defined by
Eq. (3.2).

W (GeV)   p_⊥ (GeV/c)   ⟨z_c⟩   ⟨Q_x⟩ (GeV)   ⟨Q²⟩ (GeV²)

53        2.0           0.68    1.25          7.9
53        4.0           0.75    0.86          33.7
[row illegible]
19.4      1.94          0.74    1.53          4.8
19.4      3.0           0.86    1.77          7.6
19.4      4.0           0.89    1.76          15.5
19.4      6.0           0.92    1.46          42.9
19.4      7.0           0.94    1.46          58.5
500       10.0          0.58    0.16          525
[row illegible]
1000      30.0          0.61    0.10          4184


$$ M_i(n, Q^2) = \int_0^1 x^{n} G_i(x, Q^2)\, dx \qquad (3.4) $$

are given in terms of the moments at Q₀ by

$$ M_i(n, Q^2) = \sum_{j=1}^{9} M_j(n, Q_0^2)\, R_{ij}(n, Q^2, Q_0^2) , \qquad (3.5) $$

where R_ij(n, Q², Q₀²) is a known matrix (depending on Λ) and i corresponds
to the constituent types (u, d, s, c, ū, d̄, s̄, c̄, gluon). The final
resulting distributions at Q² are calculated by inverting (3.4)
by an inverse Mellin transform [Eq. (13) of Ref. 18].

Figure 3 shows the predicted Q² behavior of νW₂(x, Q²) resulting from an
analysis of the ep and μp data. The x dependence of the parton
distributions at the reference momentum, G_i(x, Q₀² = 4 GeV²), was chosen
to agree with experiment.

Unfortunately, the analysis of ep and μp is relatively insensitive to the
input gluon distribution within the proton. We take
xG_g(x, Q₀²) ≡ xG_{p→g}(x, Q₀²) = (1 + 9x)(1 − x)⁴ at the reference
momentum. It integrates to a total momentum for gluons within
the proton of 50%. The resulting Q² dependence of G_{p→g}(x, Q²) is also
shown in Fig. 3. Both νW₂(x, Q²) and xG_{p→g}(x, Q²) exhibit a rise at
small x and a decrease at large x as Q² increases. The effect
is particularly dramatic for the latter.

The QCD interpretation of the ep and μp inelastic-scattering data has some
ambiguities because one expects, not only the logarithmic scale
breaking shown in Fig. 3, but also other corrections falling more rapidly
with Q². The latter would be unimportant at very large Q² ≳ 50 (GeV/c)²
but are important in the Q² range probed by the current data. One example
of such a correction is the O(m²/Q²) correction (m is proton mass)
generated by using x′ and not x as the argument
of the structure functions. Here
FIG. 3. (a) The predicted Q² dependence (scale breaking) of the electroproduction structure function for the proton
νW₂(x, Q²) arising from the constituent (quarks, antiquarks and gluons) distributions G_i(x, Q²) used in this analysis. The
distributions at high Q² are calculated from the distributions at the reference momentum Q₀² = 4 GeV² using a QCD
moment analysis with Λ = 0.4 GeV/c. In asymptotically free theories, one expects a decrease in the number of high-x
constituents and an increase in the number of low-x constituents as Q² increases. Also shown is the value of νW₂(x)
(independent of Q²) used in the quark-quark black box model of FF1. (b) The predicted Q² dependence of the distribution
of gluons within the proton xG_{p→g}(x, Q²) ≡ xg(x, Q²) used in this analysis. The distribution at high Q² is calculated in
terms of a distribution at the reference momentum Q₀² = 4 GeV² given by xg(x, Q₀²) = (1 + 9x)(1 − x)⁴.
This leads to scale breaking that is about half the observed amount in the
large x ≳ 0.3 region. However, one can construct variables of a similar
type (for instance x_super introduced by Atwood⁴⁵)
that can describe all the observed large-x scale
breaking in terms of O(m²/Q²) corrections. Such
models give essentially no scale breaking at low
x and the important feature of the QCD approach
is that the same value of Λ describes the scale
breaking at both low and high x. This is shown
in Fig. 4. It should be noted that the fit to the low-x data is poorer
than that at high x although the trend with Q² is given well in both
cases. This is because the low-x data is new data that was not
available when the parameters of the QCD solution
were determined.⁴⁶ In Ref. 18, we not only considered structure functions
that were a function of x but also a more complicated and probably
more realistic formalism developed by Georgi
and Politzer.¹⁷ This includes O(m²/Q²) terms
not only in the argument of the structure functions
[similar to (3.6) above] but also as overall multiplicative factors. It
turns out that the O(m²/Q²) terms tend to cancel among themselves and this
formalism is phenomenologically equivalent to the
simple formulation we use here. In particular,
essentially the same value of Λ ≈ 0.5 GeV/c is
found in the best fit of both formalisms.

If one includes the scale-breaking effects of G_i(x, Q²) in addition to the
running coupling constant α_s(Q²), the resulting pp → π⁰ + X cross section
arising from the quark-quark subprocess has an
N_eff equal to about 5.0 and 5.5 for the range 2.0
< p_⊥ < 10.0 at x_⊥ = 0.2 and 0.5, respectively. The
scale breaking of G_i(x, Q²) has little effect at x_⊥
= 0.2 (see Fig. 2) because at this x_⊥ one is sensitive to G_i(x, Q²) near
the values of x that are stationary as Q² increases.
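
The statement that the input gluon distribution xG(x, Q₀²) = (1 + 9x)(1 − x)⁴ carries half of the proton momentum can be checked directly. The sketch below uses a plain composite Simpson rule so that it needs nothing beyond the standard library; the function names and the number of intervals are choices made for this example.

```python
def gluon_momentum_density(x: float) -> float:
    """x * G_g(x, Q0^2) = (1 + 9x)(1 - x)^4, the input form at the reference momentum."""
    return (1.0 + 9.0 * x) * (1.0 - x) ** 4


def simpson(f, a: float, b: float, n: int = 1000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


if __name__ == "__main__":
    # Fraction of the proton momentum carried by gluons at the reference scale.
    print(simpson(gluon_momentum_density, 0.0, 1.0))
```

Analytically the integral is 1/5 + 9·B(2, 5) = 1/5 + 3/10 = 1/2, and the same check confirms the power 4 on (1 − x); a power of 2 or 3 would not give 50%.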


C. The fragmentation functions D_i^h(z, Q²)


The experimentally measurable constituent fragmentation functions
D_i^h(z, Q²) (here i refers to a gluon or a u, d, s, c, ū, d̄, s̄, c̄ quark)
are expected, in asymptotically free theories, to show
scale breaking (Q² dependences) similar to that
predicted for νW₂(x, Q²).²⁴,⁴⁷ The moments of the
fragmentation functions to a given hadron h given
by

$$ M_i^h(n, Q^2) = \int_0^1 z^{n} D_i^h(z, Q^2)\, dz , \qquad (3.7) $$

are given in terms of the moments at some reference momentum Q₀ by an
equation similar to

FIG. 4. Comparison of the scale-breaking effects (Q²
dependence) expected from an asymptotically free theory
with data on ep and μp inelastic scattering at x = 0.033
and 0.08 (Ref. 20) and at x = 0.5 (Ref. 72). The theory
comes from the analysis of Ref. 18 using Λ = 0.4 GeV/c
(solid curve) and Λ = 0.5 GeV/c (dashed curve).


(3.5). Namely,

$$ M_i^h(n, Q^2) = \sum_{j=1}^{9} M_j^h(n, Q_0^2)\, \tilde{R}_{ij}(n, Q^2, Q_0^2) , \qquad (3.8) $$

where the matrix R̃_ij is simply related (but not equal) to R_ij.⁴⁸ One then
uses the Mellin-transform technique of Ref. 18 to invert (3.7) and
obtains D_i^h(z, Q²) in terms of these functions at the
reference momentum Q₀² (we take Q₀² = 4 GeV²).




18 QUANTUM-CHROMODYNAMIC APPROACH FOR THE... 3327 


As explained by Gross,⁴⁹ the Q² dependence of a distribution function
depends on its shape at the reference momenta. The faster the function
D(z, Q₀²) falls off with increasing z at large z, the
faster the large z points fall as Q² is increased
above Q₀². It is important when considering the
fragmentation functions to distinguish the distribution of primary (or
direct) mesons from the final net distribution (which includes decay
products). The above moment analysis should be applied to the former not
the latter. This is, of course, a bit difficult since we do not know
experimentally exactly how many resonances are in
the quark jets at Q₀². What we shall do is to use
the results of Ref. 27 (hereafter called FF2) at
the reference momentum Q₀² = 4 GeV². The distribution of primary mesons
at Q₀² is then given by


$$ D_q^h(z, Q_0^2) = A_q^h f(1 - z) + B_q^h \bar{F}(z) , \qquad (3.9) $$

with A_q^h and B_q^h given in Table I of FF2 and

$$ \bar{F}(z) = F(z) - f(1 - z) \qquad (3.10) $$

with

$$ f(1 - z) = f(\eta) = 1 - a + 3 a \eta^2 . $$

The parameter a is chosen to be 0.77 and F(z) is given by Eq. (2.23) in
FF2. The distribution of primary mesons in a gluon jet at Q₀² is assumed
to have the form

$$ D_g^h(z, Q_0^2) = B_g^h F_g(z) , \qquad (3.11) $$

where we arbitrarily take

$$ F_g(z) = 3(1 - z)^2/z , \qquad (3.12) $$

where we have assumed, as discussed earlier,
that the gluon fragmentation function falls off
faster with increasing z than do the quark functions.
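
The sense in which the gluon fragmentation function is “softer” can be seen by comparing the two shape functions near z = 1. The sketch below assumes the forms f(η) = 1 − a + 3aη² with a = 0.77 and η = 1 − z for the quark case and F_g(z) = 3(1 − z)²/z for the gluon case; the function names and the simple midpoint integrator are choices made for this illustration.

```python
A_PARAM = 0.77  # the parameter a quoted in the text


def quark_shape(z: float, a: float = A_PARAM) -> float:
    """f(1 - z) = 1 - a + 3 a (1 - z)^2, the quark shape function."""
    eta = 1.0 - z
    return 1.0 - a + 3.0 * a * eta ** 2


def gluon_shape(z: float) -> float:
    """F_g(z) = 3 (1 - z)^2 / z, the gluon-jet form of Eq. (3.12)."""
    return 3.0 * (1.0 - z) ** 2 / z


def midpoint(f, a: float, b: float, n: int = 20000) -> float:
    """Midpoint rule; it never evaluates f at the end points, so 1/z at z = 0 is avoided."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


if __name__ == "__main__":
    # Both shapes are normalized: f integrates to 1, and z * F_g(z) integrates to 1.
    print(midpoint(quark_shape, 0.0, 1.0))
    print(midpoint(lambda z: z * gluon_shape(z), 0.0, 1.0))
    # Near z = 1 the quark shape stays finite while z * F_g(z) vanishes.
    for z in (0.5, 0.8, 0.95):
        print(z, quark_shape(z), z * gluon_shape(z))
```

The quark shape tends to the constant 1 − a = 0.23 as z approaches 1, whereas z·F_g(z) falls like (1 − z)², so a gluon jet places far fewer hadrons at high momentum fraction.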

We take the primary mesons precisely as explained in FF2 as being either
pseudoscalars (π, K, etc.) or vector mesons (ρ, K*, etc.) with
equal likelihood. We use the QCD moment method
to calculate the primary meson (pseudoscalar plus
vector meson) distributions at any desired Q². The
resonances are then allowed to decay and we form
the total net (direct + indirect) distributions at that
Q². Typical results are shown in Fig. 5.

The effect on the predicted large-p_⊥ invariant cross section of including
scale violations of the fragmentation functions is that the N_eff now
becomes 5.8 and 6.4 between p_⊥ = 2 and 10 GeV/c at
x_⊥ = 0.2 and 0.5, respectively. Large-p_⊥ results
are particularly sensitive to scale violations of
the D(z, Q²) function since these violations are
largest at high z (see Fig. 5) and this is precisely
the region sampled by the calculations.


[Figure 5 plots: $zD(z, Q^2)$ versus $z$, logarithmic vertical axis; panel (b) gluon $\to \pi^0$, $\Lambda = 0.4$, $Q_0^2 = 4$.]

FIG. 5. (a) The $Q^2$ dependence of the fragmentation function for a $u$ quark to a $\pi^0$, $D_u^{\pi^0}(z, Q^2)$, expected from an asymptotically free theory. The distributions at high $Q^2$ are calculated from the distribution at the reference momentum $Q_0^2 = 4$ GeV$^2$ using $\Lambda = 0.4$ GeV/c, where $D_u^{\pi^0}(z, Q_0^2)$ is taken from the analysis in FF2. (b) Same as (a) but for the gluon fragmentation function $D_g^{\pi^0}(z, Q^2)$.


D. Transverse momentum 


As we learned in FFF, constituents have a large internal transverse momentum inside the proton. Such effects (called smearing) are particularly important for large-$p_\perp$ calculations, due to the “trigger bias” which selects the configuration in which the initial quarks (or gluons) are already moving toward the trigger!!~'5*! (see Fig. 3 of FFF). In QCD, this transverse momentum of the partons can arise from two sources illustrated in Fig. 6.

Firstly, in a proton beam, quarks are confined in the transverse direction to within the proton radius. Therefore, from the uncertainty principle, they must have some transverse momentum. This momentum is intrinsic to the basic parton “wave function” inside the proton. As illustrated in Fig. 6(a), one might expect the wave function to have a term where the trigger parton's $k_\perp$ is balanced by another constituent (or constituents) which has the opposite $k_\perp$ and most of the remaining longitudinal momentum. Consider now the plane formed by the beam, target, and a 90° trigger hadron (called the x-z plane in Fig. 6). Typically, the trigger arises from the fragmentation of a constituent with $k_x > 0$ which is balanced by the remaining constituents having $k_x < 0$. One expects to see this negative $k_x$ as a shift in the beam and target jets at large $|x_\parallel|$. This shift (i.e., nonzero $\langle k_x \rangle$) of the beam jet as one increases the $p_\perp$ of a 90° trigger has recently been observed by the British-French-Scandinavian (BFS) group at ISR°° (see Fig. 14 of Ref. 22).

Secondly, in QCD, one expects to receive an “effective” $k_\perp$ of quarks in protons due to the bremsstrahlung of gluons. This perturbative term is illustrated in Fig. 6(b). It corresponds to including two particle to three or more particle processes ($2 \to 3$) rather than just the two particle to two particle ($2 \to 2$) scatterings. For such subprocesses, the $k_\perp$ of the quark is balanced by a gluon jet on the away side which subsequently fragments into many low-momentum hadrons. In addition, the mean value of the effective $k_\perp$ is expected to depend on the $x$ value of the quark and on the $Q^2$ for the processes. Separating the origin of the transverse momenta into Types I and II as seen in Fig. 6 is a bit artificial since both mechanisms occur simultaneously.

[Figure 6 diagrams: (a) Type I: $k_\perp$ intrinsic to the wave function; (b) Type II: “effective” $k_\perp$ due to bremsstrahlung. Labels: trigger quark, basic $2 \to 2$ subprocess, fragments of beam and target jets with $\langle p_x \rangle \neq 0$; legend: quarks, gluons.]

FIG. 6. (a) Illustration of the nonperturbative component of the transverse momentum of quarks within a proton that is intrinsic to the wave function of the proton. One expects this transverse momentum to be balanced by the remaining constituents in the proton which can, in turn, fragment into particles at high $x_\parallel$. The away side consists of the recoiling quark and two slightly shifted jets, one from the beam and one from the target. (b) Illustration of the perturbative component of the transverse momentum of a quark within a hadron which is due to the bremsstrahlung of a gluon before the basic $2 \to 2$ scattering occurs. In this case, the trigger quark is balanced by two away-side jets, one from the recoiling quark and one from the radiated gluon.
FIG. 7. The transverse-momentum spectrum, $d\sigma/dM\,dY\,d^2k_\perp$, of muon pairs in $pp$ collisions at $W = 27.4$ GeV, $M_{\mu\mu} = 8$ GeV, and rapidity $Y = 0$ from Ref. 51. Also shown is a Gaussian fit of the form $\exp(-0.54\,k_\perp^2)$ which yields $\langle k_\perp \rangle_{\mu\mu} \approx 1.2$ GeV and is interpreted as implying $\langle k_\perp \rangle_{h\to q} = 848$ MeV.


The effective constituent transverse momentum is directly observed in the Drell-Yan process $pp \to \mu^+\mu^- + X$. Current data (Ref. 51) indicate that $\langle k_\perp \rangle_{\mu^+\mu^-}$ is about 1.2 GeV (see Fig. 7). There has been much speculation about how much of the dimuon $k_\perp$ spectra shown in Fig. 7 is due to the wave function (Type I) and how much is explained by QCD perturbation calculations (Type II).?&9°73% The latter predicts a high-$k_\perp$ tail to the distribution that falls roughly like a power and a mean that depends both on $x$ and $Q^2$ of the muon pair. For the present analysis, we shall parametrize the transverse momentum of the constituents in protons by a Gaussian with $\langle k_\perp \rangle_{h\to q} = 848$ MeV which produces for the Drell-Yan subprocess the curve shown in Fig. 7. We shall take this distribution to be independent of $x$ and $Q^2$ and to be the same for quarks, antiquarks, and gluons in the proton. In so doing, we are not handling properly the $x$ and $Q^2$ dependence of the high-$k_\perp$ tail expected from QCD bremsstrahlung. At a later date, we hope to calculate and include the $2 \to 3$ processes expected by QCD. For the present, we merely use the data in Fig. 7 to give us an “effective” $k_\perp$ distribution and include only $2 \to 2$ subprocesses.
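The numbers quoted for the Gaussian parametrization can be cross-checked with a few lines of arithmetic. The sketch below assumes (this is our reading, not a statement from the text) that the dimuon $k_\perp$ is the vector sum of two independent constituent $k_\perp$ values drawn from the same two-dimensional Gaussian, so that the mean of the pair is $\sqrt{2}$ times the mean of a single constituent. The function name `mean_kt_gaussian` is hypothetical.

```python
import math


def mean_kt_gaussian(b):
    """Mean |k| for a two-dimensional distribution d^2k exp(-b k^2).

    <k> = [integral of k^2 exp(-b k^2) dk] / [integral of k exp(-b k^2) dk]
        = sqrt(pi / b) / 2
    """
    return 0.5 * math.sqrt(math.pi / b)


# Fit quoted with Fig. 7: exp(-0.54 k^2), with k in GeV.
pair_mean_gev = mean_kt_gaussian(0.54)

# Assumed relation: two independent, identical Gaussians add in quadrature,
# so the single-constituent mean is the pair mean divided by sqrt(2).
constituent_mean_mev = 1000.0 * 1.2 / math.sqrt(2.0)

print(round(pair_mean_gev, 3), round(constituent_mean_mev, 1))
```

Under that assumption the fit $\exp(-0.54\,k_\perp^2)$ reproduces a pair mean close to 1.2 GeV, and 1.2 GeV divided by $\sqrt{2}$ gives the 848 MeV used for the constituents.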
In a manner similar to that illustrated in Fig. 6, the emission of gluons after the hard-scattering ($2 \to 2$) subprocesses induces an “effective” $k_\perp$ of the hadrons that fragment from the outgoing quarks because one is really seeing two jets rather than one. As for the quark distributions in the proton, we do not include these effects (we also neglect the interferences that arise between the amplitudes for emitting a gluon before and after the hard $2 \to 2$ process) and for the present take the transverse momentum distribution of hadrons from outgoing quarks (and gluons) to be a Gaussian with $\langle k_\perp \rangle_{q\to h} = 439$ MeV independent of $z$ or $Q^2$ as in FF2. Again, this is not precisely correct and we hope to improve this in later work.


E. The cross sections $d\hat\sigma/d\hat t$


In the QCD approach, one includes not only the contributions from quark-quark scattering but also the contributions from quark-gluon and gluon-gluon scattering. We include all seven processes: $qq \to qq$, $\bar q q \to \bar q q$, $\bar q \bar q \to \bar q \bar q$, $gq \to gq$, $g\bar q \to g\bar q$, $gg \to \bar q q$, $\bar q q \to gg$, and $gg \to gg$, where $g$ is a gluon. Each $2 \to 2$ differential cross section, $d\hat\sigma/d\hat t$, is calculated to first order in perturbation theory with an effective coupling constant $\alpha_s(Q^2)$ as in (3.1). These cross sections have been calculated previously by Cutler and Sivers?* and by Combridge, Kripfganz, and Ranft?® and for completeness are given in Table I.** All these cross sections behave as $\hat s^{-2}$ at fixed $\hat t/\hat s$ (and for constant $\alpha_s$) so that including gluons does not help in changing $N_{\rm eff}$ from 4 to 8, but gluons are important in increasing the magnitude of the low-$x_\perp$ cross section to agree with data.'*?*® In addition, we will see that gluons play an important role in understanding the high-$p_\perp$ correlation data.

Including gluons, unfortunately, introduces an uncertainty (at low $x_\perp$) in the high-$p_\perp$ predictions. As explained in Secs. III B and III C, the gluon distribution in a proton and the gluon fragmentation functions are essentially unknown. We only know that the total momentum carried by quarks and gluons in a proton is that of the proton and, similarly, that the total energy carried by all the hadron fragments from a gluon is the gluon energy. Many of our high-$p_\perp$ predictions depend on these unknown distributions, for the QCD quark-gluon and gluon-gluon scattering cross sections are large. If one accepts QCD as the correct description for high-$p_\perp$ processes, then one could eventually hope to use the hadron-hadron data to help determine these functions. For the present, we simply calculate with our initial guesses for $G_{p\to g}(x, Q_0^2)$ and $D_g^{h}(z, Q_0^2)$ and do not attempt to find the optimal forms for these functions by fitting high-$p_\perp$ data.
As discussed in great detail in FFF, cross sections of the type given in Table I are not adequate once one allows for a nonzero transverse momentum of constituents in the hadrons or of hadrons in the outgoing jets.** This is because they diverge at $\hat s$, $\hat t$, or $\hat u$ equal to zero, which can occur once $\langle k_\perp \rangle_{h\to q}$ or $\langle k_\perp \rangle_{q\to h}$ is nonzero. To remove this unwanted singularity in the integral, we simply replace $\hat s$, $\hat t$, and $\hat u$ in Table I by $\hat s + M_0^2$, $M_0^2 - \hat t$, and $M_0^2 - \hat u$, respectively, with $M_0^2 = 1.0$ GeV$^2$.** Because we are generating the transverse momentum of the constituents by a Gaussian, the results for hadron production at large $p_\perp$ are not sensitive to this ad hoc cutoff procedure. With the large-$k_\perp$ damping characteristic of a Gaussian, once one has removed the infinity at, say, $\hat t = 0$ (by whatever means), one never samples this low-$\hat t$ region when calculating high-$p_\perp$ meson production. This would not be true for a transverse-momentum distribution falling off less rapidly with $k_\perp$ (for example, like a power) so that the large-$k_\perp$ tails expected by the QCD processes in Fig. 6 will have to be handled differently in the future. Because $\langle k_\perp \rangle_{h\to q}$ is large, one does begin to become sensitive to the low-$\hat s$ and -$\hat t$ cutoff at low $p_\perp$. This is illustrated in Fig. 2 where we show the results for $pp \to \pi^0 + X$ at $x_\perp = 0.2$ and $\theta_{\rm c.m.} = 90°$ arising from quark-quark scattering before and after smearing. The shaded area represents uncertainties due to varying the low-$\hat s$ and -$\hat t$ cutoffs. The region of $p_\perp < 3.5$ GeV/c cannot, at present, be used to quantitatively test the QCD ideas since this region is sensitive to the cutoff procedure.*®> For $p_\perp \gtrsim 3.5$ GeV/c, on the other hand, the manner of cutoff is not important and the result depends only on the amount of smearing (i.e., on $\langle k_\perp \rangle_{h\to q}$).
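The cutoff replacement described above can be sketched in a few lines. Table I is not reproduced in this section, so the example uses the familiar $t$-channel shape $(\hat s^2 + \hat u^2)/\hat t^2$ purely as a stand-in; the function names, the symbol used for the cutoff mass, and the choice of stand-in are illustrative assumptions. Only the substitution rule and the value 1.0 GeV$^2$ come from the text.

```python
M0_SQ = 1.0  # GeV^2, the cutoff mass squared quoted in the text


def regulate(s_hat, t_hat, u_hat, m_sq=M0_SQ):
    """Apply the replacement s -> s + M^2, t -> M^2 - t, u -> M^2 - u."""
    return s_hat + m_sq, m_sq - t_hat, m_sq - u_hat


def t_channel_shape(s, t, u):
    """Stand-in angular shape (s^2 + u^2) / t^2; singular at t = 0."""
    return (s ** 2 + u ** 2) / t ** 2


def regulated_shape(s_hat, t_hat, u_hat, m_sq=M0_SQ):
    """Evaluate the stand-in shape with the regulated invariants."""
    return t_channel_shape(*regulate(s_hat, t_hat, u_hat, m_sq))


# At t = 0 (forward scattering of massless partons, u = -s) the raw shape
# divides by zero, while the regulated one is finite.
at_pole = regulated_shape(10.0, 0.0, -10.0)

# Far from the pole the cutoff changes the value by only a few percent.
far_ratio = regulated_shape(100.0, -50.0, -50.0) / t_channel_shape(100.0, -50.0, -50.0)

print(at_pole, far_ratio)
```

The regulated shape stays finite at $\hat t = 0$ and approaches the unregulated one when $|\hat t|$ is much larger than the cutoff mass squared, which is the behaviour the paragraph relies on when it argues that high-$p_\perp$ results are insensitive to the cutoff.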


IV. RESULTS
A. The single-particle cross section
1. $p_\perp$ behavior


Figure 8 shows a comparison of the predicted and experimental behavior of $p_\perp^8$ times $E\,d\sigma/d^3p$ for $pp \to \pi + X$ at $\theta_{\rm c.m.} = 90°$ and $x_\perp = 0.2$, 0.35, and 0.5 versus $p_\perp$. The dot-dashed and solid curves are the final results before and after smearing, respectively, with $\Lambda = 0.4$ GeV/c. The dashed curves are the results (after smearing) using $\Lambda = 0.6$ GeV/c. The effect of smearing (at fixed $x_\perp$) is to increase the low-$p_\perp$ predictions (by about a factor of 10 at $p_\perp = 2$ GeV/c and $x_\perp = 0.2$) while not affecting much the high-$p_\perp$ region. For the range $2.0 < p_\perp < 6.0$ GeV/c at $x_\perp = 0.2$, and $4.0 < p_\perp < 10.0$ GeV/c at $x_\perp = 0.5$, the results are roughly independent of $p_\perp$ (when multiplied by $p_\perp^8$). However, this $p_\perp^{-8}$ behavior is only a “local” effect. It holds only over a small range of $p_\perp$ (at low $x_\perp$), the region depending somewhat on $x_\perp$.

As $p_\perp$ increases, the predictions approach the expected $p_\perp^{-4}$ behavior. This can be seen more clearly in Fig. 9 where we plot the predictions and data times $p_\perp^4$ at $x_\perp = 0.2$ and $\theta_{\rm c.m.} = 90°$. The behavior becomes $p_\perp^{-4}$ only asymptotically, but by $p_\perp = 10$ GeV/c at this $x_\perp$, it is fairly close (about p.).

FIG. 8. The data on $p_\perp^8$ times $E\,d\sigma/d^3p$ for large-$p_\perp$ pion production at $\theta_{\rm c.m.} = 90°$ and fixed $x_\perp = 0.2$, 0.35, and 0.5 versus $p_\perp$ (open squares: Ref. 73, solid dots: Ref. 74, crosses: Ref. 75) compared with the predictions (with absolute normalization) of a model that incorporates all the features expected from QCD. The dot-dashed and solid curves are the results before and after smearing, respectively, using $\Lambda = 0.4$ GeV/c and the dashed curves are the results using $\Lambda = 0.6$ GeV/c (after smearing).

FIG. 9. The same as Fig. 8 except we now plot $p_\perp^4$ times $E\,d\sigma/d^3p$ versus $p_\perp$ at $x_\perp = 0.2$ and $\theta_{\rm c.m.} = 90°$. One clearly sees the asymptotic approach to an $E\,d\sigma/d^3p \propto p_\perp^{-4}$ behavior at fixed $x_\perp$.
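The idea of a “local” power can be made concrete by taking the logarithmic slope between two points at fixed $x_\perp$. The sketch below is a toy illustration only: `toy_sigma` is a hypothetical stand-in that multiplies $p_\perp^{-4}$ by the square of a leading-logarithm coupling proportional to $1/\ln(Q^2/\Lambda^2)$ with the assumed choice $Q^2 = p_\perp^2$. It is not the calculation behind Figs. 8 and 9, which also includes scale-breaking distributions and smearing.

```python
import math

LAMBDA = 0.4  # GeV/c, the value of Lambda quoted in the text


def n_eff(p1, sigma1, p2, sigma2):
    """Local effective power N such that sigma ~ p^(-N) between p1 and p2."""
    return -math.log(sigma2 / sigma1) / math.log(p2 / p1)


def toy_sigma(p_perp, power=4.0):
    """Hypothetical toy: p^(-power) times [1 / ln(p^2 / Lambda^2)]^2."""
    log_term = math.log(p_perp ** 2 / LAMBDA ** 2)
    return p_perp ** (-power) / log_term ** 2


# A pure p^-8 law returns N = 8, as it must.
pure_power = n_eff(2.0, 2.0 ** -8, 10.0, 10.0 ** -8)

# The running coupling alone lifts the toy power above 4 between 2 and 10 GeV/c.
toy_power = n_eff(2.0, toy_sigma(2.0), 10.0, toy_sigma(10.0))

print(pure_power, round(toy_power, 2))
```

In this toy the logarithmic running of the coupling by itself accounts for part of the departure from a $p_\perp^{-4}$ law over a limited range, consistent with the statement that the steeper behavior is a local effect rather than an asymptotic one.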

As illustrated in Fig. 2, the low-$p_\perp$ region is sensitive to the small-$\hat s$ and -$\hat t$ cutoff employed.®® However, because of the Gaussian falloff of the transverse momentum distributions, the results are completely insensitive to the form of the cutoff for $p_\perp \gtrsim 3.5$ GeV/c at $\theta_{\rm c.m.} = 90°$. For example, Table II shows that the constituent subprocess has a mean momentum (@,) = 1.76 GeV/c for a $p_\perp = 4$ GeV/c trigger at $W = 19.4$ GeV but even for this large “trigger bias”, only 12% of the total $pp \to \pi^0 + X$ cross section arises from the region $|\hat t| < 10$ GeV$^2$ and none arises for $|\hat t| < 5$ GeV$^2$.

The data on $E\,d\sigma/d^3p$ at fixed $W = 19.4$ and 53 GeV versus $p_\perp$ are compared with the theory in Fig. 10. The agreement is quite remarkable. It is almost as good as the black-box model (Fig. 13 of FF1) where we chose the behavior of $d\hat\sigma/d\hat t$ and the normalization to fit the data. The results before smearing are also shown (dot-dashed curves). Smearing has little effect for $p_\perp = 4.0$ GeV/c at $W = 53$ GeV but has a sizable effect (even at $p_\perp = 6.0$ GeV/c) at $W = 19.4$ GeV due to the steepness of the cross section at this low energy. The contributions to the total invariant cross section from quark-quark elastic scattering (plus $q\bar q \to q\bar q$ and $\bar q\bar q \to \bar q\bar q$) are shown in Fig. 10 (dotted curve). Gluons make important contributions to the cross section at small $x_\perp$ ($x_\perp < 0.4$).

FIG. 10. Comparison of a QCD model (normalized absolutely) with data on large-$p_\perp$ pion production in proton-proton collisions at $W = \sqrt{s} = 19.4$ and 53 GeV with $\theta_{\rm c.m.} = 90°$ (open squares: Ref. 73, solid dots: Ref. 74, crosses: Ref. 75, solid triangles: Ref. 76, open circles: Ref. 77). The dot-dashed and solid curves are the results before and after smearing, respectively, using $\Lambda = 0.4$ GeV/c and $\langle k_\perp \rangle_{h\to q} = 848$ MeV and the dashed curves are for $\Lambda = 0.6$ GeV/c (after smearing). The contribution arising from quark-quark, quark-antiquark, and antiquark-antiquark scattering (i.e., no gluons) is shown by the dotted curves (after smearing).

We cannot at this time say whether the slight disagreement in the normalization of the theory seen in Figs. 8, 9, and 10 at low $p_\perp$ (about a factor of 2 at $p_\perp = 2$ GeV/c and $W = 53$ GeV) is significant. At these low values of $x_\perp$ and $p_\perp$, the theory cannot at present be calculated precisely as the results depend sensitively on the unknown gluon distributions, the shape of the transverse-momentum distributions of the quark within the hadrons, the nature of the low-$\hat s$ and -$\hat t$ cutoff, our choice for $Q^2$, and possibly higher-order corrections [such as the $+C$ in Eq. (3.1)]. It may be that all of the invariant cross section down to $p_\perp$'s as low as 1.5 or 2.0 GeV/c is due to the scattering of quarks and gluons as described by QCD. On the other hand, it may be that other nonleading constituent subprocesses such as the ones estimated by Blankenbecler, Brodsky, and Gunion®® make some contributions in the range of $1.5 < p_\perp < 4.0$ GeV/c with QCD dominating at the higher $p_\perp$'s.


2. Angular dependence 


In FF1 we used the data shown in Fig. 11 to deduce that $d\hat\sigma/d\hat t \propto 1/(\hat s\,\hat t^{3})$ was the preferred form for the angular dependence of quark-quark scattering. We must now see if the QCD predictions are consistent with the same data. Figure 11 shows the QCD results with $\Lambda = 0.4$ GeV/c (after smearing) for the $x_\parallel$ dependence of $E\,d\sigma/d^3p$ for $pp \to \pi^+ + X$ at $p_\perp = 2$ and 3.3 GeV/c and $W = 53$ GeV. As noted earlier, the predictions are a bit low at this low $p_\perp$; however, the QCD $x_\parallel$ dependence is not in gross disagreement with the data. Unfortunately, comparison with these low-$p_\perp$ data is not very significant. First, with the large value of $\langle k_\perp \rangle_{h\to q}$ we are now using, the results for these $p_\perp$ values are sensitive to the manner in which we remove the singularity in $d\hat\sigma/d\hat t$ at $\hat t$ (or $\hat u$) equal to zero. This is particularly true at large $x_\parallel$ where the constituent scattering occurs at small $\hat t$ (or $\hat u$) values. We have tried to indicate this uncertainty by the shaded region in Fig. 11. Secondly, since this region is so sensitive to smearing, the results depend on our assumption that $\langle k_\perp \rangle_{h\to q}$ is independent of the $x$ of the quark. Modifying this assumption could change the resulting $x_\parallel$ dependence.

FIG. 11. Data on the $x_\parallel$ dependence of the invariant cross section for $pp \to \pi^0 + X$ at $W = 53$ GeV and $p_\perp = 2.0$ and 3.3 GeV/c (solid dots: Ref. 74, solid triangles: Ref. 78, solid squares: Ref. 79) compared with the results of a QCD model with $\Lambda = 0.4$ GeV/c (solid curves). At these low-$p_\perp$ values, the predictions are sensitive to the low $\hat t$ and $\hat u$ cutoff of $d\hat\sigma/d\hat t$. At any fixed $p_\perp$, the uncertainty due to the cutoff procedure (illustrated by the shaded areas) is greater at large $x_\parallel$. Also shown (dashed curves) are the results from the quark-quark black-box model of FF1 which was adjusted to fit these data.

[Figure 12 plot: $pp \to \pi^+ + X$ versus $x_\parallel$; $E\,d\sigma/d^3p$ (nb/GeV$^2$) at $p_\perp = 2.0$ GeV/c, $W = 53$ GeV, QCD with $\Lambda = 0.4$; curves: total, $q+q \to q+q$, $q+g \to q+g$, $g+g \to g+g$.]

FIG. 12. Contributions to the total $E\,d\sigma/d^3p$ for $pp \to \pi^+ + X$ arising from the various QCD subprocesses, $q+q \to q+q$, $q+g \to q+g$, and $g+g \to g+g$ at $W = 53$ GeV, $p_\perp = 2.0$ GeV/c versus $x_\parallel$.

Furthermore, in the QCD approach, the $x_\parallel$ dependence of the invariant cross section does not directly reflect the angular dependence of $d\hat\sigma/d\hat t$. Figure 12 shows that the various QCD subprocesses have differing $x_\parallel$ dependences so that the result depends not only on the various $d\hat\sigma/d\hat t$ but also on the amount of each term. For example, the gluon-gluon scattering contributions fall off fast with increasing $x_\parallel$. We could cause the predicted $E\,d\sigma/d^3p$ to fall off more rapidly with $x_\parallel$ by increasing the amount of gluon-gluon scattering [by changing the relatively unknown functions $G_{p\to g}(x, Q^2)$ and $D_g^{h}(z, Q^2)$].


3. Particle ratios 


As mentioned earlier, and as shown in Fig. 10, gluons make an important contribution to the single-particle invariant cross section at low $x_\perp$. However, since the gluon fragmentation function has been chosen to be considerably smaller at large $z$ than the quark fragmentation function, an experiment demanding a large-$p_\perp$ meson trigger is “biased” in favor of the toward-side constituent being a quark rather than a gluon. Table III gives the fraction of the single-particle cross section arising from the various combinations of toward and away constituents. At $W = 53$ GeV and $p_\perp = 4.0$ GeV/c, 45% of the $\pi^0$ cross section arises from a toward-side quark having scattered off a recoiling gluon while 27% arises from the quark recoiling off another quark (a total of 72%). Tables III and IV show that for $x_\perp \gtrsim 0.3$, the toward constituent for single-particle triggers is almost always a quark. This quark, however, scatters off both quarks and gluons in the other proton so that the recoiling constituent is quite often a gluon. At $W = 53$ GeV, for a $\pi^0$ trigger with $p_\perp = 4.0$ GeV/c, the toward constituent is a quark (or antiquark) 72% of the time while the away constituent is a gluon 62% of the time.

This “bias” for quarks rather than gluons in single-particle triggers means that the predictions for ratios of different kinds of particles will not be very different from those of the quark-quark scattering black-box approach. As shown in Fig. 13, the $pp \to (\pi^+/\pi^-) + X$ ratio predictions of QCD are smaller due to the contamination from the gluon decays (gluons fragment into equal numbers of positives and negatives). They are in equally good agreement with data. The QCD particle ratios do not “scale” as did the FF1 results (i.e., they are a function of $x_\perp$ and $W$ at fixed $\theta_{\rm c.m.}$). The curve displayed in Fig. 13(a) is calculated at $W = 19.4$ GeV and increases slightly as $W$ increases (by about 20% in going from $W = 19.4$ to 53 GeV).


B. The jet cross section 


The “bias” in favor of toward-side quarks does not occur when one triggers on jets rather than on single particles and, as can be seen in Table IV, gluons make up a sizable fraction of the total jet cross section. With our guesses for the gluon distributions, gluons are responsible for 73% of the jet triggers at $p_\perp = 4$ GeV/c, $W = 53$ GeV, and $\theta_{\rm c.m.} = 90°$. Even at higher-$x_\perp$ values such as $p_\perp = 6.0$ GeV/c, $W = 19.4$ GeV, $\theta_{\rm c.m.} = 90°$, gluons still make up 45% of the jets. Because of the presence of gluon jets and because the quark fragmentation functions $D_q^{h}(z, Q^2)$ are smaller at high $z$ due to scale-breaking effects, the jet to single-$\pi^0$ ratio is now predicted to be larger than it was for the quark-quark black-box approach. This is seen in Fig. 14 where we plot the invariant cross section for $pp \to {\rm Jet} + X$ divided by $pp \to \pi^0 + X$ at $\theta_{\rm c.m.} = 90°$ versus $x_\perp$. In the QCD approach, this ratio no longer “scales.” It is a function not only of $x_\perp$ (at fixed $\theta_{\rm c.m.}$), but also of $W$. We show results for $W = 19.4$, 53, and 500 GeV.

TABLE III. Fraction of the total $pp \to \pi^0 + X$ 90° invariant cross section arising from various combinations of toward-side (trigger) and away-side partons. Results are shown for the cases where the $\pi^0$ trigger came from a quark or gluon and the recoiling constituent (away-side) is a quark or gluon or either (sum). The notation is “toward → away” and “quark” refers to $q + \bar q$.

                                    Toward-side quark                          Toward-side gluon
W (GeV)  p_⊥ (GeV/c)  x_⊥    quark→quark  quark→gluon  quark→anything   gluon→quark  gluon→gluon  gluon→anything
53       2.0          0.08   0.16         0.41         0.57             0.15         0.28         0.43
53       4.0          0.15   0.27         0.45         0.72             0.11         0.17         0.28
53       7.0          0.26   0.39         0.48         0.87             0.07         0.06         0.13
53       9.0          0.34   0.45         0.48         0.93             0.04         0.03         0.07
19.4     1.94         0.20   0.18         0.36         0.54             0.15         0.31         0.46
19.4     3.0          0.31   0.27         0.54         0.81             0.08         0.11         0.19
19.4     4.0          0.41   0.37         0.52         0.89             0.06         0.05         0.11
19.4     6.0          0.62   0.55         0.42         0.97             0.02         0.01         0.03
500      10           0.04   0.24         0.32         0.56             0.15         0.29         0.44
500      30           0.12   0.34         0.42         0.76             0.13         0.11         0.24
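The entries of Table III obey simple bookkeeping relations that are easy to verify. The sketch below checks, for each row, that the two quark columns add up to the "quark → anything" column, that the two gluon columns add up to "gluon → anything", that the two sums exhaust the cross section, and that the tabulated $x_\perp$ matches the standard definition $x_\perp = 2p_\perp/W$ (a definition assumed here, since it is not restated in this section). The tuple layout and variable names are illustrative.

```python
# (W, p_perp, x_perp, q->q, q->g, q->anything, g->q, g->g, g->anything)
TABLE_III = [
    (53.0, 2.0, 0.08, 0.16, 0.41, 0.57, 0.15, 0.28, 0.43),
    (53.0, 4.0, 0.15, 0.27, 0.45, 0.72, 0.11, 0.17, 0.28),
    (53.0, 7.0, 0.26, 0.39, 0.48, 0.87, 0.07, 0.06, 0.13),
    (53.0, 9.0, 0.34, 0.45, 0.48, 0.93, 0.04, 0.03, 0.07),
    (19.4, 1.94, 0.20, 0.18, 0.36, 0.54, 0.15, 0.31, 0.46),
    (19.4, 3.0, 0.31, 0.27, 0.54, 0.81, 0.08, 0.11, 0.19),
    (19.4, 4.0, 0.41, 0.37, 0.52, 0.89, 0.06, 0.05, 0.11),
    (19.4, 6.0, 0.62, 0.55, 0.42, 0.97, 0.02, 0.01, 0.03),
    (500.0, 10.0, 0.04, 0.24, 0.32, 0.56, 0.15, 0.29, 0.44),
    (500.0, 30.0, 0.12, 0.34, 0.42, 0.76, 0.13, 0.11, 0.24),
]

ROUNDING = 0.011  # entries are quoted to two decimal places

worst_sum_error = 0.0
worst_x_error = 0.0
for w, p_perp, x_perp, qq, qg, q_any, gq, gg, g_any in TABLE_III:
    worst_sum_error = max(
        worst_sum_error,
        abs(qq + qg - q_any),
        abs(gq + gg - g_any),
        abs(q_any + g_any - 1.0),
    )
    worst_x_error = max(worst_x_error, abs(2.0 * p_perp / w - x_perp))

print(worst_sum_error < ROUNDING, worst_x_error < 0.006)
```

All ten rows satisfy the relations within rounding, which also confirms the 45% + 27% = 72% decomposition quoted in the text for $W = 53$ GeV and $p_\perp = 4.0$ GeV/c.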

As noted in Ref. 57, the quark scattering model in FF1 and FFF agreed quite well with the jet cross section observed experimentally at $W = 19.4$ GeV and $3 < p_\perp < 6.0$ GeV/c. We might now be concerned that the QCD results for the jet cross section are larger than FF1 by a factor of about 5 in this region. We have, however, previously been somewhat naive when comparing theory with experiment. What we show in Fig. 14 is the cross section for producing a quark (or gluon) with a given momentum (divided by the $\pi^0$ cross section at the same momentum). However, as we noticed in Ref. 27, quarks of a given momentum (equal to their energy) cannot produce jets with the momentum of all particles equal to the energy of all particles. Our jet model in FF2 gives $E_{\rm tot} - P_{\rm tot} \approx 1.2$ GeV for quark jets. Since the cross section for producing jets falls so steeply, the cross section for producing a jet with a given $P_{\rm tot}$ is considerably smaller than that for producing one with a given $E_{\rm tot}$. As explained in Ref. 41, it is the former that is more closely connected to what is measured experimentally. At $W = 19.4$ GeV, we estimate that the cross section to produce a jet where $P_{\rm tot} = 5$ GeV/c at 90° is about 10 times smaller than the cross section to produce a jet whose $E_{\rm tot} = 5$ GeV. If we correct for this effect, the new QCD prediction at $W = 19.4$ GeV is within a factor of 2 of the old (incorrectly interpreted) FF1 results that appeared to agree so well with the data.
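The size of the quoted factor of 10 can be put in perspective with a rough estimate. The sketch below assumes, purely for illustration, that the jet cross section at fixed $W$ falls locally like a power of the parton energy, and that a jet with total momentum $P_{\rm tot}$ requires a parton of energy $P_{\rm tot} + 1.2$ GeV. Both the power-law form and the function names are assumptions of this sketch, not statements from the text.

```python
import math

DELTA = 1.2  # GeV, the E_tot - P_tot difference quoted for quark jets


def suppression(p_tot, power, delta=DELTA):
    """Ratio sigma(E = p_tot + delta) / sigma(E = p_tot) for sigma ~ E^(-power)."""
    return ((p_tot + delta) / p_tot) ** (-power)


def power_for_suppression(p_tot, factor, delta=DELTA):
    """Local power needed so that shifting E by delta reduces sigma by `factor`."""
    return math.log(factor) / math.log((p_tot + delta) / p_tot)


# Power implied by a factor-of-10 reduction at P_tot = 5 GeV/c.
needed_power = power_for_suppression(5.0, 10.0)

# Round trip: that power gives back a suppression of 1/10.
check = suppression(5.0, needed_power)

print(round(needed_power, 1), round(check, 3))
```

Under these assumptions a local power of roughly 11 is needed, which is plausible only because the cross section is very steep at this low energy and large $x_\perp$.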

The difference between $E_{\rm tot}$ and $P_{\rm tot}$ of a jet arises, of course, from low-momentum particles that have energy due to their mass (or $k_\perp$) but have little momentum $p_\perp$. This is tangled with the experimental uncertainty in all hadron-jet experiments concerning low-$p_\perp$ particles. One cannot be sure that one is not losing the low-$p_\perp$ jet particles that are not well collimated or gaining low-$p_\perp$ background from the beam and target jets in Fig. 1. Only by doing a very careful analysis, including the precise acceptances of a given experiment, can one distinguish between the results of FF1 and the new QCD approach in spite of their rather large differences. One might hope someday to distinguish experimentally between gluon and quark jets. The gluon jets are assumed to have a higher multiplicity of particles each with lower momentum on the average. In addition, unlike the quark jets discussed in Ref. 27, gluon jets will carry on the average no net charge (or strangeness, etc.).

TABLE IV. Fraction of the $pp \to \pi^0 + X$ and $pp \to {\rm jet} + X$ cross section at $\theta_{\rm c.m.} = 90°$ arising from the case where the toward-side (or trigger) constituent is a $u$, $d$, or antiquarks $Q = \bar u + \bar d + \bar s$ or a gluon. Also shown is the fraction arising from the case where the recoiling or away-side constituent is a $u$, $d$, or antiquarks $Q$ or a gluon.

                                        Toward side                  Away side
W (GeV)  p_⊥ (GeV/c)  Trigger   u     d     Q     Gluon     u     d     Q     Gluon
53       2            π⁰        0.20  0.20  0.18  0.43      0.14  0.09  0.06  0.70
53       2            jet       0.12  0.10  0.14  0.64      0.18  0.09  0.08  0.63
53       4            π⁰        0.36  0.25  0.09  0.27      0.23  0.09  0.05  0.62
53       4            jet       0.12  0.09  0.05  0.73      0.20  0.08  0.02  0.69
53       7            π⁰        0.55  0.26  0.05  0.13      0.28  0.14  0.05  0.53
19.4     3.0          π⁰        0.46  0.29  0.06  0.19      0.20  0.11  0.03  0.66
19.4     3.0          jet       0.13  0.08  0.04  0.74      0.21  0.08  0.02  0.68
19.4     6.0          π⁰        0.73  0.24  0     0.03      0.42  0.15  0     0.43
19.4     6.0          jet       0.37  0.17  0     0.45      0.36  0.14  0     0.50
500      10           π⁰        0.18  0.17  0.18  0.44      0.21  0.08  0.09  0.60
500      10           jet       0.05  0.05  0.06  0.81      0.13  0.11  0.07  0.66
500      30           π⁰        0.37  0.23  0.14  0.24      0.29  0.12  0.05  0.53
500      30           jet       0.11  0.08  0.10  0.69      0.21  0.11  0.06  0.60
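The contrast between single-particle and jet triggers can be read directly off the toward-side columns of Table IV. The short sketch below sums the $u$, $d$, and antiquark fractions for a few rows; the dictionary layout and the function name `quark_fraction` are illustrative choices, and only the numbers are taken from the table.

```python
# Toward-side fractions (u, d, antiquarks Q, gluon) from Table IV,
# keyed by (W in GeV, p_perp in GeV/c, trigger type).
TOWARD = {
    (53.0, 4.0, "pi0"): (0.36, 0.25, 0.09, 0.27),
    (53.0, 4.0, "jet"): (0.12, 0.09, 0.05, 0.73),
    (19.4, 6.0, "pi0"): (0.73, 0.24, 0.00, 0.03),
    (19.4, 6.0, "jet"): (0.37, 0.17, 0.00, 0.45),
}


def quark_fraction(row):
    """Fraction of triggers whose toward-side constituent is a quark or antiquark."""
    u, d, antiquarks, _gluon = row
    return u + d + antiquarks


for (w, p_perp, trigger), row in sorted(TOWARD.items()):
    print(w, p_perp, trigger, round(quark_fraction(row), 2), row[3])
```

For both kinematic points the quark share is much larger for the $\pi^0$ trigger than for the jet trigger, which is the "trigger bias" described in the text, while the gluon share of jet triggers reproduces the quoted 73% and 45%.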


C. The toward-side correlations


The jet physics, discussed in the previous subsection, directly tests that particles at high $p_\perp$ are not produced singly but, rather, are members of a cluster. This aspect of constituent models can also be tested in single-particle triggers by observing the accompanying particles produced near the trigger (in phase space). Experimentally, one observes an enhancement of particles with high $p_\perp$ accompanying the high-$p_\perp$ trigger which was predicted correctly from the quark scattering model?**! together with an assumption about the double fragmentation function, $D_q^{h_1 h_2}(z_1, z_2)$ (Ref. 27). In the QCD case, the trigger hadrons usually come from quarks rather than gluons and, furthermore, the mean values of $z_c$ shown in Table II are similar to those in FFF.5* Thus the new predictions, as shown in Fig. 15, are, in fact, very similar to the quark scattering results. However, the QCD results have an additional uncertainty due to our lack of knowledge of the gluon double fragmentation functions, $D_g^{h_1 h_2}(z_1, z_2, Q^2)$. In Fig. 15, where 18% of the triggers come from gluons, we have assumed that the accompanying particles were just given by the fragmentation of a gluon with momentum $(1 - z_c)$ times that of the trigger gluon. This has the feature of giving the correct multiplicity for the produced hadrons; however, it is surely not exact. For example, for a $\pi^+$ trigger, one would expect to find more high-$p_\perp$ $\pi^-$'s than $\pi^+$'s in the accompanying particles. In Ref. 22, an upper estimate for the effect of the gluon decays was deduced by assuming that the accompanying hadrons came from a $q$ (or $\bar q$) jet carrying all of the remaining momentum. Comparing with the CCHK data, this upper estimate and the lower estimate obtained by dropping the gluon term completely roughly bracket the data.

In summary, we find that the QCD calculations 
have somewhat greater uncertainty than those in 
FFF coming from the gluon contribution. How- 
ever, both models are in good agreement with the 
same-side correlation data. 


D. Away-side correlations 


1. Away-side multiplicity $n(z_p)$


An important consequence of the QCD approach is that the number of away-side hadrons with large $p_\perp$ ($-p_x$) is predicted to be considerably smaller than in the quark-quark scattering approach. Figures 16, 17, and 18 show that the number of away hadrons carrying a certain fraction $z_p$ of the trigger momentum is predicted to be 3 to 4 times less than the FFF results, and now agrees quite well with experiment. This reduction in the away-side multiplicity function, $n(z_p)$, is due to three factors. First, we have increased $\langle k_\perp \rangle_{h\to q}$ from 500 to 848 MeV. This results in the large Q, values shown in Table I (compare with Table V in FFF) and thus in a reduction of $n(z_p)$. Second, the fragmentation functions $D_q^{h}(z, Q^2)$ decrease as $Q^2$ increases (see Fig. 5) and are smaller at high $z$ than the FFF values (which now correspond to $Q^2 = Q_0^2 = 4$ GeV$^2$). Finally, in the QCD approach, the away-side constituent is quite often a gluon (see Tables III and IV) which produces on the average fewer hadrons at large $z_p$ than do quarks (see Fig. 19). However, as Table V and Fig. 16 show, the number of away hadrons with $z_p \gtrsim 0.5$ arising from gluon jets is still about half the total. (The fraction decreases as $x_\perp$ increases.) This means that the away-side multiplicity $n(z_p)$ is sensitive to the essentially unknown gluon distributions $G_{p\to g}(x, Q^2)$ and $D_g^{h}(z, Q^2)$.

FIG. 13. (a) Comparison of the data (Refs. 73 and 80) on the $\pi^+/\pi^-$ ratio in $pp$ collisions at $\theta_{\rm c.m.} = 90°$ versus $x_\perp$ with the quark-quark scattering model of FF1 (which is independent of $W$ at fixed $x_\perp$) and the QCD results using $\Lambda = 0.4$ GeV/c. The QCD results are plotted for $W = 19.4$ GeV and are not precisely independent of $W$. The $\pi^+/\pi^-$ ratio increases at fixed $x_\perp$ and $\theta_{\rm c.m.}$ as $W$ increases (by about 20% in going from 19.4 to 53 GeV). (b) Comparison of the data (Ref. 74) at $W = 53$ GeV on $pp \to (\pi^+/\pi^-) + X$ at $\theta_{\rm c.m.} = 90°$ versus $x_\perp$ with the QCD results.

FIG. 14. Prediction of the jet-to-single-$\pi^0$ ratio at $\theta_{\rm c.m.} = 90°$ versus $x_\perp$ for $W = 500$, 53, and 19.4 GeV from the QCD approach using $\Lambda = 0.4$ GeV/c. The jet cross section is defined as the cross section for producing a parton (quark + antiquark + gluon) with the given $x_\perp$. Also shown is the prediction from the quark scattering model of FF1 which is independent of $W$ at fixed $x_\perp$ and $\theta_{\rm c.m.}$.

FIG. 15. Toward-side correlation measurements from the CCRS collaboration (Ref. 75) together with the predictions of the QCD approach with $\Lambda = 0.4$ GeV/c and the results of the quark-quark black-box model of FFF. Possible background contributions from the fragmentation of the beam and target are not included.

[Figure 16 plot: CCRS $\pi^0$ trigger with $p_\perp > 3$ GeV/c; “opposite side” charged particles; data at $\sqrt{s} = 62.7$ and 30.6 GeV with QCD, QCD (no gluons), and FFF curves; horizontal axis in GeV/c.]

FIG. 16. Opposite or away-side correlation measurements from the CCRS collaboration (Ref. 75) together with the prediction of the QCD approach with $\Lambda = 0.4$ GeV/c and the results of the quark-quark black-box model of FFF. Possible background contributions from the beam and target jets have not been included.

FIG. 17. The dependence on the trigger $p_\perp$ of the away-side hadron multiplicity $n(z_p) = (1/\sigma)\,d\sigma/dz_p$, where $z_p = -p_x({\rm away})/p_\perp({\rm trig})$, from the British-French-Scandinavian collaboration (Ref. 43) on $pp \to h_1^\pm + h_2^\pm + X$ at $W = 53$ GeV, $\theta_1 = 90°$ and with an away-side acceptance of 25° in $\phi$ and $|Y_2| < 1$, $|p_y| < 0.5$ GeV/c. The predictions from the QCD approach with $\Lambda = 0.4$ GeV/c (solid curves) and the results of the quark-quark black-box model of FFF (dash-dot curves) are shown. Background contributions from the fragmentation of the beam and target (see Fig. 1 and Fig. 6) which might be important for low-$p_\perp$ triggers have not been included in either the QCD or FFF predictions.

FIG. 18. The dependence on the trigger $p_\perp$ of the number of away-side hadrons per trigger with $z_p \geq 0.5$ (a) and $z_p \geq 1.0$ (b) from the CCHK collaboration (Ref. 15) on $pp \to h_1^\pm + h_2^\pm + X$ at $W = 53$ GeV, and $\theta_1$ averaged over 45° and 20° with an away-side acceptance of 40° in $\phi$ and $|Y_2| \leq 3$. The predictions from the QCD approach with $\Lambda = 0.4$ GeV/c (solid curves) and the results from the quark-quark black-box model of FFF (dashed curves) are shown. Background contributions from the beam and target jets (see Fig. 1 and Fig. 6) which might be important for low-$p_\perp$ triggers have not been included in either the QCD or FFF predictions.

For both the QCD approach and the quark scattering model, the away-side multiplicity function, $n(z_p)$, is roughly independent of the trigger momentum over the range $2.0 < p_\perp$(trig) $< 6.0$ GeV/c at $W = 53$ GeV. This means that the rise in the data (Figs. 17 and 18) at small $p_\perp$(trig) must be ascribed to “background” from the beam and target jets (see Fig. 1). We do not know how to calculate this properly at present, but estimates we have made indicate that this is indeed possible. The rise at small $p_\perp$(trig) in the CCHK experiment (Fig. 18) is larger than that seen in experiment R-413 (Fig. 17) because the former has an away-side rapidity cut of $|Y| < 3$ while the latter has $|Y| < 1$. The CCHK experiment thus receives a larger background contamination at low $p_\perp$(trig), particularly from the Type I background shown in Fig. 6.

FIG. 19. The number of away-side charged hadrons per trigger, $n(z_p)$, arising from the case that the away-side constituent is a quark or antiquark (solid curve) or a gluon (dashed curve) from the QCD approach with $\Lambda = 0.4$ GeV/c. The results are calculated for $pp \to \pi^0 + h^\pm + X$ at $W = 53$ GeV, $\theta_1 = 90°$ with $3.0 \le p_\perp$(trig) $\le 4.5$ GeV/c.


2. Away-side particle ratios 


Another effect of the presence of gluons is that the away-side positive to negative particle ratios at the ISR (low $x_\perp$) are predicted to be considerably different than in FFF. Figure 20 shows that the QCD approach yields almost equal numbers of positives and negatives for $p_\perp$(away) $> 1.5$ GeV/c at $W = 53$ GeV and $3.0 < p_\perp$(trig) $< 4.0$ GeV/c in agreement with the recent ISR data. If the away-side constituent is always a quark or antiquark as in FFF, then this ratio is predicted to be about 1.5 in gross disagreement with the experiment. However, both the FFF model and the QCD approach predict little dependence of the away-side particle ratios on the type of trigger species (i.e., $\pi^+$, $\pi^-$, $K^+$, $K^-$). This is because the scattering forces do not involve flavor exchange (they are due to gluon exchange). Neither the QCD approach nor the FFF model can explain the apparently large increase in the away-side positive to negative ratio when triggering on $K^-$ observed by R-413 (Fig. 20). The discrepancy can be seen more clearly in Fig. 21 where we compare the predictions for the away-side rapidity spectrum of positives and negatives for a $\pi^-$ and $K^-$ trigger with the preliminary R-413 data.

This question of the flavor dependence of the constituent subprocesses is an important one. In models such as the constituent-interchange model (CIM), the scattering forces arise from the exchange of quarks which carry flavor. In these models, drastic changes can occur in the away-side particle ratios as one changes trigger species. Figure 22 shows data from the Fermilab experiment E-494 (Ref. 64) on the away-side multiplicity of $\pi^+$, $\pi^-$, $K^+$, and $K^-$ with $z_p \ge 0.5$ for a trigger meson of type $\pi^+$, $\pi^-$, $K^+$, $K^-$ at $W = 27.4$ GeV and $3.0 < p_\perp$(trig) $< 5.0$ GeV/c compared to the QCD predictions. The agreement is quite good. The away-side ratios are roughly independent of the trigger species and given approximately by the single-particle ratios (shown by the wiggly arrows along the side), which is just as expected for a flavorless-exchange constituent subprocess. There is a slight disagreement for the $K^-$ trigger but the data do not show the large positive plus negative sum seen by R-413. In fact, the away-side number of $K^+$ mesons with a $K^-$ trigger is correctly predicted. The data in Fig. 22 from E-494 are taken off a beryllium target and there are A-dependence corrections (we have made no A-dependence correction to our theoretical predictions) that make direct comparison a bit dangerous. Because of this and because of the apparent disagreement between R-413 and E-494 for $K^-$ triggers, the question as to whether or not there is any evidence for flavor exchange in the constituent subprocess is unsettled.

TABLE V. Total away-side multiplicity (Ref. 59), $N(z_p \ge 0.5)$, per trigger for charged hadrons in the processes $pp \to \pi^0 + h^\pm + X$ at $\theta_1 = 90°$ predicted from the QCD approach with $\Lambda = 0.4$ GeV/c. Also shown are the individual contributions to the multiplicity for $z_p \ge 0.5$ from gluon and quark fragmentation. [The function $N(z_p, p_\perp, \theta_1)$ is defined by Eqs. (6.1) and (6.2) in FFF.]

W (GeV)   p_perp (GeV/c)   x_perp   Quark fragmentation   Gluon fragmentation   Total (z_p >= 0.5)
53        2.0              0.08     0.047                 0.083                 0.130
53        3.4              0.13     0.052                 0.050                 0.102
53        4.5              0.17     0.058                 0.047                 0.105
53        5.3              0.20     0.063                 0.038                 0.101
53        9.3              0.35     0.060                 0.022                 0.082
53        13.2             0.50     0.058                 0.014                 0.072

FIG. 20. The number of away-side positive and negative hadrons with $p_\perp$(away) $> 1.5$ GeV/c per trigger with $3.0 \le p_\perp$(trig) $\le 4.5$ GeV/c from the BFS collaboration (R-413) (Ref. 43) on $pp \to h_1 + h^\pm + X$ where $W = 53$ GeV and $\theta_1 = 90°$ and $|Y_2| < 1.0$. The results for $\pi^+$, $\pi^-$, $K^+$, and $K^-$ triggers are shown and compared to the predictions of the quark-quark black-box model of FFF and the QCD approach with $\Lambda = 0.4$ GeV/c (open and solid squares). Background contributions from the beam and target jets (see Figs. 1 and 6) have not been included in either the QCD or FFF predictions.

FIG. 21. The away-side rapidity distributions, $(1/N_{\rm trig})\,dN/dY$, of positive and negative hadrons with $p_\perp$(away) $> 1.5$ GeV/c for 90° $\pi^-$ and $K^-$ triggers in the range $3.0 \le p_\perp$(trig) $\le 4.5$ GeV/c from the BFS collaboration (R-413) (Ref. 43). Predictions of the QCD approach with $\Lambda = 0.4$ GeV/c are shown, where background contributions from the beam and target jets (see Figs. 1 and 6) have not been included.
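
The statement that gluon jets supply about half of the away-side hadrons with $z_p \ge 0.5$, with a share that falls as $x_\perp$ grows, can be read directly off Table V. The short Python sketch below recomputes that share from the printed rows. The function names are illustrative, and the relation $x_\perp = 2p_\perp/W$ used in the consistency check is an assumption (the usual definition, not restated in this section).

```python
# Rows of Table V as printed, all at W = 53 GeV and z_p >= 0.5:
# (p_perp [GeV/c], x_perp, quark part, gluon part, total)
TABLE_V = [
    (2.0, 0.08, 0.047, 0.083, 0.130),
    (3.4, 0.13, 0.052, 0.050, 0.102),
    (4.5, 0.17, 0.058, 0.047, 0.105),
    (5.3, 0.20, 0.063, 0.038, 0.101),
    (9.3, 0.35, 0.060, 0.022, 0.082),
    (13.2, 0.50, 0.058, 0.014, 0.072),
]
W = 53.0  # c.m. energy in GeV


def gluon_fractions(rows):
    """Return (x_perp, gluon share of the total multiplicity) for each row."""
    return [(x, gluon / total) for _, x, _, gluon, total in rows]


def check_rows(rows, w, tol=0.006):
    """Check quark + gluon = total and (assumed) x_perp = 2 p_perp / W within rounding."""
    for p, x, quark, gluon, total in rows:
        if abs(quark + gluon - total) > 1e-9:
            return False
        if abs(2.0 * p / w - x) > tol:
            return False
    return True


if __name__ == "__main__":
    print("rows consistent:", check_rows(TABLE_V, W))
    for x, share in gluon_fractions(TABLE_V):
        print(f"x_perp = {x:.2f}  gluon share = {share:.2f}")
```

The share is a plain ratio of two table columns, so it inherits the rounding of the printed values and says nothing about the uncertainty of the gluon distributions themselves.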


3. $P_{\rm out}$


Due to our use of $\langle k_\perp\rangle_{h\to q} = 848$ MeV and $\langle k_\perp\rangle_{q\to h} = 439$ MeV, the mean values of $P_{\rm out}$ are predicted to be considerably larger than the results of FFF ($\langle k_\perp\rangle_{h\to q} = 500$ MeV, $\langle k_\perp\rangle_{q\to h} = 330$ MeV). In Figs. 23 and 24, we compare both the new QCD results and the FFF results with the mean values of $|P_{\rm out}|$ obtained in the CCHK experiment. The value of $\langle k_\perp\rangle_{h\to q} = 848$ MeV, obtained from the fit to the data on $pp \to \mu^+\mu^- + X$ shown in Fig. 7, results in $\langle P_{\rm out}\rangle$ values that agree better with the hadron experiments, although they are still a bit small. (Some of the discrepancies may be due to contributions from the beam and target jets omitted in our analysis.)
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FIG, 22, The number of away-side mesons (of type 7”, 
+, K*, K”) with 2 =0.75 per trigger (f type 7, 7, 
K*, K°) from the Fermilab experiment E494 (Ref, 64), 
The data are taken at W = 27.4 GeV with 3.0 =p, (trig) 
=4,0 for proton-beryllium collisions and are compared 
with the prediction of the QCD approach (for proton-pro- 
ton collisions) with A=0.4 GeV/c, No A-dependence 
corrections have been made to the theory or the data. 


4. Experimental tests for effects due to $\langle k_\perp\rangle_{h\to q}$

As seen in Figs. 2 and 8, the basic constituent subprocess of QCD (before smearing) behaves roughly like $1/p_\perp^6$ at fixed $x_\perp$ for $2 < p_\perp < 10$ GeV/c. The experimentally observed $1/p_\perp^8$ behavior is obtained by including the effects of smearing ($\langle k_\perp\rangle_{h\to q} \ne 0$) which raise the small-$p_\perp$ prediction while leaving the large-$p_\perp$ region essentially unchanged. This increase at small $p_\perp$, due to the “trigger bias” effect, can be partially removed by triggering on events with equally large $p_\perp$'s on the toward and away side (i.e., $z_p \approx 1$). Thus, in general, we expect the $p_\perp$ dependence of the two-particle back-to-back cross section to differ (in the region where smearing is an important effect) from that of the single-particle cross section. This is seen in Fig. 25 where we plot the two-particle back-to-back cross section $d\sigma/dz_p$ at $z_p = 1$ (times $p_\perp^6$) versus $p_\perp$ at $x_\perp = 0.35$. It behaves roughly like $1/p_\perp^6$ over the range $4 \le p_\perp \le 6.0$ GeV/c whereas the single-particle cross-section results, when multiplied by $p_\perp^8$, are roughly independent of $p_\perp$ over the range. The two-particle back-to-back cross section $d\sigma/dz_p$ ($z_p \approx 1$) reflects more closely the dependence on $p_\perp$ of the basic subprocess without the additional scale breaking due to smearing.

The predictions in Fig. 25 (and in all figures in this paper) are free from any beam and target jet background of the type discussed in Sec. IVD1 above. As seen in Fig. 18, below $p_\perp = 3.5$ GeV/c this background is important and is presumably the cause of the rise of $dN/dz_p$ at low $p_\perp$(trig). Any such increase of the expected $dN/dz_p$ at low $p_\perp$ due to background would vitiate the comparison in Fig. 25 by making it behave similarly to the single-particle cross section. The test must be performed at $p_\perp$'s large enough so that the background contamination is negligible. This is why we calculated the results in Fig. 25 at $x_\perp = 0.35$ so that $p_\perp \ge 4$ GeV/c.

FIG. 23. The dependence on $z_p$ of the mean value of the $|P_{\rm out}|$ of away-side charged hadrons at $W = 53$ GeV and $2.0 \le p_\perp$(trig) $\le 4.0$ GeV/c with $\theta_1$ averaged over 45° and 20° from the CCHK collaboration (Ref. 15) on $pp \to h_1^\pm + h_2^\pm + X$. The predictions from the QCD approach at $\theta_1 = 45°$ with $\Lambda = 0.4$ GeV/c and $\langle k_\perp\rangle_{h\to q} = 848$ MeV (solid curve) and the results of FFF (dashed curve) are shown.

FIG. 24. The dependence on the trigger $p_\perp$ of the mean value of the $|P_{\rm out}|$ of away-side charged hadrons with $0.6 \le z_p \le 0.7$ at $W = 53$ GeV and $\theta_1$ averaged over 45° and 20° from the CCHK collaboration (Ref. 15) on $pp \to h_1^\pm + h_2^\pm + X$. The predictions for $\theta_1 = 45°$ from the QCD approach with $\Lambda = 0.4$ GeV/c and $\langle k_\perp\rangle_{h\to q} = 848$ MeV, $\langle k_\perp\rangle_{q\to h} = 439$ MeV (solid curve) and the FFF results (dashed curve) are shown.

FIG. 25. Comparison of the behavior of $p_\perp^8$ times the single-charged-particle cross section $E\,d\sigma/d^3p$ ($pp \to h^\pm + X$), and $p_\perp^6$ times the two-particle back-to-back cross section $d\sigma/dz_p|_{z_p=1}$ ($pp \to h_1^\pm + h_2^\pm + X$) at fixed $x_\perp = 0.35$ (times 40). The QCD predictions are calculated at $\theta = 90°$ with $\Lambda = 0.4$ GeV/c and $\langle k_\perp\rangle_{h\to q} = 848$ MeV.


E. Very-high-energy expectations 


Figure 8 shows that the QCD predictions quickly deviate from a $1/p_\perp^8$ behavior (at fixed $x_\perp$) as the $p_\perp$ increases, yielding a much larger cross section than expected from the black-box model. This is also seen in Fig. 26 where we plot the QCD predictions for $p_\perp^8$ times $E\,d\sigma/d^3p$ versus $p_\perp$ at $x_\perp = 0.05$ and $\theta_{\rm c.m.} = 90°$. At $W = 500$ GeV, the QCD results are a factor of 100 larger than a straight ($1/p_\perp^8$) extrapolation and show a factor of 1000 increase at $W = 1000$ GeV. In Fig. 27 we display the predictions for 90° $\pi^0$ and jet production at fixed $W = 53$, 500, and 1000 GeV versus $p_\perp$. The preliminary high-$p_\perp$ data from CCOR (Ref. 67) at $W = 53$ GeV are also shown. The black-box model and the QCD predictions agree with each other and both agree with the data. By going to higher energy, one can easily discriminate between the two approaches. For example, at $W = 500$ GeV and $p_\perp = 30$ GeV/c, the $\pi^0$ (jet) cross section from QCD is roughly a factor of 100 (500) times larger than the FF1 results. In fact, the $p_\perp = 30$ GeV/c 90° $\pi^0$ cross section at $W = 500$ GeV is predicted in the QCD approach to be about the same magnitude as that measured at $p_\perp = 6.0$ GeV/c at Fermilab ($W = 19.4$ GeV).

These large single-particle and jet cross sections (see also Fig. 14) predicted by QCD, if correct, will make it very difficult, if not impossible, to find the W boson (and other new particles) from its $q\bar q$ or jet-jet decay. Quigg showed that even the black-box ($1/p_\perp^8$) model extrapolation led to a W signal that was, at best, 10 times the hadronic (jet-jet) background. The factor of 500 (1000) increase in this background at $W = 500$ (1000) GeV predicted is obviously fatal. However, if one indeed observes such large production rates for single particles and jets, then QCD will be verified and this may be as important as discovering the W boson.

FIG. 26. The behavior of $p_\perp^8$ times the 90° single-$\pi^0$ cross section, $E\,d\sigma/d^3p$, at $x_\perp = 0.05$ versus $p_\perp$ calculated from the QCD approach with $\Lambda = 0.4$ GeV/c (solid curve) and $\Lambda = 0.6$ GeV/c (dashed curve). The two low-$p_\perp$ data points are at $W = 53$ and 63 GeV (Ref. 74). The predictions are a factor of 100 (1000) times larger than the flat ($p_\perp^{-8}$) extrapolation to $W = 500$ GeV (1000 GeV).

FIG. 27. Comparison of the results on the 90° $\pi^0$ cross section, $E\,d\sigma/d^3p$, from the QCD approach with $\Lambda = 0.4$ GeV/c (solid curves) and the quark-quark black-box model of FF1 (dotted curves). Both models agree with the data at $W = 53$ GeV (crosses = Ref. 52) where the open squares are the “preliminary” data from the CCOR collaboration (Ref. 67) normalized to agree with the lower-$p_\perp$ experiments. The QCD approach results in much larger cross sections than the FF1 model at $W = 500$ and 1000 GeV. The FF1 results at 1000 GeV (not shown) are only slightly larger than the results at 500 GeV. Also shown are the cross sections for producing a jet at 90° (divided by 1000) as predicted by the QCD approach (dashed curves) and the FF1 model (dot-dashed curve).

It is not clear yet precisely what the quark and gluon jets will look like at very high $p_\perp$ (such as $p_\perp = 30$ GeV/c). If QCD is correct, they will certainly not look like the well-collimated $\langle k_\perp\rangle_{q\to h} = 439$ MeV objects we use in this calculation and illustrate in Fig. 1. At $p_\perp = 30$ GeV/c, they should

“appear” to be fatter. This is because as the $p_\perp$ of the outgoing quark is increased, it becomes increasingly more likely that it will radiate a gluon and become two jets (one quark and one gluon). Then, this quark or gluon might radiate, producing still more jets. The net result is that most of the time it will look as if there is one fat jet; however, occasionally when the radiation is hard enough, one will see the two or three distinct subjets. Much theoretical effort is being focused on such questions and we should soon have a good idea of precisely what to expect at very high energies and $p_\perp$'s.


V. SUMMARY AND CONCLUSIONS


If this work is viewed as simply a comparison of one phenomenological model against another (e.g., the QCD approach versus the black-box model), not much can be said to favor one over another. It is true that the QCD approach has fewer free parameters in the parton cross sections $d\hat\sigma/d\hat t$, but there are more free choices in the gluon functions. More excuses are needed concerning background effects at low $p_\perp$, etc.

It is true that the black-box pure quark scheme could not fit the away-side large-$p_\perp$ particle multiplicities and charge ratios, but it probably could be fixed up with the inclusion of gluons. It has become apparent that present high energies are not really high enough to isolate the manifold of effects (parton distributions, fragmentation functions, constituent cross sections, transverse momentum of partons, different kinds of constituents, etc.) that are mixed together so intimately in today’s experiments. If the resolution of this would depend entirely on experiment, we shall have to end this long research with the tiresome and obvious call for still higher energies. At high $p_\perp$, predictions of the QCD approach are orders of magnitude greater than the black-box $p_\perp^{-8}$ extrapolations, so clear tests lie there.

But QCD is more than a phenomenological model. It is a precise and complete theory purporting to be an ultimate explanation of all hadronic experiments of all energies, high and low. There are many reasons to hope and expect it to be right. The question is, is it indeed right? Mathematical complexity has, so far, prevented us from quantitatively testing its correctness. What it predicts is not clearly known. Nevertheless, its property of asymptotic freedom leads us to expect that phenomena of high momentum transfer should be analyzable (by perturbation theory). Yet experiments at what was thought to be high enough $p_\perp$ seemed to show $p_\perp^{-8}$ behavior unlike the expected $p_\perp^{-4}$ (with possible logarithmic modifications). It was a mystery. Although many people said “perhaps the energy is not high enough,” the remark was simply an article of faith; the mechanism leading to an apparent eighth power in the experimental region remained unknown.

We believe we have resolved this mystery, using the QCD theory itself to tell us what might happen in the range in question. There is, from the point of view of QCD, no mystery. The energy ($p_\perp$) is indeed too low and there are too many nonasymptotic effects acting. Results closer to a $p_\perp^{-4}$ falloff should appear only at much higher $p_\perp$ (see Fig. 9). Machines currently planned for these energies will resolve the question of models as soon as they are turned on.

On the other side, there is a great deal of data 
now available at energies and p, values in which 
asymptotic free field theory can make much more 
precise predictions than have yet been made. The 
QCD theory, unlike other phenomenological approaches, is complete mathematically so that a
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A model is analyzed that provides a parametrization of the properties of the jet of 
mesons generated by a fast outgoing quark. It is assumed that the meson that contains the original quark leaves momentum and flavor to a remaining jet in which the particles are distributed (except for scaling of the energy and possible changes of flavor) like those of the original jet. One function, the probability $f(\eta)$ that the remaining jet has a fraction $\eta$ of the momentum of the original jet, is chosen (as a parabola) so the final distribution of charged hadrons agrees with data from lepton experiments. All the properties of quark jets are determined from $f(\eta)$ and three parameters: the degree that SU(3) is broken in the formation of new quark-antiquark pairs ($s\bar s$ is taken as half as likely as $u\bar u$), the spin nature of the primary mesons (assumed to be vector and pseudoscalar with equal probability), and the mean transverse momentum given to these primary mesons. Monte Carlo methods are used to generate typical jets. Analytic approximations are also given. Many features of quark jets are examined. The distribution of momentum of various hadrons $D_q^h(z)$, the properties of the hadrons of largest momentum in the jet, correlations, rapidity-gap distributions, distribution of charge and of transverse momentum are some of the subjects discussed. The appearance of the jets to an instrument sensitive only to particles above some minimum momentum is also described. Although the model is probably not a true description of the physical mechanism responsible for quark jets, many predictions of the model seem quite reasonable, possibly much like real quark jets (except that the possibility of the emission of baryons is disregarded). The purpose of this work is to provide a model useful in the design of experiments in which quark jets may be observed, and further to provide a standard to facilitate the comparison of lepton-generated jets with the high-$p_\perp$ jets found in hadron collisions.


1. Introduction 


Recent data from ISR [1,2] and Fermilab [3] indicate that the “jets” observed in large-$p_\perp$ hadron-hadron collisions are similar to those in processes initiated by leptons (i.e., $e^+e^-$, $ep$, and $\nu p$ processes). The “jets” observed in both cases are thought to arise from quarks that fragment or cascade into a collection of hadrons moving in roughly the direction of the original quark.


* Work supported in part by the US Energy Research and Development Administration under 
Contract No. EY76-C-03-0068. 
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The experimental verification of this should proceed simultaneously in two 
directions. First, the detailed properties of the jets produced in lepton reactions must be examined. (Incidentally, we must confirm via charge properties, etc., that these jets could actually arise from quarks.) Secondly, the hadron-initiated large-$p_\perp$ jets must be compared in great detail to the lepton jets to determine if they are actually identical. (Some theorists believe that gluon jets will be produced at large $p_\perp$ in hadron-hadron collisions in addition to quark jets.) At the present time there is little data of either kind. However, hadron “jet trigger” experiments are proceeding or are planned. There is no comprehensive theory of the details of the jet structure that we should expect to observe. So, it will be difficult to know in, say, a hadron experiment what data to collect and how to summarize it so that it will be useful to compare to some future lepton experiment which will probably not measure precisely the same thing. We thought it might prove useful to have some easy-to-analyze “standard” jet structure to compare to. Thus, a hadron experiment could say “the real jets differ from the ‘standard’ in such and such a way”, and the lepton experiment could then see whether they deviated from the same ‘standard’ in a similar way.

In a previous paper [4] (hereafter called FF1), we used limited experimental data aided by some theoretical ideas to suggest parametrizations for the functions $D_q^h(z)$, the mean number of hadrons of type $h$ and momentum fraction $z$ (per $dz$) in a jet initiated by a quark of flavor $q$ with high momentum. The model presented here provides a new, much simpler, parametrization for these functions, making only negligible changes for those functions that were determined by experiment and otherwise in agreement with all the theoretical ideas in FF1.

A virtue of the new model is that it gives detailed answers to many other questions such as the number of correlated pairs of hadrons $h_1$, $h_2$ at $z_1$ and $z_2$, or the distributions of momentum gaps containing no hadrons, or the probability of observing various total jet charges, etc. In addition, one can ask for the probability that a quark of large momentum fragments so that the sum of the fractional momenta $z_1, z_2, \ldots$ of all those hadrons with $z_i > z_{\min}$ is $z$. The latter is useful in analyzing “jet trigger” experiments. In this type of experiment, one triggers on a collection of particles that sum to give a large $p_\perp$. It is experimentally very difficult to define a “jet”. One can never be sure that all the low-momentum particles from the quark are included or that one has not included some extra low-$p_\perp$ particles from the background of particles moving in the beam or target jets and not properly belonging to the transverse jet. These experiments would be much cleaner if one sets a threshold $p_{\perp 0}$, say, 500 MeV, for the transverse momenta of the particles whose total momentum makes up the trigger (so $z_{\min} = p_{\perp 0}/p_\perp$(quark)).
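
The thresholded trigger sum described above is simple to state in code. The following Python sketch is only an illustration: the function names and the example list of fractional momenta are hypothetical, and the sole input taken from the text is the relation $z_{\min} = p_{\perp 0}/p_\perp$(quark).

```python
def z_min(p_threshold, p_quark):
    """Smallest fractional momentum that passes the trigger threshold."""
    return p_threshold / p_quark


def triggered_fraction(z_values, p_threshold, p_quark):
    """Sum of the fractional momenta z_i of the hadrons with z_i > z_min."""
    cut = z_min(p_threshold, p_quark)
    return sum(z for z in z_values if z > cut)


if __name__ == "__main__":
    # Hypothetical jet: fractional momenta of the hadrons from one quark.
    example_jet = [0.45, 0.25, 0.15, 0.08, 0.04, 0.03]
    # A 0.5 GeV/c threshold and a 5 GeV/c quark give z_min = 0.1.
    print(triggered_fraction(example_jet, 0.5, 5.0))
```

For the example list, the three hadrons above the cut carry 85% of the quark momentum, which is the quantity $z$ whose distribution the model is asked to predict.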

We generate typical jets using Monte Carlo methods but provide an analytic 
approximation for the convenience of the reader. The predictions of the model are 
reasonable enough physically that we expect it may be close enough to reality to be 
useful in designing future experiments and to serve as a reasonable approximation to 
compare to data. We do not think of the model as a sound physical theory, and discuss why in a later section. Also, as worked out here, the model does not include
baryons. We must imagine the jets to contain only mesons because we have so little 
knowledge at present of what baryons to expect and the character of the model 
does not make a clear suggestion. It also makes no clear suggestion of what correla- 
tions in transverse momentum of the hadrons to expect. We have added an addi- 
tional assumption to determine this which seems reasonable to us but may be far 
from the physical situation. Some of the features of the transverse-momentum dis- 
tributions are, however, strongly affected by the fact that the particles observed are 
often products of decays of higher resonances. These effects are not dependent on 
the details of our jet model and should be important also in analyzing the beam and 
target jets in ordinary inelastic hadron-hadron collisions. 

Our quark-jet model involves one arbitrary function, the probability $f(\eta)$ that the hadron containing the original quark leaves the remaining jet a fraction $\eta$ of its momentum. This ultimately determines the momentum distribution of hadrons.

We have found that taking $f(\eta)$ to be a parabola with one adjustable parameter results in an adequate fit to the distribution of charged hadrons, $D_u^{h^+}(z) + D_u^{h^-}(z)$, observed in lepton experiments. All the properties of quark jets are then determined from $f(\eta)$ and three additional parameters: the degree that SU(3) is broken in the formation of new quark-antiquark pairs ($s\bar s$ is taken as half as likely as $u\bar u$), the spin of the primary mesons (assumed to be vector and pseudoscalar with equal probability), and the mean transverse momentum given to these primary mesons. This latter parameter is determined by requiring that the final hadrons (after decay) have a mean transverse momentum of about 330 MeV.


2. The model 
2.1. The ansatz 


We assume that quark jets can be analyzed on the basis of a recursive principle. The ansatz is based on the idea that a quark of type “a” coming out at some momentum $W_0$ in the $z$ direction creates a color field in which new quark-antiquark pairs are produced. Quark “a” then combines with an antiquark, say “$\bar b$”, from the new pair $b\bar b$ to form a meson “$a\bar b$” leaving the remaining quark “b” to combine with further antiquarks. The “meson” $a\bar b$ may be directly observed as a pseudoscalar meson, or it may be a vector or higher-spin unstable resonance which subsequently decays into the observed mesons. To avoid complicating the ideas, we will call “$a\bar b$” the “primary” meson state and shall discuss secondary decay processes later. A “hierarchy” of primary mesons is formed of which $a\bar b$ is first in “rank”, $b\bar c$ is second in rank, $c\bar d$ is third in rank, etc., as shown in fig. 1. (The “rank” in “hierarchy” should not be confused with order in momentum, but only order in the flavor relationships. The rank-2 primary meson may sometimes obtain a larger momentum than the rank-1 primary meson.)

[Diagram for fig. 1, read from bottom to top: an original quark of flavor a; new quark pairs $b\bar b$, $c\bar c$, $d\bar d$, ... are formed; “primary” mesons $(\bar b a)$, $(\bar c b)$, $(\bar d c)$ of rank 1, 2, 3 are formed; some “primary” mesons decay, giving the “hierarchy” of final mesons, each labeled with the rank of its parent.]

Fig. 1. Illustration of the “hierarchy” structure of the final mesons produced when a quark of type “a” fragments into hadrons. New quark pairs $b\bar b$, $c\bar c$, etc., are produced and “primary” mesons are formed. The “primary” meson $\bar b a$ that contains the original quark is said to have “rank” one and primary meson $\bar c b$ rank two, etc. Finally, some of the primary mesons decay and we assign all the decay products to have the rank of the parent. The order in “hierarchy” is not the same as order in momentum or rapidity.


The “chain decay” ansatz * assumes that, if the rank-1 primary meson carries away a momentum $\xi_1$ (from a quark jet of type “a” and momentum $W_0$) the remaining cascade starts with a quark of type “b” with momentum $W_1 = W_0 - \xi_1$ and the remaining hadrons are distributed in exactly the same way as the hadrons which come from a jet originated by a quark of type “b” with momentum $W_1$. It is further assumed that for very high momenta, all distributions scale so that they depend only on ratios of the hadron momenta to the quark momenta. Given these assumptions, complete knowledge of the structure of a quark jet is determined by one unknown function $f(\eta)$ and three parameters describing flavor, primary meson spin, and transverse momentum to be discussed later. The function $f(\eta)$ is defined by

$f(\eta)\,d\eta$ = the probability that the first-hierarchy (rank-1) primary meson leaves the fraction of momentum $\eta$ to the remaining cascade, (2.1)

and is normalized so that

$$\int_0^1 f(\eta)\,d\eta = 1. \tag{2.2}$$

The rank-1 primary meson (with momentum fraction $z_1 = \xi_1/W_0$) contains the original quark “a” (see fig. 1) and $\eta = 1 - z_1$. Thus for an initial quark with momentum $W_0$, the probability that the first-hierarchy primary meson has momentum $\xi_1$ in $d\xi_1$ is $f(1 - \xi_1/W_0)\,d\xi_1/W_0$ and the probability that the rank-2 primary meson has momentum $\xi_2$ in $d\xi_2$ is $f(1 - \xi_2/W_1)\,d\xi_2/W_1$, where $W_1 = W_0 - \xi_1$, etc. The probability that we have a hierarchy sequence of primary mesons with the $k$th having momentum $\xi_k$ in $d\xi_k$ is

$$\mathrm{Prob}(\xi_1, \xi_2, \ldots, \xi_k, \ldots)\,d\xi_1\,d\xi_2 \cdots d\xi_k \cdots = \prod_i f(\eta_i)\,d\eta_i, \tag{2.3}$$

where $\eta_i = W_i/W_{i-1}$ with $\xi_i = W_{i-1} - W_i$. That is, $\eta_i = 1 - \xi_i/W_{i-1}$, and $d\eta_i$ is to be replaced by $d\xi_i/W_{i-1}$ with $W_{i-1} = W_0 - \sum_{k<i} \xi_k$.

* We believe this recursive principle was first suggested by Krzywicki and Petersson [6] and by Finkelstein and Peccei [7] in an analysis of proton-proton collisions.
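
The recursion behind eq. (2.3) translates almost line for line into a Monte Carlo loop: draw $\eta_i$ from $f(\eta)$, give the rank-$i$ primary meson $\xi_i = (1-\eta_i)W_{i-1}$, and continue with $W_i = \eta_i W_{i-1}$. The Python sketch below is a minimal illustration of that loop under stated assumptions. The parabola is taken in the form $f(\eta) = 1 - a + 3a\eta^2$, which satisfies (2.2) for any $a$; this specific form, the value $a = 0.77$, and the stopping momentum `w_stop` are assumptions introduced here, not taken from this section. Flavor, spin, and transverse momentum are ignored.

```python
import random


def f(eta, a=0.77):
    """Assumed parabola f(eta) = 1 - a + 3*a*eta**2, normalized on [0, 1]."""
    return 1.0 - a + 3.0 * a * eta * eta


def sample_eta(rng, a=0.77):
    """Draw eta from f by rejection sampling; for 0 <= a <= 1, f <= 1 + 2a on [0, 1]."""
    f_max = 1.0 + 2.0 * a
    while True:
        eta = rng.random()
        if rng.random() * f_max <= f(eta, a):
            return eta


def generate_chain(w0, w_stop, rng, a=0.77):
    """Return ([xi_1, xi_2, ...] in rank order, leftover momentum).

    The cascade is cut off by hand once the remaining momentum is at or
    below w_stop (an assumption of this sketch).
    """
    xis = []
    w = w0
    while w > w_stop:
        eta = sample_eta(rng, a)
        xis.append((1.0 - eta) * w)  # xi_i = W_{i-1} - W_i
        w *= eta                     # W_i = eta_i * W_{i-1}
    return xis, w


if __name__ == "__main__":
    rng = random.Random(1)
    momenta, leftover = generate_chain(10.0, 0.01, rng)
    print(len(momenta), "primary mesons; momentum check:", sum(momenta) + leftover)
```

Because each step only rescales the remaining momentum, the primary-meson momenta plus the leftover always add up to $W_0$, which is a convenient sanity check on any implementation.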


2.2. Single-particle decay distribution F(z) 


The above ansatz leads to an obvious and simple Monte Carlo calculation of a jet as well as to a straightforward recursive integral equation. For example, if we define a single-particle distribution in the quark jet as

$F(z)\,dz$ = the probability of finding any primary meson (independent of hierarchy) with fractional momentum $z$ within $dz$ in a quark jet, (2.4)

then $F(z)$ must satisfy the following integral equation (take $W_0 = 1$)

$$F(z) = f(1-z) + \int_z^1 f(\eta)\,F(z/\eta)\,\frac{d\eta}{\eta}, \tag{2.5}$$

where the limits are automatic since we define $f(1-z) = 0$ and $F(z) = 0$ for $z > 1$ or $z < 0$. Eq. (2.5) arises because the primary meson might be the first in rank (with probability $f(1-z)\,dz$) or if not, then the first-rank primary meson has left a momentum fraction $\eta$ with probability $f(\eta)\,d\eta$, and in this remaining cascade the probability to find $z$ in $dz$ is $F(z/\eta)\,dz/\eta$ by the scaling principle. Dividing out the $dz$ leaves eq. (2.5).

An integral equation for $F(z)$ given $f(\eta)$ as (2.5) involves only differences in rapidity ($Y_z = -\ln z$) * and hence can easily be analyzed by a Fourier transform in rapidity, or equivalently by taking moments in $z$. If we define

$$M(r) = \int_0^1 z^r F(z)\,dz, \tag{2.6a}$$

$$C(r) = \int_0^1 \eta^r f(\eta)\,d\eta, \tag{2.6b}$$

then eq. (2.5) takes the form

$$M(r) = A(r) + C(r)\,M(r) \tag{2.7a}$$

or

$$M(r) = A(r)/(1 - C(r)), \tag{2.7b}$$

where

$$A(r) = \int_0^1 z^r f(1-z)\,dz. \tag{2.7c}$$

* In this paper, we will use the word rapidity to refer either to the “z rapidity” given by $Y_z = -\ln z$ in (3.4) or the true rapidity given by eq. (3.5). Usually the difference between the two will not be important; however, when it is, we label the former by $Y_z$ and the latter by $Y$.

The function $A(r)$ is, of course, related to $C(r)$ for they are determined by the same
function, but the point we wish to make here is more general. The integral kernel of
(2.5) can be inverted algebraically in moment space as

$$1/(1 - C(r)) = 1 + C(r)/(1 - C(r)). \qquad (2.7d)$$

Thus an equation of the form

$$\phi(z) = a(z) + \int f(\eta)\,\phi(z/\eta)\,\frac{d\eta}{\eta} \qquad (2.8a)$$

can be inverted to give

$$\phi(z) = a(z) + \int g(\eta)\,a(z/\eta)\,\frac{d\eta}{\eta}, \qquad (2.8b)$$

where

$$\int \eta^r g(\eta)\,d\eta = C(r)/(1 - C(r)). \qquad (2.8c)$$

For example, (2.5) is solved by

$$F(z) = f(1 - z) + \int_z^1 g(\eta)\,f(1 - z/\eta)\,\frac{d\eta}{\eta}, \qquad (2.9)$$


which can be interpreted in the following way. If a particle is found at $z$, either it is
a first-rank primary meson or else the particles of lower rank have left a momentum
$\eta$, and it is the first of those remaining (i.e., $f(1 - z/\eta)\,dz/\eta$). That is, $g(\eta)\,d\eta$ has the
significance of being the probability that all the primary mesons of lower rank than a
given particle have left momentum fraction $\eta$ of the original momentum of the jet.

The probability normalization of $f(\eta)$ given by (2.2) means that $C(0) = 1$. Then
from eq. (2.5) (or (2.7b) since $A(1) = 1 - C(1)$), one finds that the total momentum
of all the primary mesons in the quark jet, $M(1)$, is unity (i.e., equal to the
original quark momentum). Namely,

$$\int_0^1 zF(z)\,dz = 1. \qquad (2.10)$$


Another important point is that for small $z$, eq. (2.5) implies that $F(z)$ has the
expected $R\,dz/z$ behavior ($R$ = constant). This represents a uniform distribution in
rapidity, $Y_z = -\ln z$. We see from (2.7b) that $M(r)$ diverges as $r \to 0$, since $C(0) = 1$,
as $R/r$ where

$$1/R = -\,dC(r)/dr\,\big|_{r=0}, \qquad (2.11a)$$

or

$$1/R = -\int_0^1 \ln\eta\, f(\eta)\,d\eta. \qquad (2.11b)$$

For small $r$, the integral (2.6a) is dominated by small $z$ if $F(z)$ is $R/z$ and is $R/r$. This
may also be understood as follows. Since rapidity $Y_z$ is $-\ln z$, $1/R$ from (2.11b) is
the mean loss of rapidity, $-\ln\eta$, per primary meson. Thus, deep in the cascade we
must have $R$ primary mesons per unit of rapidity $dY_z = dz/z$. For small $\eta$, in the
plateau, $g(\eta)$ in eq. (2.8c) is given by $g(\eta) = R/\eta$.


2.3. Double-decay distribution $F_2(z_1, z_2)$


Suppose we want the probability of finding two primary mesons, one at $z_1$, the
other at $z_2$, regardless of their rank in hierarchy. First, we define the function
$F_2(z_1, z_2)$ by

$F_2(z_1, z_2)\,dz_1\,dz_2$ = the probability of finding two primary mesons, one
at $z_1$ of any rank and one at $z_2$, but of higher rank
than the one at $z_1$. (2.12)

We can immediately write the integral equation

$$F_2(z_1, z_2) = f(1 - z_1)\,F(z_2/(1 - z_1))/(1 - z_1) + \int f(\eta)\,F_2(z_1/\eta, z_2/\eta)\,\frac{d\eta}{\eta^2}. \qquad (2.13)$$

The first term arises because $z_1$ might be the first-rank primary meson (probability
$f(1 - z_1)\,dz_1$), and $z_2$ any primary meson in the following cascade (of total
momentum $1 - z_1$) and hence with probability $F(z_2/(1 - z_1))\,dz_2/(1 - z_1)$. The
second term arises if $z_1$ is not of rank one. The rank-one primary meson leaves the
others with momentum $\eta$ with probability $f(\eta)\,d\eta$ and then we find $z_1$, $z_2$ with
scaled probability $F_2(z_1/\eta, z_2/\eta)\,dz_1/\eta\;dz_2/\eta$. Eq. (2.13) can be solved by the
techniques discussed in subsect. 2.2, whereupon one gets

$$F_2(z_1, z_2) = f(1 - z_1)\,F(z_2/(1 - z_1))/(1 - z_1) + \int_{z_1+z_2}^{1} g(\eta)\,f(1 - z_1/\eta)\,F(z_2/(\eta - z_1))\,\frac{d\eta}{\eta(\eta - z_1)}. \qquad (2.14)$$

The complete double-fragmentation function, where the meson at $z_2$ may be either
of higher or lower rank than that at $z_1$, is then given by

$$\tilde F_2(z_1, z_2) = F_2(z_1, z_2) + F_2(z_2, z_1). \qquad (2.15)$$

2.4. Choosing the form of $f(\eta)$


A form which makes the solution of the integral equation (2.5) the simplest is a
power of $\eta$,

$$f(\eta) = (d + 1)\,\eta^d. \qquad (2.16)$$

In this case

$$C(r) = (d + 1)/(r + d + 1) \qquad (2.17a)$$

or

$$C(r)/(1 - C(r)) = (d + 1)/r, \qquad (2.17b)$$

so that

$$g(\eta) = (d + 1)/\eta, \qquad (2.18a)$$

and hence

$$zF(z) = (d + 1)(1 - z)^d. \qquad (2.18b)$$

This means that

$$zF(z) = f(1 - z), \qquad (2.18c)$$


which serves as a rather rough approximation for other forms of $f(\eta)$. This power
form for $f(\eta)$ leads to a double-decay function of the form

$$F_2(z_1, z_2)\,dz_1\,dz_2 = (d + 1)\,f(1 - z_1 - z_2)\,\frac{dz_1}{z_1 + z_2}\,\frac{dz_2}{z_2} \qquad (2.19)$$

or disregarding order of rank,

$$\tilde F_2(z_1, z_2) = (d + 1)\,f(1 - z_1 - z_2)/z_1 z_2, \qquad (2.19b)$$


which can also serve as a rough approximation for other choices of $f(\eta)$ *. The
inclusive probability to find $N$ mesons with momenta $z_1, z_2, \ldots, z_N$ is

$$F_N(z_1, z_2, \ldots, z_N)\,dz_1 \cdots dz_N = (d + 1)^N \Big(1 - \sum_{i=1}^{N} z_i\Big)^d \prod_{i=1}^{N} \frac{dz_i}{z_i}. \qquad (2.20)$$

This is the manner in which mesons are distributed in the one-dimensional
field-theory model of Casher, Kogut and Susskind [10]. In fact, it results simply from
the assumption of particles produced independently at random having an a priori
probability $\beta$ to exist and then to be distributed uniformly in two-dimensional
relativistic phase space (i.e., uniformly in rapidity, $dz/z$). The total energy and
momentum must, however, be that of the jet ($\beta = d + 1$).

Although the forms of (2.16) and (2.18b) are simple and easy to interpret, they
do not agree with the assumption we used in FF1, that the probability of finding
mesons in $dz$ approaches a constant as $z \to 1$. Eq. (2.5) has the property that $F(z)$
approaches $f(1 - z)$ as $z$ becomes large. The mesons at large $z$ almost surely contain
the original quark and at $z = 1$, the meson must contain the original quark. Thus in
order for $F(z)$ to approach a constant as $z \to 1$, we must require that $f(\eta)$ approach
a constant as $\eta \to 0$, something that the form (2.16) does not do. A function that
leads to results similar to those found in FF1 is simply (2.16) plus a small constant.
We take

$$f(\eta) = 1 - a + 3a\eta^2, \qquad (2.21)$$

where the parameter $a$ and the power, $d = 2$, are chosen by comparing $F(z)$ to
experiment. (Actually we compare the charged-particle distribution, $D_u^{h^+}(z) + D_u^{h^-}(z)$,
to data.) The form (2.21) gives

$$C(r) = (1 - a)/(r + 1) + 3a/(r + 3), \qquad (2.22a)$$

so that using (2.8c) we find

$$g(\eta) = \big(3/\eta + 4a(1 - a)\,\eta^{2-2a}\big)/(3 - 2a), \qquad (2.22b)$$

and (2.8d) yields

$$zF(z) = \frac{3}{3 - 2a} + \frac{3a\,z^2}{2a - 1} + \frac{2a(2a^2 - 3a - 2)\,z^{3-2a}}{(3 - 2a)(2a - 1)}. \qquad (2.23)$$
* Eq. (2.19) with $f(\eta) = 2\eta$ was used by Bjorken in ref. [8] to estimate the large-$p_\perp$ $\pi^0\pi^0$ cross
section in pp collisions and by Ellis, Jacob and Landshoff in ref. [9] to estimate same side
large-$p_\perp$ correlations in pp collisions.

Fig. 2. (a) (left) The probability of finding the first-rank meson at $z$ (in $dz$), $f(1 - z)$ [dashed], and
the probability of finding any hadron at $z$ (times $z$), $zF(z)$ [solid], given by eqs. (2.21) and (2.23),
respectively, with $a = 0.88$, $\alpha_{ps} = 1$, $\alpha_v = 0$. (b) (right) The functions $\alpha_{ps} f(1 - z)$ and $\alpha_{ps} zF(z)$
given by eqs. (2.21) and (2.23) with $a = 0.77$ together with the distribution of secondary mesons,
$\alpha_v f(1 - z)$ and $\alpha_v zF(z)$, from parents distributed according to (2.21) and (2.23), where
$\alpha_{ps} = \alpha_v = 0.5$ and where we have used the parent-daughter relation in (2.54). Solid: $\alpha_{ps} zF(z)$;
dotted: $\alpha_{ps} f(1 - z)$; dash-dotted: $\alpha_v zF(z)$; dashed: $\alpha_v f(1 - z)$.

As we will see later after discussing flavor, if we interpret all primary mesons as
pseudoscalar mesons with no secondary decays then by comparing $D_u^{h^+}(z) + D_u^{h^-}(z)$
with data and with the results of FF1, we find that $a = 0.88$ gives a good fit. Fig. 2a
shows $zF(z)$ and $f(1 - z)$ for this case. We see that the relationship (2.18c) is still
approximately valid.
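
The closed form (2.23) can be checked numerically against the integral equation (2.5), the momentum sum rule (2.10) and the end-point condition $F(1) = f(0) = 1 - a$. The following is a minimal sketch, assuming $a = 0.88$ as quoted above; the function names and the midpoint-rule step count `n` are illustrative choices, not part of the original analysis.

```python
# Sanity check of eqs. (2.5), (2.10), (2.21) and (2.23).
# Assumption: a = 0.88 (the all-pseudoscalar fit quoted in the text).
A = 0.88


def f(eta, a=A):
    """Eq. (2.21); defined to vanish outside 0 <= eta <= 1."""
    return 1.0 - a + 3.0 * a * eta**2 if 0.0 <= eta <= 1.0 else 0.0


def zF(z, a=A):
    """Closed form (2.23) for z*F(z); singular at a = 0.5 and a = 1.5."""
    c0 = 3.0 / (3.0 - 2.0 * a)
    c1 = 3.0 * a / (2.0 * a - 1.0)
    c2 = 2.0 * a * (2.0 * a**2 - 3.0 * a - 2.0) / ((3.0 - 2.0 * a) * (2.0 * a - 1.0))
    return c0 + c1 * z**2 + c2 * z ** (3.0 - 2.0 * a)


def F(z, a=A):
    return zF(z, a) / z


def rhs_2_5(z, a=A, n=20000):
    """Right-hand side of (2.5), integrating eta from z to 1 by the midpoint rule."""
    h = (1.0 - z) / n
    total = 0.0
    for i in range(n):
        eta = z + (i + 0.5) * h
        total += f(eta, a) * F(z / eta, a) / eta
    return f(1.0 - z, a) + total * h


def total_momentum(a=A, n=20000):
    """Left-hand side of (2.10): integral of z*F(z) over 0 < z < 1."""
    h = 1.0 / n
    return sum(zF((i + 0.5) * h, a) for i in range(n)) * h
```

Comparing `F(z)` with `rhs_2_5(z)` at a few values of $z$ is a quick way to catch a sign or exponent slip when transcribing (2.23).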


2.5. Including flavor 


Next we examine the question of the flavor $u$, $d$, $s$, $\bar u$, $\bar d$, and $\bar s$ of the quarks and
hence the isospin and strangeness of the primary mesons. We call the isospin and
strangeness properties the "flavor" of the primary mesons. For example, a "$u\bar d$"
primary meson has the flavor of a $\pi^+$ or equally well a $\rho^+$. In this section, we will
discuss everything in terms of the pseudoscalars and in the next section include the
possibility of forming vectors or other resonances as well. We assume, as in FF1,
that new $q\bar q$ pairs are $u\bar u$ with probability $\gamma_u$, $d\bar d$ with equal probability $\gamma_d$ and $s\bar s$
with probability $\gamma_s$. From isospin symmetry $\gamma_u = \gamma_d = \gamma$, say, so $\gamma_s = 1 - 2\gamma$. As in
FF1, we suppose that strange $s\bar s$ pairs are half as likely as unstrange $u\bar u$ pairs so that *

$$\gamma = 0.4. \qquad (2.24)$$


Let the primary meson states be defined by quark labels "$a\bar b$", etc., where $a$ is $u$,
$d$, or $s$ and likewise for $b$. A sum on such a label is a sum on $u$, $d$, and $s$. Our
assumptions about flavor mean that if the ranks of successive primary mesons are 1, 2, 3, ...,
then their flavors must be of the type of a chain $a\bar b$ for the first with some $b$, then
$b\bar c$ for the second, $c\bar d$ for the third, etc., as shown in fig. 1. The probability of
finding them with various momenta is still (2.3) but multiplied by the factor $\gamma_b \gamma_c \gamma_d \cdots$
giving the chance that the correct new pairs $b\bar b$, $c\bar c$, $d\bar d$, ... were indeed formed. It is
now easy to modify what we did earlier in subsect. 2.2 for flavor.

For example, for a quark of type $q$, the mean number of primary meson states
of type "$a\bar b$" at $z$ is, in analogy to eq. (2.5),

$$P_q^{a\bar b}(z) = \delta_{qa}\gamma_b\, f(1 - z) + \int f(\eta) \sum_c \gamma_c\, P_c^{a\bar b}(z/\eta)\,\frac{d\eta}{\eta}. \qquad (2.25)$$

The first term arises because the "$a\bar b$" primary meson state might be of first rank
(only if $a = q$, of course, hence the delta function $\delta_{qa}$) with probability $f(1 - z)$
times the chance, $\gamma_b$, that the first new pair is of the required type $b$. The second
term occurs if the "$a\bar b$" primary meson is not of first rank in hierarchy. The first
pair might be $c\bar c$ (with probability $\gamma_c$) and leave a momentum $\eta$ to the cascade of
quark $c$ (with probability $f(\eta)$) in which cascade we find an "$a\bar b$" state with
probability $P_c^{a\bar b}(z/\eta)\,dz/\eta$.

It is seen that the flavor distributions of the 2nd and higher rank primary mesons
are independent of what quark started the cascade because we have assumed the
new pairs are made with flavors independent of the quark flavor that makes the
cascade. Defining a "mean quark" flavor to be $\langle q \rangle$, and equal to $u$, $d$, and $s$ with
probability $\gamma$, $\gamma$, and $(1 - 2\gamma)$, respectively, we write

$$P_{\langle q \rangle}^{a\bar b}(z) = \sum_c \gamma_c\, P_c^{a\bar b}(z). \qquad (2.26)$$

* It is not unreasonable that $s\bar s$ pairs are formed less often than $u\bar u$ and $d\bar d$, for $s$ quarks may have
a larger mass than $u$ or $d$. For example, in a vector force field (e.g. $F = eE$ where $E$ is an
electric field and $e$ is the charge on a particle) constant in space and time, the rate of production
of pairs of particles of rest mass $m$ and transverse momentum $k_\perp$ ($k_\perp$ is a two-dimensional
vector transverse to the direction of the field, the particle has transverse momentum $k_\perp$, the
antiparticle $-k_\perp$) is, according to the Dirac equation,

$$(F/2\pi)\,\exp\!\big(-\pi(m^2 + k_\perp^2)/F\big)\,d^2k_\perp/(2\pi)^2$$

per unit volume per second. Thus it is more difficult to make pairs of mass $m$ by a factor
$\exp(-\pi m^2/F)$.

The quantity $P_{\langle q \rangle}^{a\bar b}$ appears on the right hand side of eq. (2.25) so multiplying by $\gamma_q$
and summing on $q$ leaves

$$P_{\langle q \rangle}^{a\bar b}(z) = \gamma_a\gamma_b\, f(1 - z) + \int f(\eta)\, P_{\langle q \rangle}^{a\bar b}(z/\eta)\,\frac{d\eta}{\eta}, \qquad (2.27)$$

which, after comparing with (2.5), yields

$$P_{\langle q \rangle}^{a\bar b}(z) = \gamma_a\gamma_b\, F(z). \qquad (2.28)$$


Thus if we know the distribution of primary mesons $F(z)$ from quarks disregarding
flavor, then those of flavor "$a\bar b$" occur in the "mean quark" cascade with
probabilities, $\gamma_a\gamma_b$, that the first and the second quark of the pair "$a\bar b$" have the
appropriate flavors. Only the contribution of the first-rank quark differs from the average.
Substituting back into eq. (2.25) and integrating, one finds in general that

$$P_q^{a\bar b}(z) = \delta_{qa}\gamma_b\, f(1 - z) + \gamma_a\gamma_b\, \bar F(z), \qquad (2.29a)$$

where

$$\bar F(z) = F(z) - f(1 - z) \qquad (2.29b)$$

is the probability of finding a primary meson at $z$ of rank higher than one, and $f(1 - z)$
and $F(z)$ are the functions given in (2.21) and (2.23).

To be more explicit, the distributions of primary meson states of flavor $h$ from a
quark $q$ are given by

$$D_q^h(z) = A_q^h\, f(1 - z) + B^h\, \bar F(z), \qquad (2.30)$$

where $A_q^h$ and $B^h = \sum_q \gamma_q A_q^h$ have the values given in table 1. For example, the
quark content of an eta meson is given by $\eta = S\cos\theta_M + N\sin\theta_M$, where
$N = \tfrac{1}{\sqrt 2}(u\bar u + d\bar d)$ and $S = s\bar s$. Thus, the quantity $A_u^\eta$ is given by
$\tfrac12\gamma\sin^2\theta_M$, where $\gamma$ is the probability of producing a $\bar u$ quark ($\bar u u$ pair) to
combine with the initial $u$ and $\tfrac12\sin^2\theta_M$ is the chance that this $u\bar u$ pair actually
forms the desired $\eta$.

For now we shall assume all primary mesons are pseudoscalar mesons with no
decays. To compare with FF1, we find from table 1 with $\gamma = 0.4$

$$D(z) = D_u^{\pi^+}(z) + D_u^{\pi^-}(z) = 0.32\,\bar F(z) + 0.40\, f(1 - z), \qquad (2.31a)$$

$$K_u(z) = D_u^{K^+}(z) + D_u^{K^-}(z) = 0.16\,\bar F(z) + 0.20\, f(1 - z), \qquad (2.31b)$$

$$K_s(z) = D_s^{K^+}(z) + D_s^{K^-}(z) = 0.16\,\bar F(z) + 0.40\, f(1 - z); \qquad (2.31c)$$

in addition we have

$$D_u^{\pi^+}(z) - D_u^{\pi^-}(z) = 0.4\, f(1 - z), \qquad (2.32a)$$

$$D_u^{K^+}(z) - D_u^{K^-}(z) = 0.20\, f(1 - z), \qquad (2.32b)$$


Table 1

Values of the constants $A_q^h$ and $B^h$, where $q$ is the quark flavor and $h$ is the primary meson flavor

Meson $h$                   $A_u^h$                          $A_d^h$                          $A_s^h$                       $B^h$
$\pi^+(\rho^+)$             $\gamma$                         0                                0                             $\gamma^2$
$\pi^0(\rho^0)$             $\tfrac12\gamma$                 $\tfrac12\gamma$                 0                             $\gamma^2$
$\pi^-(\rho^-)$             0                                $\gamma$                         0                             $\gamma^2$
$K^+(K^{*+})$               $(1-2\gamma)$                    0                                0                             $\gamma(1-2\gamma)$
$K^0(K^{*0})$               0                                $(1-2\gamma)$                    0                             $\gamma(1-2\gamma)$
$K^-(K^{*-})$               0                                0                                $\gamma$                      $\gamma(1-2\gamma)$
$\bar K^0(\bar K^{*0})$     0                                0                                $\gamma$                      $\gamma(1-2\gamma)$
$\eta(\omega)$              $\tfrac12\gamma\sin^2\theta_M$   $\tfrac12\gamma\sin^2\theta_M$   $(1-2\gamma)\cos^2\theta_M$   $\gamma^2\sin^2\theta_M + (1-2\gamma)^2\cos^2\theta_M$
$\eta'(\phi)$               $\tfrac12\gamma\cos^2\theta_M$   $\tfrac12\gamma\cos^2\theta_M$   $(1-2\gamma)\sin^2\theta_M$   $\gamma^2\cos^2\theta_M + (1-2\gamma)^2\sin^2\theta_M$

The quantity $\theta_M$ is the mixing angle defined by $\eta(\omega) = S\cos\theta_M + N\sin\theta_M$, $\eta'(\phi) = -S\sin\theta_M + N\cos\theta_M$, where $S = s\bar s$ and $N = \tfrac{1}{\sqrt 2}(u\bar u + d\bar d)$
and where we use $\theta_M = \theta_{ps} = 45°$ and $\theta_M = \theta_v = 90°$ for the pseudoscalar and vector nonets, respectively.

$$D_s^{K^-}(z) - D_s^{K^+}(z) = 0.4\, f(1 - z), \qquad (2.32c)$$

$$D_u^{K^+}(z) = 0.5\, D_u^{\pi^+}(z), \qquad (2.32d)$$

$$D_u^{K^-}(z) = 0.5\, D_u^{\pi^-}(z). \qquad (2.32e)$$

These last two equations imply

$$D_u^{K^+}(z)/D_u^{K^-}(z) = D_u^{\pi^+}(z)/D_u^{\pi^-}(z), \qquad (2.33)$$

which is what we assumed in eq. (3.7) of FF1. In addition, we have

$$D_u^{\pi^-}(z)/D_u^{\pi^+}(z) = \frac{\gamma^2 \bar F(z)}{\gamma f(1 - z) + \gamma^2 \bar F(z)}, \qquad (2.34)$$

which approaches zero as $z$ becomes large. This was another of our assumptions in
FF1 and here it is a consequence of the property that at $z = 1$, the primary meson
must contain the original quark (i.e., $F(z) = f(1 - z)$ at $z = 1$ so $\bar F(z) \to 0$).

For this case, where all the primary mesons are pseudoscalars with no subsequent
decay, the best choice for parameter $a$ in (2.21) and (2.23) is $a = 0.88$. This is
determined in fig. 3 by requiring that the model (dashed curve) agree with the lepton
data on the charged-particle distribution $D_u^{h^+}(z) + D_u^{h^-}(z)$ and with the earlier FF1
choice (solid curve). The value $a$ is chosen so that $D_u^{h^+}(z = 1) + D_u^{h^-}(z = 1) =
0.6(1 - a)$ agrees with the FF1 curve at $z = 1$.

The fraction of the total momentum carried by each type of primary meson is
given by

$$\int z\, P_q^{a\bar b}(z)\,dz = \delta_{qa}\gamma_b\, \bar z + \gamma_a\gamma_b\,(1 - \bar z), \qquad (2.35)$$


where

$$\bar z = \int z\, f(1 - z)\,dz \qquad (2.36a)$$

is the mean momentum of the first hierarchy meson and is given by

$$\bar z = \tfrac12 - \tfrac14 a, \qquad (2.36b)$$

for $f(\eta)$ as in (2.21). The momentum carried by the various hadrons with $a = 0.88$
is given in table 2. In FF1 we did not include the $\eta$ and $\eta'$ explicitly and, as can be
seen from table 2, if we allow the $\eta$ and $\eta'$ to decay then the model reproduces the
results of FF1 quite closely.
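
As an illustration, the direct pion entries of table 2 follow from (2.35) and (2.36b) with $\gamma = 0.4$ and $a = 0.88$. The sketch below is not from the original paper; the function names are hypothetical, and the $\pi^0$ is assumed to take one half of the $u\bar u$ and one half of the $d\bar d$ states, as in table 1.

```python
# Direct (before decay) momentum fractions from eqs. (2.35) and (2.36b).
GAMMA = 0.4                       # eq. (2.24)
A = 0.88                          # parameter a of eq. (2.21), all-pseudoscalar case
PROB = {"u": GAMMA, "d": GAMMA, "s": 1.0 - 2.0 * GAMMA}
Z_BAR = 0.5 - 0.25 * A            # eq. (2.36b): mean momentum of the first-rank meson


def momentum_fraction(q, a, b):
    """Eq. (2.35): momentum fraction carried by 'a b-bar' states in a jet from quark q."""
    first_rank = PROB[b] * Z_BAR if a == q else 0.0
    return first_rank + PROB[a] * PROB[b] * (1.0 - Z_BAR)


def direct_pions(q):
    """Return the (pi+, pi0, pi-) direct momentum fractions for a jet from quark q."""
    pi_plus = momentum_fraction(q, "u", "d")
    pi_minus = momentum_fraction(q, "d", "u")
    # Assumption (table 1): the pi0 is half of the u u-bar plus half of the d d-bar states.
    pi_zero = 0.5 * (momentum_fraction(q, "u", "u") + momentum_fraction(q, "d", "d"))
    return pi_plus, pi_zero, pi_minus
```

For a u-quark jet this gives approximately 0.23, 0.17 and 0.12 for $\pi^+$, $\pi^0$ and $\pi^-$, and summing `momentum_fraction` over all nine flavor pairs returns the full jet momentum.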

Fig. 3. Comparison of the charged particle distribution $D_u^{h^+}(z) + D_u^{h^-}(z)$ from the quark-jet
model with $a = 0.88$, $\alpha_{ps} = 1$, $\alpha_v = 0$ (dashed curve) and with $a = 0.77$, $\alpha_{ps} = \alpha_v = 0.5$ (dotted
curve) with the lepton data from fig. 6 of FF1 and with the distribution used in FF1 (solid line).
(There is an error in the captions to fig. 6 and fig. 7a in FF1. The ep data from Dakin et al. has
been averaged over the two $Q^2$ bins $1.0 < Q^2 < 2.0$ GeV$^2$ and $2.0 < Q^2 < 3.0$ GeV$^2$ with
$12 < s < 30$ GeV.) ○ $zN(e^+e^- \to h^\pm)$; □ $zN(ep \to h^\pm)$; + $zN(\nu p \to h^\pm)$.

Table 2

Fraction of the total momentum of a u-, d- and s-quark carried by the primary mesons (before
decay, called direct) resulting from our jet model with $a = 0.88$, $\alpha_{ps} = 1.0$, and $\alpha_v = 0.0$. In
addition, the results are shown for the total mesons (direct + indirect) after the $\eta$ and $\eta'$ are allowed
to decay. For comparison, the values from table 2 of FF1 are also given (see also table 6).

                                      a = 0.88, alpha_ps = 1 results    FF1 results
                                      u       d       s                 u       d       s
direct              $\pi^+$           0.23    0.12    0.12
                    $\pi^0$           0.17    0.17    0.12
                    $\pi^-$           0.12    0.23    0.12
                    total $\pi$       0.51    0.51    0.35
                    $\eta + \eta'$    0.20    0.20    0.20
direct + indirect   $\pi^+$           0.26    0.15    0.15              0.27    0.15    0.15
                    $\pi^0$           0.24    0.24    0.18              0.21    0.21    0.15
                    $\pi^-$           0.15    0.26    0.15              0.15    0.27    0.15
                    total $\pi$       0.65    0.65    0.48              0.63    0.63    0.45
                    $K^+$             0.11    0.06    0.06              0.13    0.08    0.08
                    $K^0$             0.06    0.11    0.06              0.08    0.13    0.08
                    $K^-$             0.06    0.06    0.17              0.08    0.08    0.19
                    $\bar K^0$        0.06    0.06    0.17              0.08    0.08    0.19
                    total $K$         0.29    0.29    0.45              0.37    0.37    0.54
                    $\gamma$          0.06    0.06    0.06

It is of interest to ask how the charges of the primary mesons are distributed
along the direction of the initial quark. Suppose we have some quantity such as
charge $Q$ (or third component of isospin $I_3$, or hypercharge $Y$) that can be
calculated for a primary meson state additively from the quark content of that meson
such that each quark flavor, "a", carries $e_a$ and each antiquark $\bar a$ contributes $-e_a$.
Then if we weigh each primary meson, $a\bar b$, by its charge $e_a - e_b$, we obtain the net
mean charge distribution of a quark jet. Namely,

$$\langle Q_q(z) \rangle = \sum_{a,b} (e_a - e_b)\, P_q^{a\bar b}(z) = (e_q - e_{\langle q \rangle})\, f(1 - z), \qquad (2.37)$$

from (2.29), where

$$e_{\langle q \rangle} = \sum_a \gamma_a e_a \qquad (2.38)$$

is the charge of a "mean" quark. Thus the hadrons in each jet carry a total mean
charge, $e_q - e_{\langle q \rangle}$, equal to the charge on the quark producing the jet plus a constant
error, $-e_{\langle q \rangle}$, proportional to the deviation of the probability of production of new
pairs from the SU(3) value of $\tfrac13$. For the case of electric charge, we have $e_{\langle q \rangle} = \gamma - \tfrac13$
and thus the mean total charge of a quark jet is

$$\langle Q_u \rangle = 1 - \gamma = 0.60, \qquad (2.39)$$

$$\langle Q_d \rangle = -\gamma = -0.40, \qquad (2.40)$$

$$\langle Q_s \rangle = -\gamma = -0.40, \qquad (2.41)$$

which is in close agreement with the empirically fitted values of 0.59, $-0.40$, and
$-0.39$, respectively, found in FF1. The mean total $I_3$ is just the $I_3$ of the quark,
independent of $\gamma$ since $I_3$ of the mean quark is zero ($u$ and $d$ coming with equal
weight). Hypercharge, $Y$, is $2(Q - I_3)$, so its mean for u- and d-quark jets is 0.2 and
for s-quark jets is $-0.8$.
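
The arithmetic behind (2.39) to (2.41) and the quoted hypercharge values can be reproduced exactly with rational numbers. This is a small illustrative sketch, assuming the standard quark charges and isospin assignments; the helper names are hypothetical.

```python
from fractions import Fraction

GAMMA = Fraction(2, 5)                                   # eq. (2.24)
PROB = {"u": GAMMA, "d": GAMMA, "s": 1 - 2 * GAMMA}      # new-pair flavor probabilities
CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3)}
ISOSPIN3 = {"u": Fraction(1, 2), "d": Fraction(-1, 2), "s": Fraction(0)}


def mean_quark(value):
    """Eq. (2.38): value of an additive quantum number for the 'mean' quark."""
    return sum(PROB[q] * value[q] for q in PROB)


def jet_mean(q, value):
    """Integral of eq. (2.37) over z: total mean value carried by a jet from quark q."""
    return value[q] - mean_quark(value)


def jet_hypercharge(q):
    """Mean hypercharge Y = 2(Q - I3) of a jet from quark q."""
    return 2 * (jet_mean(q, CHARGE) - jet_mean(q, ISOSPIN3))
```

With these inputs the mean-quark charge is $\gamma - \tfrac13 = \tfrac{1}{15}$, and `jet_mean("u", CHARGE)` evaluates to $\tfrac35$.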

The charge on the jet of a "mean" quark is $\sum_q \gamma_q (e_q - e_{\langle q \rangle}) = 0$ so that the
second and higher hierarchy primary mesons carry no charge on the average. Thus,
were it not for complications from secondary disintegrations, which we discuss
later, the experimental determination of the net mean charge, $\sum_h e_h D_q^h(z)$, where
$e_h$ is the charge of the hadron $h$, would give a direct measure of the distribution of
the first-rank primary meson, $f(1 - z)$.

At first sight one might wonder why the total average charge of the hadrons is
not that of the original quark, because after all, the new quark pairs $q\bar q$ are each
neutral. However, as pointed out by Rosner and Farrar [11] and by Cahn and
Colglazier [12], for a jet starting from a quark, each new pair has its antiquark going
into a hadron of one lower rank in the hierarchy than does its quark. The new
antiquark has, on the average, the higher momentum. Thus, when we consider all the
hadrons higher than some very small momentum $z_0$ (small enough to ensure that
the mean number of positive and negative hadrons in $dz$ are practically the same),
we are counting more antiquarks than quarks. This makes no difference for isospin,
since we have $\bar u$ and $\bar d$ antiquarks equally likely, but the average electric charge of
antiquarks is not zero but is given by $e_{\langle \bar q \rangle} = (-\tfrac23 + \tfrac13)\gamma + \tfrac13(1 - 2\gamma)$. The $\bar s$ quarks
are not in their full number, $\tfrac13$, needed to cancel the average $\bar u + \bar d$ charge. Since, in
counting charge, only mesons (we leave out the complication and slight numerical
modification due to baryon production) are counted, the total number of quarks
and antiquarks counted must be exactly the same. Therefore, we must succeed in
counting an excess of exactly one antiquark over quarks from the new pairs, to
compensate the extra quark from which the jet started. Thus, the charges of all
hadrons above $z_0$ exceed that of the original quark $e_q$ by the charge of one mean
antiquark $e_{\langle \bar q \rangle} = -e_{\langle q \rangle}$ (about $-\tfrac{1}{15}$ charge units). For this reason, the net charge of
a "mean" quark jet is zero. This last result can also be seen in the following way.
In a "mean" quark jet, all the mesons are formed out of pairs of "mean" quarks
and "mean" antiquarks. Any tendency of the antiquark of one new pair to combine
with a quark of lower rank to make a meson has no effect, for that lower-rank quark
is also a "mean" quark with the same probability of flavors as the antiquark. In the
general case, charge imbalance can be seen, in the mean, only in the hadron
containing the original quark as indicated by (2.37).

We now turn to the problem of generalizing the double-decay function in (2.12)
to include the effects of correlations in flavor. We define

$P_q^{a\bar b, c\bar d}(z_1, z_2)\,dz_1\,dz_2$ = probability of finding a primary meson of flavor
"$a\bar b$" at $z_1$ and one of flavor "$c\bar d$" at $z_2$ in a jet
originated by a quark of flavor $q$ when the primary
meson at $z_2$ has a larger rank in hierarchy than the
one at $z_1$. (2.42)

This probability is the sum of the following four pieces.

(i) The probability that the primary meson at $z_1$ is of rank 1 and the primary
meson at $z_2$ is of rank 2; given by

$$\delta_{qa}\gamma_b\,\delta_{bc}\gamma_d\; f(1 - z_1)\,dz_1\; f(1 - z_2/(1 - z_1))\,\frac{dz_2}{1 - z_1}. \qquad (2.43a)$$

For this term $q = a$ and $b$ is arbitrary with probability $\gamma_b$, but then $c = b$ and $d$
comes with probability $\gamma_d$. The probability that $z_1$ is rank-1 is $f(1 - z_1)\,dz_1$, and
then since $1 - z_1$ momentum is left, the probability that $z_2$ is second-rank (i.e.,
rank-1 in the remaining hierarchy) is $f(1 - z_2/(1 - z_1))\,dz_2/(1 - z_1)$.

(ii) The probability that $z_1$ is of rank one, but $z_2$ is of rank higher than 2:

$$\delta_{qa}\gamma_b\gamma_c\gamma_d\; f(1 - z_1)\,dz_1\; \bar F(z_2/(1 - z_1))\,\frac{dz_2}{1 - z_1}. \qquad (2.43b)$$

For this case, the chance to get $c\bar d$ is now independent of $q$, $a$, and $b$ and is $\gamma_c\gamma_d$.

(iii) The probability that the primary meson at $z_1$ is not first in rank but higher,
but the primary meson at $z_2$ is directly of next rank to the one at $z_1$:

$$\gamma_a\gamma_b\,\delta_{bc}\gamma_d \int_{z_1+z_2}^{1} g(\eta)\,d\eta\; f(1 - z_1/\eta)\,\frac{dz_1}{\eta}\; f(1 - z_2/(\eta - z_1))\,\frac{dz_2}{\eta - z_1}, \qquad (2.43c)$$

where the early primary mesons leave momentum $\eta$ (with probability $g(\eta)\,d\eta$) and
the next two primary mesons come as rank one and two. Here the flavor of $q$ has
no effect, $a$ and $b$ come with probability $\gamma_a\gamma_b$, but $c$ must equal $b$. We must, of
course, integrate over all remaining momenta $\eta$.

(iv) Neither the primary meson at $z_1$ nor $z_2$ is first in rank, nor are they adjacent:

$$\gamma_a\gamma_b\gamma_c\gamma_d \int_{z_1+z_2}^{1} g(\eta)\,d\eta\; f(1 - z_1/\eta)\,\frac{dz_1}{\eta}\; \bar F(z_2/(\eta - z_1))\,\frac{dz_2}{\eta - z_1}. \qquad (2.43d)$$

The complete double-fragmentation function for producing two hadrons of flavor
$h_1 = a\bar b$ and $h_2 = c\bar d$ is given by symmetrizing (2.43a-d) with respect to $z_1$ and $z_2$.
Namely,

$$\tilde P_q^{a\bar b, c\bar d}(z_1, z_2) = P_q^{a\bar b, c\bar d}(z_1, z_2) + P_q^{c\bar d, a\bar b}(z_2, z_1). \qquad (2.44)$$

2.6. The production of resonances


Finally, we must decide in what nonet a primary meson is formed. We suppose
that the pair of quarks $q\bar q$ is in the low pseudoscalar $0^-$ configuration with the
probability $\alpha_{ps}$, that it is a vector meson $1^-$ with probability $\alpha_v$, a tensor meson $2^+$
with probability $\alpha_t$, etc. Then these objects are allowed to disintegrate as we know
they should from the particle tables.

We suppose that the choices of the probabilities $\alpha$ are dynamic and do not depend
on the flavor. Although attempts have been made to determine from hadron
experiments the production rate of, say, $\rho^0$ compared to $\pi^0$ at large $p_\perp$, the experiments
are difficult to interpret. Indications are that $\rho^0$ is roughly as likely as $\pi^0$ at high $p_\perp$
(see references in table 16) and that higher resonances are less likely [13]. For
definiteness for the present, we shall choose

$$\alpha_{ps} = \alpha_v = 0.5, \qquad (2.45)$$

with no higher resonances * ($\alpha_t = 0$, etc.), which yields $D_q^{\rho^0}(z)/D_q^{\pi^0}(z) \to 1$ as $z \to 1$.
Future experiments may, however, indicate a better choice.


2.7. The Monte Carlo method 


2.7.1. Recursive scheme 

The method of Monte Carlo generation of a complete quark cascade is clear.
Suppose one starts with a quark of flavor $q$ and momentum $W_0$. Then

(i) One generates a value of $\eta_1 = 1 - z_1$ at random with probability given by
$f(\eta)$ as in eq. (2.21).

(ii) One generates a quark pair $u\bar u$, $d\bar d$, or $s\bar s$ with probability $\gamma$, $\gamma$, and $(1 - 2\gamma)$
(i.e., 0.4, 0.4, and 0.2), respectively. The first primary meson state is then of type
$q\bar u$, $q\bar d$, or $q\bar s$ depending on the pair chosen.

(iii) One decides on the spin-parity of the primary meson, according to (2.45)
(i.e., pseudoscalar or vector with equal probabilities).

The first primary meson is now of momentum $(1 - \eta_1)W_0$ and of type $q\bar u$, $q\bar d$
or $q\bar s$ depending on the choice made in (ii) and of spin depending on the choice
made in (iii). This leaves, for the next step, a quark of type $q_2 = u$, $d$, or $s$
(depending on the first quark pair chosen) with momentum $W_1 = \eta_1 W_0$. The cycle beginning
with (i)-(iii) is then repeated for this quark $q_2$ with momentum $W_1$. Another $\eta$
value is found and a new quark pair produced. This procedure is then repeated over
and over until a desired point, to be discussed later (sect. 3 below), is reached.
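
The cycle (i)-(iii) can be sketched in a few lines of code. The following is a minimal illustration, not the authors' program: it samples $\eta$ from (2.21) by treating $f(\eta)$ as a mixture of a flat piece (weight $1 - a$) and a $3\eta^2$ piece (weight $a$), and it stops when the remaining momentum falls below a cut `w_min`, which is purely an assumption made here so that the loop terminates. The value $a = 0.77$ is the one quoted in the figure captions for $\alpha_{ps} = \alpha_v = 0.5$; transverse momentum and decays are omitted.

```python
import random

GAMMA = 0.4      # probability of a u u-bar (or d d-bar) pair, eq. (2.24)
ALPHA_PS = 0.5   # probability that a primary meson is pseudoscalar, eq. (2.45)
A = 0.77         # parameter a of eq. (2.21), as quoted for alpha_ps = alpha_v = 0.5


def sample_eta(rng, a=A):
    """Draw eta from f(eta) = (1 - a) + 3*a*eta**2, written as a two-component mixture."""
    if rng.random() < 1.0 - a:
        return rng.random()                 # flat component, weight 1 - a
    return rng.random() ** (1.0 / 3.0)      # density 3*eta**2, weight a


def sample_pair_flavor(rng, gamma=GAMMA):
    """Flavor of the new quark-antiquark pair: u, d, s with gamma, gamma, 1 - 2*gamma."""
    x = rng.random()
    if x < gamma:
        return "u"
    if x < 2.0 * gamma:
        return "d"
    return "s"


def generate_primary_mesons(quark, w0=1.0, w_min=1e-3, rng=None):
    """Return (mesons, leftover momentum) for a jet started by `quark`.

    w_min is a hypothetical cutoff used only to end the loop in this sketch.
    """
    rng = rng or random.Random()
    mesons, w = [], w0
    while w > w_min:
        eta = sample_eta(rng)                               # step (i)
        new = sample_pair_flavor(rng)                       # step (ii)
        spin = "ps" if rng.random() < ALPHA_PS else "v"     # step (iii)
        mesons.append({"rank": len(mesons) + 1, "quark": quark, "antiquark": new,
                       "spin": spin, "xi": (1.0 - eta) * w})
        quark, w = new, eta * w                             # remaining quark and momentum
    return mesons, w
```

By construction the meson momenta plus the leftover add up to $W_0$, and the antiquark flavor of each meson matches the quark flavor of the next, which is the chain structure of fig. 1.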

Finally, we add transverse momentum according to subsect. 2.7.2 and then let
the vector mesons decay, each with its known characteristic kinematics and
branching ratios, as given in the particle tables. In addition, we allow the $\eta$ and $\eta'$ mesons
to decay.

As discussed earlier, in choosing the spin-parity of the primary mesons (iii), we
assume that $\alpha_{ps}$ and $\alpha_v$ are independent of flavor. For example, a $u\bar u$ primary meson
state is a $\pi^0$ with probability $\tfrac12\alpha_{ps}$, an $\eta$ with probability $\tfrac12\alpha_{ps}\sin^2\theta_{ps}$, an $\eta'$ with
probability $\tfrac12\alpha_{ps}\cos^2\theta_{ps}$, a $\rho^0$ with probability $\tfrac12\alpha_v$, an $\omega$ with probability
$\tfrac12\alpha_v\sin^2\theta_v$ and a $\phi$ with probability $\tfrac12\alpha_v\cos^2\theta_v$, in agreement with table 1. The
pseudoscalar and vector mixing angles $\theta_{ps}$ and $\theta_v$ are chosen to be 45° and 90°,
respectively.


2.7.2. Including transverse momentum 

The hadrons arising from the cascade or fragmentation of a quark do not travel
in precisely the same direction as the initiating quark (i.e., $z$ direction). We expect
that the transverse momentum of these hadrons remains limited as the quark's
momentum becomes large; however, we really have no knowledge as to the precise
manner in which the various hadrons share the transverse momentum. One can
imagine several ways to distribute transverse momentum among the primary mesons
in our jets. We will choose a particular way, not because we are sure it is nature's
way, but for definiteness. Very few of the results in this paper depend on this
particular choice.

* The theoretical idea that the quark spins combine randomly would recommend, rather, that
$\alpha_v = 3\alpha_{ps}$.

We will incorporate transverse momentum into our model by assuming that the
quark-antiquark pairs $q_i\bar q_i$ which are produced to discharge the color field conserve
transverse momentum in a pairwise fashion and have no net transverse momentum.
The quark $q_i$ in the $i$th pair is assigned a transverse momentum $q_{\perp i}$ and the antiquark
the balancing momentum $-q_{\perp i}$. The $q_{\perp i}$ are distributed according to the Gaussian
distribution

$$\exp(-q_{\perp i}^2/2\sigma_q^2)\,d^2q_{\perp i}. \qquad (2.46)$$

Except for the first-rank primary meson, all primary mesons are given a transverse
momentum that is the vector sum of the transverse momenta of the two quarks
which form it. The first-rank primary meson is assigned a transverse momentum
given by

$$k_\perp(1) = q_{\perp 1} - q_{\perp 0}, \qquad (2.47a)$$

the second-rank primary meson

$$k_\perp(2) = q_{\perp 2} - q_{\perp 1}, \qquad (2.47b)$$

and the $r$th-rank primary meson

$$k_\perp(r) = q_{\perp r} - q_{\perp r-1}. \qquad (2.47c)$$

The initial $q_{\perp 0}$ is also generated according to (2.46). This is to ensure that the first
primary meson has the same mean square transverse momentum as the others.
The net result is to produce a cascade of primary mesons all of which have the
same distribution of transverse momentum at fixed $z$, a Gaussian with a mean
$\langle k_\perp^2 \rangle = 2\sigma^2$ and

$$\langle k_\perp \rangle_{\text{primary mesons}} = \sqrt{\tfrac12\pi}\;\sigma, \qquad (2.48a)$$

with

$$\sigma = \sqrt2\,\sigma_q, \qquad (2.48b)$$

since two quarks contribute to each primary meson. This method introduces a
correlation between primary mesons of adjacent rank, so that they tend to go
oppositely, the mean of $(k_{\perp 1} \cdot k_{\perp 2})$ is $-\sigma^2$. There is no correlation between primary
mesons whose rank is not adjacent. The physical ideas are reasonable enough, but
the particular choice of $\langle k_{\perp 1} \cdot k_{\perp 2} \rangle = -\sigma^2$ is a pure guess. It is made so that the
total perpendicular momentum of the jet $K_{\text{jet}} = k_{\perp 1} + k_{\perp 2} + \cdots + k_{\perp N}$ has a mean
square which does not rise with $N$.
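
The moments quoted above can be checked by direct sampling. The sketch below is illustrative only: it assumes $\sigma = 350$ MeV, draws each Cartesian component of $q_{\perp i}$ from a normal distribution of width $\sigma_q = \sigma/\sqrt2$ (which is what the two-dimensional Gaussian (2.46) implies), and builds the meson momenta from (2.47). Function names, the sample size and the seed are arbitrary choices.

```python
import math
import random

SIGMA = 0.350                       # GeV; width assumed for the primary mesons
SIGMA_Q = SIGMA / math.sqrt(2.0)    # eq. (2.48b): per-quark width


def sample_quark_kt(rng):
    """Draw one quark transverse momentum (x, y) from the 2D Gaussian of eq. (2.46)."""
    return (rng.gauss(0.0, SIGMA_Q), rng.gauss(0.0, SIGMA_Q))


def meson_kts(n_mesons, rng):
    """Eq. (2.47): k_perp(r) = q_perp(r) - q_perp(r-1), with q_perp(0) the initial quark."""
    q = [sample_quark_kt(rng) for _ in range(n_mesons + 1)]
    return [(q[r][0] - q[r - 1][0], q[r][1] - q[r - 1][1]) for r in range(1, n_mesons + 1)]


def estimate_moments(n_jets=100000, seed=0):
    """Return estimates of <k^2>, <|k|> and <k1.k2> for adjacent-rank mesons."""
    rng = random.Random(seed)
    sum_k2 = sum_k = sum_dot = 0.0
    for _ in range(n_jets):
        k1, k2 = meson_kts(2, rng)
        sum_k2 += k1[0] ** 2 + k1[1] ** 2
        sum_k += math.hypot(k1[0], k1[1])
        sum_dot += k1[0] * k2[0] + k1[1] * k2[1]
    return sum_k2 / n_jets, sum_k / n_jets, sum_dot / n_jets
```

The three estimates should approach $2\sigma^2$, $\sqrt{\pi/2}\,\sigma$ and $-\sigma^2$, and the sum of the meson momenta in a jet telescopes to $q_{\perp N} - q_{\perp 0}$, which is why its mean square does not grow with $N$.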

In general, we could also contemplate another parameter giving the center of
mass of the new pairs a Gaussian distribution. This has the effect of decreasing the
negative correlation coefficient $\langle k_{\perp 1} \cdot k_{\perp 2} \rangle / \langle k_\perp^2 \rangle$ between adjacent-rank
primary mesons from the value $-\tfrac12$ to any value minus or plus up to $+\tfrac12$ depending
on parameters. In this case, extra rules would have to be added to keep the sum
$K_{\text{jet}}$ within bounds, and rather than risk complications, we do not do this.

Our main aim for adding transverse momentum is to study the differences
between the true rapidity $Y$ and $Y_z = -\ln z$ and to investigate the effect of
resonance decays on the distributions of transverse momentum. In addition, it will be
interesting to see if there is any experimental indication that primary mesons
adjacent in rank have negative transverse momentum correlations. Unfortunately, as
we will discover in sect. 5, although the correlation is close in hierarchy, it is spread
over a wide range in rapidity due to the mixing of rank in hierarchy and order in
rapidity, and is further obscured by correlations between mesons coming from the
decay of the same primary.

The value of the deviation $\sigma$ in (2.48a) is chosen so that the finally observed
charged pions (those produced directly as primary mesons plus those resulting from
secondary decays of primary vector mesons) have a mean $k_\perp$ of

$$\langle k_\perp \rangle_{\pi} = 323\ \text{MeV}. \qquad (2.49a)$$

This requires a choice of

$$\sigma = 350\ \text{MeV} \qquad (2.49b)$$

and results in

$$\langle k_\perp \rangle_{\text{primary mesons}} = 439\ \text{MeV}. \qquad (2.49c)$$

The difference between the mean transverse momentum of the primary mesons and
the resulting observed pions is due to resonance decay and the significance of this
will be discussed in detail in sect. 5.


2.7.3. Finite-momentum jets 

When discussing the Monte Carlo method of generating jets, (i)-(iii) in subsect.
2.7.1, we avoided mentioning at what point one terminates the iterations. The
procedure, as outlined, would produce a jet of infinite momentum, an infinite number
of particles, and an infinitely long rapidity plateau. For a jet of very large momentum
$P$ in the $z$ direction, there is no problem. One merely continues to produce particles
until all the available momentum $P$ is used up. For large $P$, for any $z$ not too small,
$p_z = zP$ is large enough that questions such as the difference between energy
$E = \sqrt{p_z^2 + m^2 + p_\perp^2}$ and $p_z$ are not important. But what shall we do for real jets of
finite momentum $P$? Even if $P$ is a few GeV so that we can take it equal to the
energy, for smaller $z$ (say, $z < 0.2$), one has ambiguities. In the plateau, we know how
to resolve such ambiguities. The $dz/z = dp_z/p_z$ behavior in the plateau comes from
relativistic phase-space factors multiplying matrix elements which are slowly varying
so that the $dz/z$ should be replaced by $dp_z/E$. But this is exactly $d(E + p_z)/(E + p_z)$.
We shall thus resolve all ambiguities by interpreting all variables $\xi$, $W$, etc., to refer
not to momentum but rather to the quantity $E + p_z = p_z + \sqrt{p_z^2 + m^2 + p_\perp^2}$. The
variable $z$ in $F(z)$ and $f(1 - z)$ in eqs. (2.23) and (2.21) with $\eta = 1 - z$ and similarly
in all our equations refers to

$$z = (E + p_z)/(E_0 + p_{z_0}), \qquad (2.50)$$

where $E_0 + p_{z_0}$ is for the initial quark. Since the initial quark will always have fairly
large momenta, we can replace $E_0 + p_{z_0}$ by $2P_0$, where $P_0$ is the initial quark
momentum. Eq. (2.23) should be reasonable as long as $p_z$ is not too large negative.

For energies large enough that a plateau region is developed, experience with
inelastic hadron collisions shows us that scaling and limiting fragmentation hold
approximately. This implies that a plot of hadron distributions versus rapidity

$Y = \tfrac{1}{2}\ln[(E + p_z)/(E - p_z)] = \ln(E + p_z) - \ln m_\perp$ , (2.51a)

where

$m_\perp^2 = m^2 + p_\perp^2$ , (2.51b)

has a simple property. The distribution of particles moving to the right in, say, the
c.m.s., is the same for different energy collisions if only the origin of the rapidity
Y axis is shifted by $Y_0$, the rapidity of the original right-moving beam. This is
equivalent to the principle that the distribution of particles moving near one end and into
the plateau, but not near the other end of the distribution, for one energy can be
obtained from that of another energy simply by a Lorentz transformation in the z
direction. This is because such a Lorentz transformation by a velocity v simply
multiplies all $E + p_z$ by a common factor, $\sqrt{(1 - v)/(1 + v)}$, and does not affect rapidity
differences or $E + p_z$ ratios. We shall assume that the same is true of our quark jets.
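
The boost property used above is easy to confirm numerically. The following sketch (units with c = 1; the particle masses and momenta are arbitrary illustrative values, not numbers from the paper) boosts two particles along z and checks that every $E + p_z$ is multiplied by $\sqrt{(1 - v)/(1 + v)}$ while the rapidity difference is unchanged.

```python
import math


def energy(pz, m, pt):
    """E = sqrt(pz^2 + m^2 + p_perp^2)."""
    return math.sqrt(pz * pz + m * m + pt * pt)


def boost_z(E, pz, v):
    """Transform (E, pz) to a frame moving with velocity v along +z."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (E - v * pz), gamma * (pz - v * E)


def rapidity(E, pz):
    """Y = (1/2) ln[(E + pz)/(E - pz)], eq. (2.51a)."""
    return 0.5 * math.log((E + pz) / (E - pz))


def check_boost(v=0.6):
    # (mass, p_perp, p_z) in GeV; arbitrary illustrative values
    particles = [(0.14, 0.30, 4.0), (0.77, 0.50, 1.5)]
    before = [(energy(pz, m, pt), pz) for m, pt, pz in particles]
    after = [boost_z(E, pz, v) for E, pz in before]
    factors = [(Eb + pb) / (E + pz) for (E, pz), (Eb, pb) in zip(before, after)]
    dy_before = rapidity(*before[0]) - rapidity(*before[1])
    dy_after = rapidity(*after[0]) - rapidity(*after[1])
    return factors, dy_before, dy_after


if __name__ == "__main__":
    print(check_boost())
```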

Jets of finite momentum are produced by first generating “master” jets of very
large momentum $P_0$. Each primary meson has a mass $m_i$, a perpendicular momentum
$p_{\perp i}$, generated according to subsect. 2.7.2, and an $E_i + p_{z_i}$ determined from

$E_i + p_{z_i} = z_i (E_0 + P_0)$ , (2.52)

where $z_i$ is generated by the procedures of subsect. 2.7.1. The primary mesons are
then allowed to decay according to the rates in the particle tables and jets of finite
quark momentum $P_q$ are produced by computing a new or scaled $(E + p_z)_{\rm new}$ by

$(E + p_z)_{\rm new} = (E + p_z)_{\rm old}\,(E_q + P_q)/(E_0 + P_0)$ (2.53a)

and keeping all final mesons such that

$(p_z)_{\rm new} \ge 0$ . (2.53b)


Resonance decays are relatively simple to handle under this scheme. The primary
mesons determined according to (2.52) are allowed to decay and the Lorentz
transformations of $z_i$ are easy as already mentioned. The sum of the z’s of the decay
products is equal to the z of the parent. However, occasionally parents with
$p_{z_i} < 0$ will produce a decay product with $p_z > 0$. We are defining our jets to have
only forward-moving particles and thus include such decay products with $p_z \ge 0$
even though the parent was moving backward. Likewise, we exclude backward-moving
secondaries even if the parent was moving forward.

Table 3

Mean multiplicity of all particles $\langle N_{\rm tot}\rangle$, charged particles $\langle N_{\rm ch}\rangle$, positive particles $\langle N_+\rangle$ and
negative particles $\langle N_-\rangle$ for u-quark jets of various energies $P_q$ resulting from the jet model with
a = 0.77 and $\alpha_{ps} = \alpha_v = 0.5$. Also shown is the “correlation moment” $f_2 = \langle N(N - 1)\rangle - \langle N\rangle^2$.

$P_q$ (GeV)                     3.5      10       50       500
$\langle N_{\rm tot}\rangle$    4.6      8.2      14.1     22.6
$f_2^{\rm tot}$                 -1.3     2.9      12.9     33.4
$\langle N_{\rm ch}\rangle$     2.7      4.8      ?        13.0
$f_2^{\rm ch}$                  -0.8     0.2      ?        ?
$\langle N_+\rangle$            1.6      2.7      4.4      6.5
$f_2^{+}$                       -0.8     -1.1     1.2      0.9
$\langle N_-\rangle$            1.1      2.1      3.8      6.2
$f_2^{-}$                       -0.4     -0.6     -0.6     -0.3
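
The cut-off prescription of (2.52)-(2.53b) can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: each final meson of the master jet is represented only by its $(E + p_z)_{\rm old}$ and its transverse mass $m_\perp$, and the new $p_z$ is recovered from the identity $E - p_z = m_\perp^2/(E + p_z)$. The function name and the example numbers are hypothetical.

```python
def rescale_jet(master, e0_plus_p0, eq_plus_pq):
    """Scale a master jet down to finite momentum.

    master: list of (e_plus_pz_old, m_perp) pairs in GeV.
    Returns the kept mesons as (e_plus_pz_new, pz_new) pairs.
    """
    scale = eq_plus_pq / e0_plus_p0
    kept = []
    for w_old, m_perp in master:
        w_new = w_old * scale                        # eq. (2.53a)
        pz_new = 0.5 * (w_new - m_perp ** 2 / w_new)  # from E - pz = m_perp^2 / (E + pz)
        if pz_new >= 0.0:                            # eq. (2.53b)
            kept.append((w_new, pz_new))
    return kept


if __name__ == "__main__":
    # Hypothetical master jet with E0 + P0 = 2000 GeV and three mesons
    # at z = 0.4, 0.05 and 0.001, each with m_perp = 0.5 GeV.
    master = [(0.4 * 2000.0, 0.5), (0.05 * 2000.0, 0.5), (0.001 * 2000.0, 0.5)]
    # Scale to a P_q = 10 GeV jet, taking E_q + P_q = 2 P_q = 20 GeV.
    print(rescale_jet(master, 2000.0, 20.0))
```

In this form the condition $(p_z)_{\rm new} \ge 0$ is the same as $(E + p_z)_{\rm new} \ge m_\perp$, so the slowest meson of the example is dropped.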

Real quark jets cannot have both the energy and momentum of a free quark of
definite mass *. However, it must be remembered that all quark jets occur in pairs.
In $e^+e^-$ collisions, two jets result from the $q\bar q$ pair produced. In $\nu p$ reactions, one
jet results from the quark “knocked out” of the proton by the neutrino and the
other from the “hole” that was left behind. In hadron-hadron collisions at high $p_\perp$,
according to the model in FF1, we may think of a quark as being knocked out of
the proton by another quark and “generating” a color field as the quark pulls away
from the “hole” it left behind. This results in a four-jet structure as discussed in
ref. [5] (hereafter called FFF). Energy and momentum are conserved in the two-jet
system, not for the single jet. Hence, any quantity like energy or $k_\perp$ that is not
conserved in the single jet is balanced by its oppositely moving partner.

Our quark jets do not precisely conserve energy or momentum. For example,
a $P_q$ = 10 GeV quark jet produced in the above manner has mean values $\langle E_{\rm tot}\rangle$ = 9.8
GeV and $\langle p_{z,{\rm tot}}\rangle$ = 8.6 GeV, where $E_{\rm tot}$ and $p_{z,{\rm tot}}$ are the total energy and $p_z$ of all
the hadrons in the jet. The mean value of $E_{\rm tot} + p_{z,{\rm tot}}$ does approach $2P_q$ at high
momentum. Rather than compensate for the fact that our cut-off procedure yields
$E_{\rm tot} + p_{z,{\rm tot}}$ less than $2P_q$ at finite $P_q$, we can interpret our jets as having an energy
equal to $P_q$ since this comes out closely the same without any adjustment. For all
$P_q$, the mean value of $E_{\rm tot} - p_{z,{\rm tot}}$ is about 1.2 GeV, and $\langle k_{\perp{\rm tot}}^2\rangle^{1/2}$ is about 610 MeV,
where $k_{\perp{\rm tot}}$ is the total perpendicular momentum of all the hadrons in the jet.

Finally, in table 3, we show the particle multiplicities that result from the above
cut-off prescription. The table gives the mean multiplicity of all particles, charged
particles, positive particles, and negative particles for a u-quark with energy
$P_q$ = 3.5, 10.0, 50.0, and 500.0 GeV, arrived at with a = 0.77 and $\alpha_{ps} = \alpha_v = 0.5$.


* If one selects $E + p_z$ of all hadrons in the jet to equal the $E + p_z$ of a large momentum initial
quark, then $E - p_z$ is equal to the mean density of particles in the plateau times the $m_\perp$ of
the particles. See the discussion on p. 246 of ref. [14].


2.8. The analytic approximation 


Although the Monte Carlo method is straightforward, some jet observables
require the generation of a large number of typical jets in order to get sufficient
statistical accuracy. In addition, it is very handy to have analytic formulas for easy
analysis and understanding and so the reader can, without writing a Monte Carlo
program, reproduce most of our results. For these reasons, we present approximate
analytic solutions for those observables where analytic methods are not too
difficult. Details of individual resonance decay are too tedious to handle exactly
analytically. What we have done is to assume all products of vector meson decay are
distributed as they would be in a two-body isotropic decay of massless products;
that is, spread uniformly in z from 0 to the z of the parent. Thus for all vector
decays, we assume the daughter-parent relationship

$\tilde G(z) = 2\int_z^1 G(y)\,dy/y$ , (2.54)

where $\tilde G(z)$ is the distribution in z of a daughter of parents distributed in G(z). The
integral of (2.54) is easy for G(z) = F(z) or G(z) = f(1 - z), since if $G(z) = z^n$ then
$\tilde G(z) = 2(1 - z^n)/n$ for $n \ne 0$ and $-2\ln z$ for n = 0.
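
As a check on (2.54), the sketch below integrates the daughter-parent relation numerically and compares it with the closed forms quoted above. It also applies the power-law result term by term to $f(1 - y) = 1 - a + 3a(1 - y)^2$, taking the parabola for $f(\eta)$ from table 5; the expanded expression for the daughter distribution is worked out here and is not quoted from the paper.

```python
import math

A = 0.77  # value of the parameter a used throughout the text


def daughter_numeric(G, z, steps=20_000):
    """Midpoint-rule estimate of 2 * integral from z to 1 of G(y)/y dy."""
    h = (1.0 - z) / steps
    total = 0.0
    for k in range(steps):
        y = z + (k + 0.5) * h
        total += G(y) / y
    return 2.0 * total * h


def daughter_power(n, z):
    """Closed form of (2.54) for G(y) = y**n."""
    if n == 0:
        return -2.0 * math.log(z)
    return 2.0 * (1.0 - z ** n) / n


def f_first_rank(y, a=A):
    """f(1 - y) with f(eta) = 1 - a + 3 a eta^2."""
    return 1.0 - a + 3.0 * a * (1.0 - y) ** 2


def f_first_rank_daughter(z, a=A):
    """(2.54) applied to f(1 - y) = (1 + 2a) - 6a y + 3a y^2, term by term."""
    return (-2.0 * (1.0 + 2.0 * a) * math.log(z)
            - 12.0 * a * (1.0 - z)
            + 3.0 * a * (1.0 - z * z))


if __name__ == "__main__":
    z = 0.3
    for n in (0, 1, 2):
        print(n, daughter_numeric(lambda y: y ** n, z), daughter_power(n, z))
    print(daughter_numeric(f_first_rank, z), f_first_rank_daughter(z))
```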

The total distribution of hadrons is

$F_{\rm total}(z) = \alpha_{ps} F(z) + \alpha_v \tilde F(z)$ , (2.55)

and the distribution of the first-rank meson is

$f_{\rm total}(1 - z) = \alpha_{ps} f(1 - z) + \alpha_v \tilde f(1 - z)$ , (2.56)

where we assign all the decay products the rank in hierarchy of their parent primary
meson and where $\alpha_{ps}$ and $\alpha_v$, given by (2.45), are the relative strengths of the
pseudoscalar and vector meson component. The values of $\alpha_{ps} zF(z)$, $\alpha_{ps} f(1 - z)$, $\alpha_v z\tilde F(z)$,
and $\alpha_v \tilde f(1 - z)$ are given in fig. 2b with a = 0.77. As can be seen in fig. 3, the choice
a = 0.77, $\alpha_{ps} = \alpha_v = 0.5$ produces a good fit to the lepton data on the charged
particle z distribution and compares well with the FF1 result.

The distribution of hadrons of flavor h from a quark q is given by

$D_q^h(z) = \alpha_{ps}[A_q^h f(1 - z) + B^h \bar F(z)] + \alpha_v[\tilde A_q^h \tilde f(1 - z) + \tilde B^h \tilde{\bar F}(z)]$ , (2.57)

where $\bar F(z)$ is the probability that the meson at z has any rank greater than one and
is defined by (2.29b). (Similarly, $\tilde{\bar F}(z) = \tilde F(z) - \tilde f(1 - z)$.) The constants $\tilde A_q^h$ and $\tilde B^h$
are given by

$\tilde A_q^h = \sum_{i=1}^{9} \beta_{i\to h} A_q^i$ , (2.58a)

$\tilde B^h = \sum_{i=1}^{9} \beta_{i\to h} B^i$ , (2.58b)

where the sum is over all nine vector mesons and $\beta_{i\to h}$ is the probability that one
finds a daughter of type h as a decay product of a parent of type i. The constants
$A_q^i$ and $B^i$ are as in (2.30) and are given in table 1 (for vector production one uses
$\theta = \theta_v$ and for pseudoscalar production one uses $\theta = \theta_{ps}$). The $\rho$ and $K^*$
resonances decay 100% of the time to $\pi\pi$ and $K\pi$, respectively, and the $\beta_{i\to h}$ are easily
calculable from isospin symmetry. The values used for $\beta_{i\to h}$ for $\omega$ and $\phi$ are given
in table 4, where we approximate three-body decays by the two-body decay
formula (2.54) but with $\beta_{i\to h}$ adjusted to give the correct frequency into the various
mesons. For example, we take $\omega \to \pi^+\pi^-\pi^0$ and $\omega \to \pi^0\gamma$ with a 90% and 10% rate,
respectively, so that $\beta_{\omega\to\pi^+} = \tfrac{1}{3}(0.90)$ and $\beta_{\omega\to\pi^0} = \tfrac{1}{3}(0.90) + \tfrac{1}{2}(0.10)$.
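
The $\omega$ example can be reproduced with a short helper. The sketch assumes, as one reading of the example above, that $\beta_{i\to h}$ is the branching fraction of each mode times the fraction of that mode's decay products that are of type h, summed over modes. The function name and the string labels are illustrative.

```python
def beta_from_modes(modes):
    """Return {daughter: beta} for a parent with the given decay modes.

    modes: list of (branching_fraction, [decay products]).
    """
    beta = {}
    for fraction, products in modes:
        share = fraction / len(products)  # equal weight per decay product
        for h in products:
            beta[h] = beta.get(h, 0.0) + share
    return beta


# Decay modes and rates for the omega as quoted in the text.
OMEGA_MODES = [
    (0.90, ["pi+", "pi-", "pi0"]),
    (0.10, ["pi0", "gamma"]),
]

if __name__ == "__main__":
    print(beta_from_modes(OMEGA_MODES))
```

With these inputs the pion entries round to 0.30, 0.35 and 0.30, the values listed for the $\omega$ in table 4.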

In addition, to agree better with the Monte Carlo results, we have also allowed
the $\eta$ and $\eta'$ to decay. These decays are handled in the same manner (2.54) as the
vector meson $\omega$ decay but with the $\beta_{i\to h}$ given in table 4.

Fig. 4 shows a comparison of the analytic approximation and the more precise
Monte Carlo method. Agreement is good except for the pions resulting from the $\omega$
decay (similarly for three-body $\eta$ and $\eta'$ decay modes) which have been
approximated by the two-body decay formula. The approximation for $K^{*0} \to K^+$ does not
agree as well as $\rho^0 \to \pi^+$ due to the heavier mass of the K.

The inclusion of resonance decays into the analytic formula for the double-decay
distribution $D_q^{ab,cd}(z_1, z_2)$ given by (2.42) and (2.43) is straightforward. One
merely replaces the F(z) and f(1 - z) functions in (2.43) by the $F_{\rm total}(z)$ and
$f_{\rm total}(1 - z)$ functions in (2.55) and (2.56). From this $D_q^{h_1,h_2}(z_1, z_2)$ is computed in
a manner analogous to (2.57). In addition, however, one must add the contribution
from the case where both the mesons at $z_1$ and $z_2$ came from the decay of the
same resonance (type $h_r$) given by

$\beta_{h_r\to h_1,h_2}\, D_q^{h_r}(z_1 + z_2)/(z_1 + z_2)$ , (2.59)

where $D_q^{h_r}(z)$ is the single-particle distribution of the parent $h_r$ and $\beta_{h_r\to h_1,h_2}$ is its
branching ratio into $h_1 = ab$ and $h_2 = cd$.


Table 4

Probability $\beta_{i\to h}$ to find a daughter meson of type h among all the decay products of a parent
meson of type i

Daughter \ Parent    $\eta$    $\eta'$    $\omega$    $\phi$
$\pi^+$              0.09      0.27       0.30        0.06
$\pi^0$              0.40      0.24       0.35        0.06
$\pi^-$              0.09      0.27       0.30        0.06
$K^+$                0         0          0           0.24
$K^0$                0         0          0           0.17
$K^-$                0         0          0           0.24
$\bar K^0$           0         0          0           0.17

For $i = \rho$ and $K^*$, $\beta_{i\to h}$ is given by isospin symmetry assuming $\rho \to \pi\pi$ and $K^* \to K\pi$ 100% of
the time.


Fig. 4. Comparison of the Monte Carlo results (a = 0.77, $\alpha_{ps} = \alpha_v = 0.5$) for some of the
quark-decay functions, $D_q^h(z)$, with the analytic approximation (solid line). The number of $\rho^0$, $\omega^0$,
$\phi$, and $K^{*0}$ from a u-quark are shown together with the number of $\pi^+$ that have decayed from
$\rho^0$ and $\omega^0$ and the number of $K^+$ from $\phi$ and $K^{*0}$. (a) + $zD(u \to \rho^0)$; × $zD(u \to \rho^0 \to \pi^+)$;
(b) + $zD(u \to \omega^0)$; × $zD(u \to \omega^0 \to \pi^+)$; (c) + $zD(u \to \phi)$; × $zD(u \to \phi \to K^+)$;
(d) + $zD(u \to K^{*0})$; × $zD(u \to K^{*0} \to K^+)$.


Table 5

Description of the parameters used in the model of quark jets and an explanation of how they
are determined

Parameter: a
  Description: Parameter used to describe the function $f(\eta) = 1 - a + 3a\eta^2$, which is the
  probability that the first-rank primary meson leaves a fraction of momentum $\eta$ to the
  remaining cascade.
  Determination: Require that resulting charged distribution $D^{h^\pm}(z)$ fit the data shown in fig. 3.

Parameter: $\gamma$
  Description: The probability of producing a $u\bar u$, $d\bar d$, and $s\bar s$ pair is given by $\gamma$, $\gamma$,
  and $(1 - 2\gamma)$, respectively.
  Determination: Use the fact that at large $p_\perp$ in pp collisions $K^+/\pi^+ \approx \tfrac{1}{2}$. Requiring
  $D_u^{K^+}(z)/D_u^{\pi^+}(z) \to \tfrac{1}{2}$ as $z \to 1$ then implies $\gamma = 0.4$.

Parameter: $\alpha_{ps}$, $\alpha_v$
  Description: Relative probability of producing a pseudoscalar meson or a vector meson
  ($\alpha_{ps} + \alpha_v = 1$).
  Determination: Experiments indicate that $\rho^0/\pi^0 \approx 1$ at large $p_\perp$ in pp collisions so we
  choose $\alpha_{ps} = \alpha_v = 0.5$.

Parameter: $\sigma$
  Description: Determines the $\langle k_\perp\rangle$ of the produced primary mesons.
  Determination: Require $\langle k_\perp\rangle_\pi$ near 330 MeV. Means $\langle k_\perp\rangle_{\rm primary} = 439$ MeV $= \sqrt{\pi/2}\,\sigma$.

Parameter: $\theta_{ps}$, $\theta_v$
  Description: Pseudoscalar and vector nonet mixing angles where $\eta(\omega) = S\cos\theta + N\sin\theta$,
  $\eta'(\phi) = -S\sin\theta + N\cos\theta$, with $S = s\bar s$, $N = \sqrt{1/2}\,(u\bar u + d\bar d)$.
  Determination: Use $\theta_{ps} = 45°$ and $\theta_v = 90°$ as simple numbers close to experiment.

Parameter: $\beta_{i\to h}$
  Description: Probability to find a daughter meson of type h among all the decay products of
  a parent vector meson of type i.
  Determination: Use branching ratios in particle tables.


Our model of quark jets is thus completely determined by the function $f(\eta)$,
which we have parametrized by a parabola with one parameter, a, and the three
additional parameters: the degree that SU(3) is broken in the formation of new $q\bar q$ pairs
($\gamma$ in (2.24)), the spin-parity nature of the primary mesons ($\alpha_{ps}$, $\alpha_v$ in (2.45)) and
the mean transverse momentum of the primary mesons ($\sigma$ in (2.48a)). From this, all
the properties of quark jets follow. The parameters of the model together with a
review of how they were determined are given in table 5. We now proceed to examine
the results of the model in detail.
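
For orientation, the parameter set can be collected in a small container together with two consistency checks. This is a sketch under assumptions: the class and method names are illustrative, the parabola $f(\eta) = 1 - a + 3a\eta^2$ is taken from table 5, and the large-z ratio $(1 - 2\gamma)/\gamma$ is the reading adopted here of the table 5 statement that a limit of 1/2 implies $\gamma = 0.4$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JetModelParams:
    a: float = 0.77           # shape of f(eta)
    gamma: float = 0.4        # probability of a u-ubar (or d-dbar) pair
    alpha_ps: float = 0.5     # pseudoscalar fraction
    alpha_v: float = 0.5      # vector fraction
    sigma_mev: float = 350.0  # transverse-momentum width

    def f(self, eta):
        """f(eta) = 1 - a + 3 a eta^2."""
        return 1.0 - self.a + 3.0 * self.a * eta * eta

    def f_norm(self, steps=10_000):
        """Midpoint-rule integral of f over [0, 1]; should equal 1."""
        h = 1.0 / steps
        return sum(self.f((k + 0.5) * h) for k in range(steps)) * h

    def mean_first_rank_z(self):
        """<1 - eta> = 1/2 - a/4 for the parabola above."""
        return 0.5 - self.a / 4.0

    def strange_to_nonstrange(self):
        """Ratio of s-sbar to u-ubar pair probabilities, (1 - 2 gamma)/gamma."""
        return (1.0 - 2.0 * self.gamma) / self.gamma


if __name__ == "__main__":
    p = JetModelParams()
    print(p.f_norm(), p.mean_first_rank_z(), p.strange_to_nonstrange())
```

With a = 0.77 the mean momentum fraction of the first-rank primary meson comes out near 0.31, which is below one third.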


3. Properties of the “end” of the quark jet

3.1. Single-particle distribution $D_q^h(z)$


Figs. 5, 6, and 7 show the single-particle quark-decay functions, $D_q^h(z)$, resulting
from our new jet model with a = 0.77 and $\alpha_{ps} = \alpha_v = 0.5$ together with the analytic
approximation (dotted curve). For comparison, the results arrived at in FF1 are also
displayed (solid curves). The differences between the analytic calculations and the
Monte Carlo results arise because in the analytic approximation, we have considered
all decays as massless two-body decays. For instance, as seen in fig. 4 the $\omega \to 3\pi$
decay is not treated correctly in the analytic approximation. Nevertheless, the two
methods agree quite closely.

Fig. 5. The quark-decay functions (a) $zD(u \to h^+ + h^-, z)$; (b) $zD(u \to h^\pm, z)$, + $zD(u \to h^+)$,
× $zD(u \to h^-)$; (c) $zD(u \to \pi^\pm, z)$, + $zD(u \to \pi^+)$, × $zD(u \to \pi^-)$; and (d) $zD(u \to K^\pm, z)$,
+ $zD(u \to K^+)$, × $zD(u \to K^-)$, resulting from the quark-jet model with a = 0.77 and $\alpha_{ps} = \alpha_v$
= 0.5 (points). Also shown is the analytic approximation (dotted curve) and the results
obtained in FF1 (solid curves).

The decay functions in figs. 5, 6, and 7 agree well for the most part with the 
results from FF1. There are, however, the following notable differences. 

(i) The new decay functions have

$D_u^{K^+}(z)/D_u^{\pi^+}(z) \to 0.2$ as $z \to 0$ , (3.1)

whereas the FF1 results yield a value of 0.8. Both have

$D_u^{K^+}(z)/D_u^{\pi^+}(z) \to 0.5$ as $z \to 1$ , (3.2)

by construction, since $\gamma = 0.4$. The difference at small z is due to the many $K^*$
resonances that decay into kaons and pions. The new results are probably more
reasonable than the FF1 values.

Fig. 6. Same as fig. 5 but for a d-quark.


(ii) The new approach yields 
D§'@) >DE@), (3.3) 


whereas in FF1 we assumed equality for simplicity. However, as discussed in FF1,
(3.3) is expected.


Table 6 gives the total fraction of momentum carried by the direct (primary)
mesons and by the final hadrons (direct + decay) for a u-, d-, and s-quark. These
results can be compared to the FF1 values shown in table 2.

In spite of the differences at small z between our new results and the FF1 results,
they are not very different for large z (z > 0.6). Since the large-$p_\perp$ single-particle
predictions made in FF1 and most of the two-particle correlation results obtained
in FFF (ref. [5]) are only sensitive to the large-z region of the quark decay
functions, most of the predictions made in FF1 and FFF remain unchanged. Changes
and improvements of the results of FF1 and FFF will be discussed in sect. 6.

Fig. 7. Same as fig. 5 but for an s-quark.


3.2. Approach to the rapidity plateau 


Figs. 8, 9, and 10 show the number of hadrons, charged hadrons, positive hadrons
and negative hadrons per 0.1 unit of $Y_z$, for a u-, d- and s-quark, respectively. We
have defined the “z rapidity”, $Y_z$, by

$Y_z = -\ln z = -\ln[(E + p_z)/2P_q]$ . (3.4)

This is a convenient variable for us since it does not depend on the generated value
of the perpendicular mass, $m_\perp$, in equation (2.51b). The usual rapidity, Y, is given
by (2.51a) and hence

$Y = -Y_z + \ln(2P_q/m_\perp)$ , (3.5)

so that particles at the same $Y_z$ may be found spread a bit in Y because of
differences of mass and transverse momenta. It is usual to measure Y from some $Y_{\rm end}$ at
the high-momentum end of the jet. For this one might expect to take $\ln(2P_q/m_q)$,
where $m_q$ is the mass of the original quark, but this has no clear meaning.
Alternatively, data could be plotted using

$Y_{\rm max} = \ln(2P_q/m_\pi)$ (3.6a)

for $Y_{\rm end}$ since this is the maximum rapidity a pion of the jet could have. These
differences are, of course, only shifts of scale of Y for convenience in plotting. In fig.
11, we compare the $Y_{\rm end} - Y$ and $Y_z$ distributions, where we have chosen

$Y_{\rm end} = Y_{\rm max} - \ln(m_{\perp 0}/m_\pi)$ , (3.6b)

with $m_{\perp 0}$ an arbitrary number chosen in an attempt to make the two distributions
agree. The distributions are nearly identical with $m_{\perp 0}$ = 380 MeV, the $Y_{\rm end} - Y$
distribution being slightly steeper. (This value is close to the mean $m_\perp$.)


Table 6

Fraction of total momentum carried by the direct-primary (before decay) mesons and the
direct-plus-indirect (from a decay) mesons resulting from a u-, d- and s-quark in our jet model
with a = 0.77 and $\alpha_{ps} = \alpha_v = 0.5$

              Particle                   u       d       s
direct        $\pi^+ = \rho^+$           0.12    0.06    0.06
              $\pi^0 = \rho^0$           0.09    0.09    0.06
              $\pi^- = \rho^-$           0.06    0.12    0.06
              $K^+ = K^{*+}$             0.06    0.03    0.03
              $K^0 = K^{*0}$             0.03    0.06    0.03
              $K^- = K^{*-}$             0.03    0.03    0.09
              $\bar K^0 = \bar K^{*0}$   0.03    0.03    0.09
              $\eta$                     0.05    0.05    0.05
              $\eta'$                    0.05    0.05    0.05
              $\omega$                   0.09    0.09    0.06
              $\phi$                     0.01    0.01    0.04

total         $\pi^+$                    0.29    0.19    0.19
(direct       $\pi^0$                    0.26    0.26    0.20
+ indirect)   $\pi^-$                    0.19    0.29    0.19
              $K^+$                      0.08    0.06    0.06
              $K^0$                      0.06    0.08    0.06
              $K^-$                      0.04    0.04    0.13
              $\bar K^0$                 0.04    0.04    0.13
              $\gamma$                   0.04    0.04    0.04

              total $\pi$                0.74    0.74    0.58
              total K                    0.22    0.22    0.38


Fig. 8. The number of particles, charged particles, positive particles and negative particles per
$\Delta Y_z = 0.1$ resulting from a u-quark jet with a = 0.77 and $\alpha_{ps} = \alpha_v = 0.5$, where $Y_z = -\ln z$.
(a) + all particles; × charged particles; (b) + positive particles, × negative particles.

Fig. 9. Same as fig. 8 but for a d-quark jet.

Fig. 10. Same as fig. 8 but for an s-quark jet.

Fig. 11. Comparison of the number of charged particles per $\Delta Y_z = 0.1$ versus $Y_z$ with the
number of charged particles per $\Delta Y = 0.1$ versus $Y_{\rm end} - Y$ for a u-quark jet, where $Y_z = -\ln z$ and
Y is the true rapidity (2.51) and $Y_{\rm end}$ is given by (3.6b) with $m_{\perp 0}$ = 380 MeV. a = 0.77, $\alpha_{ps}$
= $\alpha_v$ = 0.5, ○ $Y = Y_{\rm end} - Y$, × $Y = Y_z$.

The model yields about 2.2 charged particles per unit rapidity in the plateau with
equal numbers of positive and negative particles. (Some multiplicities are given in
table 3.) Towards the “end” of the quark jet ($Y_z < 3$), on the other hand, the
number of positive and negative hadrons is not the same and one can hope to
discriminate among the quark flavors by observations in this region of $Y_z$.
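
The relations among $Y_z$, Y and $Y_{\rm end}$ in (3.4)-(3.6b) can be checked with a short script. It is a sketch under assumptions: the pion mass is set to the standard charged-pion value of about 0.1396 GeV, the matching mass of (3.6b) is given the hypothetical name `m_perp0`, and the sample particle is arbitrary.

```python
import math

M_PI = 0.1396  # GeV; standard charged-pion mass, an assumed input


def m_perp(m, pt):
    """Transverse mass, eq. (2.51b)."""
    return math.sqrt(m * m + pt * pt)


def z_rapidity(E, pz, Pq):
    """Y_z = -ln[(E + pz)/(2 Pq)], eq. (3.4)."""
    return -math.log((E + pz) / (2.0 * Pq))


def true_rapidity(E, pz):
    """Y = (1/2) ln[(E + pz)/(E - pz)], eq. (2.51a)."""
    return 0.5 * math.log((E + pz) / (E - pz))


def y_end(Pq, m_perp0=0.380, m_pi=M_PI):
    """Y_end = Y_max - ln(m_perp0/m_pi), with Y_max = ln(2 Pq/m_pi)."""
    y_max = math.log(2.0 * Pq / m_pi)
    return y_max - math.log(m_perp0 / m_pi)


if __name__ == "__main__":
    Pq, m, pt, pz = 10.0, M_PI, 0.30, 2.0  # GeV, arbitrary example
    mt = m_perp(m, pt)
    E = math.sqrt(pz * pz + mt * mt)
    Yz, Y = z_rapidity(E, pz, Pq), true_rapidity(E, pz)
    print(Y, -Yz + math.log(2.0 * Pq / mt))          # eq. (3.5)
    print(y_end(Pq) - Y, Yz + math.log(mt / 0.380))  # offset between the two scales
```

The second line of output shows why the two distributions of fig. 11 nearly coincide: $Y_{\rm end} - Y$ differs from $Y_z$ only by $\ln(m_\perp/m_{\perp 0})$, which is small when the matching mass is close to the mean $m_\perp$.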


3.3. Distribution of charge 


Any single-quark jet has an integral number of hadrons and an integral charge.
The average total charge of a u-, d- and s-quark jet of infinite momentum is nearly
the charge of the quark (for $\gamma = 0.4$ we have 0.6, -0.4, and -0.4 as shown in (2.39)).
Table 7 shows the number of times various total jet charges Q occur for 10 000 jets
with energy $P_q$ = 10 GeV. Jets initiated by u-quarks are more likely to have
positive total charge than those initiated by d-quarks. For example, at $P_q$ = 10 GeV,
Q = +2 jets occur ~4 times more often for u-jets than for d-jets and Q = +4 jets occur
about 10 times more frequently. The average total charge, $\langle Q\rangle$, for a $P_q$ = 10 GeV u-
and d-jet is 0.54 and -0.36, respectively. Because of the finite total momentum
available ($P_q$ = 10 GeV) summing only z > 0 does not yet yield the full total mean
charge 0.6 and -0.4 expected. There is some overlap of charge into negative z. An
interesting point is that the mean total jet charge $\langle Q\rangle$ is not precisely the same for
each fixed multiplicity. As table 8a shows, the odd charged multiplicity jets ($N_{\rm ch}$ = 1,
3, etc.) have a mean charge $\langle Q\rangle$ that is somewhat larger than the jets of even charge
multiplicity ($N_{\rm ch}$ = 2, 4, etc.). For an extreme example, for u-quark jets having only
one charged particle ($N_{\rm ch}$ = 1), that particle is positive about 13 times more often
than it is negative ($\langle Q\rangle$ = 0.86).


Table 7

Number of times various total quark-jet charge Q occurred for 10 000 quark jets with energy
$P_q$ = 10 GeV

Charge Q                    u        d        s
-4                          7        39       42
-3                          42       231      227
-2                          307      1094     1115
-1                          1291     2972     3009
 0                          3118     3675     3661
 1                          3460     1576     1483
 2                          1398     340      390
 3                          299      62       60
 4                          64       6        7

$\langle Q\rangle$ a)       0.54     -0.36    -0.36

Same as above but observe only those hadrons with z > 0.1

Charge Q                    u        d        s
-4                          2        0        6
-3                          20       88       74
-2                          258      710      781
-1                          1442     2969     3172
 0                          3560     4123     4091
 1                          3639     1742     1606
 2                          967      342      255
 3                          106      23       15
 4                          6        3        0

$\langle Q\rangle$          0.39     -0.21    -0.28

a) For infinite momentum quark jets, the mean charge, $\langle Q\rangle$, is 0.60, -0.40, -0.40 for a u-, d- and
s-jet, respectively.
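
The mean charges and ratios quoted in this subsection follow directly from the tabulated counts, as the sketch below shows. The counts are copied from table 7 and from the $N_{\rm ch}$ = 1 rows of table 8a; the helper names are illustrative, and the mean is taken over all 10 000 jets even though the table lists only $|Q| \le 4$.

```python
N_JETS = 10_000
CHARGES = list(range(-4, 5))

# Counts for Q = -4 ... +4 from table 7 (all hadrons).
TABLE7_ALL = {
    "u": [7, 42, 307, 1291, 3118, 3460, 1398, 299, 64],
    "d": [39, 231, 1094, 2972, 3675, 1576, 340, 62, 6],
    "s": [42, 227, 1115, 3009, 3661, 1483, 390, 60, 7],
}
# Same, but only hadrons with z > 0.1.
TABLE7_ZCUT = {
    "u": [2, 20, 258, 1442, 3560, 3639, 967, 106, 6],
    "d": [0, 88, 710, 2969, 4123, 1742, 342, 23, 3],
    "s": [6, 74, 781, 3172, 4091, 1606, 255, 15, 0],
}


def mean_charge(counts, n_jets=N_JETS):
    """Mean total jet charge from a histogram over Q = -4 ... +4."""
    return sum(q * c for q, c in zip(CHARGES, counts)) / n_jets


def single_charged_mean(n_pos, n_neg):
    """<Q> for jets with exactly one charged particle (Q = +1 or -1)."""
    return (n_pos - n_neg) / (n_pos + n_neg)


if __name__ == "__main__":
    for flavor in "uds":
        print(flavor, mean_charge(TABLE7_ALL[flavor]), mean_charge(TABLE7_ZCUT[flavor]))
    # Ratios of u-jet to d-jet counts at Q = +2 and Q = +4.
    print(TABLE7_ALL["u"][6] / TABLE7_ALL["d"][6], TABLE7_ALL["u"][8] / TABLE7_ALL["d"][8])
    # N_ch = 1 u-quark jets from table 8a: 420 positive, 31 negative.
    print(420 / 31, single_charged_mean(420, 31))
```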

Unfortunately, total jet charge is often a difficult quantity to measure
experimentally. One can never be sure that all the jet particles have been included and that one
has not included extra background particles. A more tractable quantity experimentally
is the total charge of all those hadrons in a jet whose z value exceeds some threshold.
As shown in tables 7 and 8b, charges defined for $P_q$ = 10 GeV and $z \ge 0.1$ have the
same characteristics as the complete jet but the magnitudes are smaller. The average
total charges (z > 0.1) are 0.39 and -0.21 for a u- and d-jet, respectively.


Fig. 12. Distribution of (a) charge Q, total Q = 0.6, (b) third component of isospin $I_3$, total $I_3$
= 0.5, and (c) strangeness S, total S = 0.2, along the $Y_z$ ($Y_z = -\ln z$) axis for a u-quark jet. Also
shown are the analytic results (solid curves). a = 0.77, $\alpha_{ps} = \alpha_v = 0.5$.


Table 8a

Number of times various total quark-jet charge Q occurred for 10 000 jets with momentum
$P_q$ = 10 GeV, where $N_{\rm ch}$ is the charged multiplicity

                                                   u        d        s
all jets                     Q > 0                 5232     1984     1941
                             Q < 0                 1650     4341     4398
                             Q = 0                 3118     3675     3661
                             $\langle Q\rangle$ a) 0.54     -0.36    -0.36

$N_{\rm ch}$ = 1 jets        Q > 0                 420      102      81
                             Q < 0                 31       287      341
                             $\langle Q\rangle$    0.86     -0.48    -0.62

$N_{\rm ch}$ = 2 jets        Q > 0                 175      25       37
                             Q < 0                 13       118      147
                             Q = 0                 591      799      839
                             $\langle Q\rangle$    0.42     -0.20    -0.22

$N_{\rm ch}$ = 3 jets        Q > 0                 1029     423      427
                             Q < 0                 266      883      937
                             $\langle Q\rangle$    0.64     -0.38    -0.41

$N_{\rm ch}$ = even jets     Q > 0                 1462     346      397
($N_{\rm ch} \ne 0$)         Q < 0                 314      1133     1157
                             Q = 0                 3052     3535     3512
                             $\langle Q\rangle$    0.50     -0.33    -0.31

$N_{\rm ch}$ = odd jets      Q > 0                 3770     1638     1543
                             Q < 0                 1336     3208     3241
                             $\langle Q\rangle$    0.58     -0.40    -0.43

a) For infinite-momentum jets $\langle Q\rangle$ = 0.60, -0.40 and -0.40 for a u-, d- and s-quark,
respectively.

Figs. 12, 13, and 14 show the distribution of charge Q, third component of isospin
$I_3$, and strangeness S (of course, $S = 2(Q - I_3)$) along the $Y_z$ axis for a u-, d- and
s-quark, respectively. As discussed in subsect. 2.5 and given by (2.37), these quantities
are related to the distribution of the first-rank primary meson, f(1 - z), and the
integrated values are given by (2.39), (2.40), and (2.41) (with S = Y since we have no
baryons among the resulting hadrons). Also shown in these figures is the analytic
approximation. As can be seen, the charge, isospin, and strangeness are distributed
over a considerable range in $Y_z$. The quark quantum numbers are spread over almost
4 units of $Y_z$, which will make it difficult to determine them experimentally.


Table 8b

Same as table 8a except only hadrons with z > 0.1 are observed

                                                   u        d        s
all jets                     Q > 0                 4718     2110     1876
                             Q < 0                 1722     3767     4033
                             Q = 0                 3560     4123     4091
                             $\langle Q\rangle$    0.39     -0.21    -0.28

$N_{\rm ch}$ = 1 jets        Q > 0                 2624     1205     1090
                             Q < 0                 936      2120     2272
                             $\langle Q\rangle$    0.47     -0.28    -0.35

$N_{\rm ch}$ = 2 jets        Q > 0                 871      309      240
                             Q < 0                 229      651      713
                             Q = 0                 2383     2557     2402
                             $\langle Q\rangle$    0.37     -0.19    -0.28

$N_{\rm ch}$ = 3 jets        Q > 0                 1101     551      522
                             Q < 0                 512      918      949
                             $\langle Q\rangle$    0.47     -0.33    -0.37

$N_{\rm ch}$ = even jets     Q > 0                 973      345      255
($N_{\rm ch} \ne 0$)         Q < 0                 260      710      787
                             Q = 0                 2607     2792     2643
                             $\langle Q\rangle$    0.37     -0.19    -0.29

$N_{\rm ch}$ = odd jets      Q > 0                 3745     1765     1621
                             Q < 0                 1462     3057     3246
                             $\langle Q\rangle$    0.47     -0.29    -0.36

One of the most important experimental questions about high-$p_\perp$ hadron
collisions is whether the jets really come from quark cascades as we have supposed in FF1
and FFF, or possibly from other types of objects such as gluons or di-quarks. We have
calculated the flavor of the quarks to be expected under various circumstances (see
fig. 25 of FFF). When the characteristics of the quark jets of definite flavor in lepton
experiments are known, the details can be checked against the jets observed in hadron
experiments. But what should we measure to most readily identify the flavor of a
quark jet: the total charge, the charge of the fastest hadron, of the fastest two? We
have used our model for a “standard” jet as a kind of laboratory of typical jets to
test various ideas. The following sections on the flavor properties of the jets,
therefore, contain very detailed information from these studies on various quantities
which have been selected because they are easy to measure experimentally or because
they are expected to differ as much as possible for u- and d-flavor jets.

Fig. 13. Same as fig. 12 but for a d-quark jet. (a) Total Q = -0.4, (b) total $I_3$ = -0.5, (c) total
S = 0.2.


3.4. Fastest-particle analysis 


It is interesting to ask about the fraction of momentum, z, carried by the fastest
hadron (largest z) when a quark fragments into hadrons in an unbiased fashion. Figs.
15 and 16 show the z distributions of various hadrons fragmenting from a u-quark
of energy 10 GeV resulting from the model where, by first and second, we
mean fastest (largest z) and second-fastest (next-largest z). These figures and table 9
show that for a u-quark jet, the fastest hadron carries on the average only 39% of
the quark momentum and the fastest charged hadron only 30%. Fig. 17 and table 10
show that for a u-quark, the fastest charged hadron carries 54% of the total charged
momentum with the second fastest charged particle taking about 21%. (All charged
particles carry on the average about 57% of the u-quark momentum.) The fastest two
charged hadrons in a u-quark jet carry about 75% of the total charged momentum.
Roughly, the fastest charged hadron takes, on the average, half the charged
momentum; the second fastest charged hadron takes half the remaining charged momentum,
etc. This feature of the model is in agreement with experimental observations of jets
produced in $\nu p$ interactions [15] as seen in fig. 19.


Fig. 14. Same as fig. 12 but for an s-quark jet. (a) Total Q = -0.4, (b) total $I_3$ = 0.0, (c) total
S = -0.8.


Table 9

Mean values of z for hadrons fragmenting from a u-, d- and s-quark of energy $P_q$ = 10 GeV

fastest hadron
2nd-fastest hadron
3rd-fastest hadron

fastest charged
2nd-fastest charged
3rd-fastest charged
all charged

fastest positive
2nd-fastest positive
3rd-fastest positive

fastest negative
2nd-fastest negative
3rd-fastest negative


Fig. 15. The predicted z distributions of various hadrons in a u-quark jet with $P_q$ = 10 GeV,
where first and second refer to the fastest (largest-z) and second fastest (next-largest-z), not
to rank in hierarchy. a = 0.77, $\alpha_{ps} = \alpha_v = 0.5$. (a) + first particle ($\langle z\rangle$ = 0.39), × second particle
($\langle z\rangle$ = 0.18); (b) + first charged ($\langle z\rangle$ = 0.30), × second charged ($\langle z\rangle$ = 0.12); (c) first two particles
($\langle z\rangle$ = 0.57); (d) first two charged ($\langle z\rangle$ = 0.42).


Table 10

Mean values of $z_R$ for hadrons fragmenting from a u-, d- and s-quark of energy $P_q$ = 10 GeV,
where $z_R = z/z_{\rm ch}$ and $z_{\rm ch}$ is the sum of the z of all charged hadrons

                                                       u        d        s
Fraction of jets with at least one charged particle    0.993    0.986    0.985
fastest charged                                        0.54     0.53     0.53
2nd-fastest charged                                    0.21     0.22     0.22

Fraction of jets with at least one positive particle   0.988    0.943    0.932
fastest positive                                       0.45     0.31     0.30
2nd-fastest positive                                   0.12     0.08     0.08

Fraction of jets with at least one negative particle   0.929    0.973    0.972
fastest negative                                       0.27     0.41     0.42
2nd-fastest negative                                   0.07     0.11     0.11


Fig. 16. Same as fig. 15, continued. (a) + first positive ($\langle z\rangle$ = 0.25), × second positive ($\langle z\rangle$ = 0.07);
(b) + first negative ($\langle z\rangle$ = 0.15), × second negative ($\langle z\rangle$ = 0.04); (c) all charged ($\langle z\rangle$ = 0.57);
(d) + first two positive ($\langle z\rangle$ = 0.32), × first two negative ($\langle z\rangle$ = 0.19).

We might have expected, from the point of view from which we began, that the
primary meson containing the original quark would be likely to have a large fraction
of that quark’s momentum. Yet the function f(1 - z) in eq. (2.21) shows this
primary meson to be more likely to have less than half of the initial quark’s momentum.
In fact, from (2.36) the average is less than a third of the jet momentum (for a = 0.77).
Because of this, there is a considerable difference between rank in hierarchy and order
in rapidity (or z). It is relatively easy for the first-rank primary meson to leave the
remaining cascade with a large momentum and thus to end up with a momentum less
than some subsequent (higher-rank) meson. Table 11 shows that the fastest hadron
from an unbiased u-quark jet is the first-rank meson only 43% of the time. However,
as one selects on events where the z of the fastest hadron is large, it becomes more
likely that it is of rank one in hierarchy. For $z \ge 0.75$, this probability increases to
73%, becoming 100% at z = 1.


Fig. 17. The predicted $z_R$ distributions of various hadrons in a u-quark jet with $P_q$ = 10 GeV,
where $z_R$ is the relative fraction of charged momentum defined by $z_R = z/z_{\rm ch}$ where $z_{\rm ch}$ is the
sum of the z’s of all charged particles in the jet. As in figs. 15 and 16, first and second refer to
order in z, not to rank in hierarchy. a = 0.77, $\alpha_{ps} = \alpha_v = 0.5$. (a) + first charged ($\langle z_R\rangle$ = 0.54),
× second charged ($\langle z_R\rangle$ = 0.21); (b) + first positive ($\langle z_R\rangle$ = 0.45), × first negative ($\langle z_R\rangle$ = 0.27);
(c) first two charged ($\langle z_R\rangle$ = 0.75); (d) + first two positive ($\langle z_R\rangle$ = 0.57), × first two negative
($\langle z_R\rangle$ = 0.34).


Table 11

Number of times the fastest (largest-z) hadron occurred with rank r (r = 1, ..., 5) for 10 000
u-quark jets

Rank r        Unbiased    $z \ge 0.5$    $z \ge 0.75$
1             4267        1345           340
2             2566        564            97
3             1506        197            23
4             795         56             3
5             436         17             0
Mean rank     2.3         1.7            1.4

All the decay products of a particular primary meson are assigned the rank of that meson.


Fig. 18. Same as fig. 17 but for a d-quark jet. (a) + first charged ($\langle z_R\rangle$ = 0.53), × second charged
($\langle z_R\rangle$ = 0.22); (b) + first positive ($\langle z_R\rangle$ = 0.31), × first negative ($\langle z_R\rangle$ = 0.41); (c) first two
charged ($\langle z_R\rangle$ = 0.75); (d) + first two positive ($\langle z_R\rangle$ = 0.39), × first two negative ($\langle z_R\rangle$ = 0.52).
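
The difference between rank and order in z can be illustrated with a toy version of the cascade. The sketch below is not the full model: it uses only primary mesons of an infinite-momentum jet, with no flavors, no resonance decays and no cut-off, and it assumes the parabola $f(\eta) = 1 - a + 3a\eta^2$ of table 5. Its rank-one fraction therefore need not reproduce the 43% of table 11. The table 11 percentages themselves are recomputed from the listed counts, taking 10 000 jets as the unbiased denominator and, as an assumption, the sum of the listed ranks for the $z \ge 0.75$ column.

```python
import random

A = 0.77


def sample_eta(rng, a=A):
    """Draw eta from f(eta) = (1 - a) + 3 a eta^2 on [0, 1].

    f is a mixture: weight (1 - a) uniform, weight a with density 3 eta^2.
    """
    u = rng.random()
    if rng.random() < a:
        return u ** (1.0 / 3.0)
    return u


def one_jet(rng, a=A, max_rank=50):
    """Return (rank of the fastest primary meson, z of the rank-one meson)."""
    remaining, best_z, best_rank, first_z = 1.0, 0.0, 0, 0.0
    for rank in range(1, max_rank + 1):
        eta = sample_eta(rng, a)
        z = remaining * (1.0 - eta)   # this meson's share of the jet momentum
        if rank == 1:
            first_z = z
        if z > best_z:
            best_z, best_rank = z, rank
        remaining *= eta              # fraction left to the rest of the cascade
        if remaining <= best_z:       # no later meson can be faster
            break
    return best_rank, first_z


def simulate(n_jets=20_000, seed=1):
    rng = random.Random(seed)
    results = [one_jet(rng) for _ in range(n_jets)]
    frac_rank1 = sum(1 for r, _ in results if r == 1) / n_jets
    mean_first_z = sum(z for _, z in results) / n_jets
    return frac_rank1, mean_first_z


# Table 11 counts for ranks 1..5.
TABLE11 = {"unbiased": [4267, 2566, 1506, 795, 436], "z>=0.75": [340, 97, 23, 3, 0]}

if __name__ == "__main__":
    print(simulate())
    print(TABLE11["unbiased"][0] / 10_000)
    print(TABLE11["z>=0.75"][0] / sum(TABLE11["z>=0.75"]))
```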

An interesting property of a jet, that has not yet been tested experimentally, is 
how the fraction of momentum carried by the fastest hadron in a jet depends 
on the charge of that hadron. In our model, for a u-quark, the first-rank primary 
meson can be positive or neutral, not negative. A negative hadron can arise only 
further down the hierarchy chain (rank greater than one) or from the decay of a
Fig. 19. Comparison of the predicted distributions of z_R for the fastest, second-fastest and
fastest two charged hadrons in a u-quark jet (points) with data from νp → μ⁻ + jet + X (ref.
[15], solid curves), where z_R is the relative fraction of charged momentum as in fig. 17. (Note
that in this figure, the theory is shown as the points with errors and the data are represented by the
solid curves.) a = 0.77, α_ps = α_v = 0.5, P_q = 10 GeV. × first charged, ○ second charged, + first
two charged, solid line: neutrino data.


rank-one neutral resonance. Because of this, the fastest positive hadron in a u-quark 
jet carries, on the average, a larger portion of the total momentum (45%) than does 
the fastest negative hadron (27%). Results of this type are tabulated in table 10 and 
illustrated in figs. 17 and 18. The fastest two positives in a u-quark jet carry 57% of 
the charged momentum, whereas only 34% is carried by the fastest two negatives. 
For a d-quark jet, the situation is reversed (see fig. 18); the fastest two negatives 
carry 52% of the charged momentum compared to 39% for the fastest two positives. 
These predictions should be easy to check by comparing the jets produced in
νp → μ⁻ + jet, which are predominantly u-quark jets, to those produced in
ν̄p → μ⁺ + jet, which are predominantly d-quark jets.

A similar interesting and easily measured observable is the flavor of the fastest 
(largest-z) hadron. As table 12 shows, for a u-quark jet the fastest hadron is posi- 
tive 42% of the time and negative only 20% of the time; the rest of the time, it is 
neutral. For a d-quark, on the other hand, the fastest hadron is positive 24% of the 
time and negative 34% of the time. The average charge of the fastest hadron in a u- 
and d-quark jet is about 0.23 and —0.10, respectively. The fastest hadron does 
“remember” to some extent the flavor of the quark that initiated the jet. Table 12 
also shows the relatively large portion of strange particles predicted in our jets. For 
Table 12

Number of times the various hadrons occurred as the fastest (largest-z) particle for 10 000
quark jets with energy P_q = 10 GeV. Also shown are the average charge Q, third component
of isospin I_3, and strangeness S of the fastest hadron

Fastest hadron        u         d         s
π⁺                  3194      1645      1447
π⁰                  2411      2462      1536
π⁻                  1511      2888      1356
K⁺                  1055       707       528
K⁰                   600      1062       517
K⁻                   466       473      2165
K̄⁰                   458       458      2174
γ                    305       305       277
h⁺                  4249      2352      1975
h⁻                  1977      3361      3521
h⁰                  3774      4287      4504
h⁺/h⁻               2.15      0.70      0.56
ave Q              0.227    −0.101    −0.155
ave I_3            0.191    −0.143     0.010
ave S              0.073     0.084    −0.329


a u-quark jet, the fastest hadron is a strange particle (K⁺, K⁰, K⁻, K̄⁰) about 26%
of the time.
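
As a cross-check, the averages at the bottom of table 12 follow from the counts and the standard quantum numbers of the hadrons. The sketch below is illustrative only. It takes the d-quark K⁺ entry to be 707, the value implied by the h⁺ row (2352 − 1645); this is our reading of the table. The dictionary and function names are ours.

```python
# (charge Q, third component of isospin I3, strangeness S) for each hadron.
# These are the standard assignments; the photon carries none of them.
QUANTUM_NUMBERS = {
    "pi+": (1, 1.0, 0), "pi0": (0, 0.0, 0), "pi-": (-1, -1.0, 0),
    "K+": (1, 0.5, 1), "K0": (0, -0.5, 1),
    "K-": (-1, -0.5, -1), "K0bar": (0, 0.5, -1),
    "gamma": (0, 0.0, 0),
}

# Fastest-hadron counts per 10000 jets, read from table 12.
# Assumption: the d-quark K+ entry is 707 (implied by the h+ row).
COUNTS = {
    "u": {"pi+": 3194, "pi0": 2411, "pi-": 1511, "K+": 1055,
          "K0": 600, "K-": 466, "K0bar": 458, "gamma": 305},
    "d": {"pi+": 1645, "pi0": 2462, "pi-": 2888, "K+": 707,
          "K0": 1062, "K-": 473, "K0bar": 458, "gamma": 305},
    "s": {"pi+": 1447, "pi0": 1536, "pi-": 1356, "K+": 528,
          "K0": 517, "K-": 2165, "K0bar": 2174, "gamma": 277},
}


def averages(counts):
    """Return (ave Q, ave I3, ave S) of the fastest hadron."""
    total = sum(counts.values())
    return tuple(
        sum(n * QUANTUM_NUMBERS[name][k] for name, n in counts.items()) / total
        for k in range(3)
    )


def strange_fraction(counts):
    """Fraction of jets whose fastest hadron is a kaon."""
    kaons = ("K+", "K0", "K-", "K0bar")
    return sum(counts[name] for name in kaons) / sum(counts.values())


for flavor, counts in COUNTS.items():
    q, i3, s = averages(counts)
    print(f"{flavor}: Q={q:.3f} I3={i3:.3f} S={s:.3f}")
```

With these inputs the three averages agree with the last rows of the table for all three quark flavors, and the kaon fraction for a u-quark jet is about 26%.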


3.5. Correlations between the fastest two charged hadrons


Although there is a considerable difference between rank in hierarchy and order
in rapidity (or z), the hadrons in our jets do not completely “forget” the underlying
hierarchy structure shown in fig. 1. The charge correlations between the fastest two
charged hadrons shown in tables 13a and 13b, where h_1 is the fastest charged hadron,
h_2 is the second fastest charged hadron and neutrals are ignored, are a
result of the underlying hierarchy structure and resonance decay effects. The combination
h_1 = +, h_2 = − occurs more often than h_1 = +, h_2 = +, and h_1 = −, h_2 = +
occurs more often than h_1 = −, h_2 = −. In fact, for a u-quark jet, the unlike-sign
configuration occurs almost twice as frequently as the like-sign configuration. It is
very unlikely for two negatives (positives) to occur as the fastest two charged particles
in a u-quark (d-quark) jet (−− = 7% for a u-quark and ++ = 10% for a d-quark).
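
The statement about unlike-sign and like-sign configurations can be checked directly against the u-quark column of table 13a. The short Python sketch below is illustrative; the variable names are ours.

```python
# u-quark column of table 13a: counts for (sign of h_1, sign of h_2)
# per 10000 jets, both hadrons charged.
TABLE_13A_U = {
    ("+", "+"): 2594,
    ("+", "-"): 3608,
    ("-", "+"): 2550,
    ("-", "-"): 731,
}
TOTAL_JETS = 10000

unlike = TABLE_13A_U[("+", "-")] + TABLE_13A_U[("-", "+")]
like = TABLE_13A_U[("+", "+")] + TABLE_13A_U[("-", "-")]

# Ratio of unlike-sign to like-sign configurations.
ratio = unlike / like
# Fraction of u-quark jets whose two fastest charged hadrons are both negative.
both_negative = TABLE_13A_U[("-", "-")] / TOTAL_JETS

print(f"unlike/like = {ratio:.2f}, both negative = {both_negative:.0%}")
```

The ratio comes out near 1.85, which is the "almost twice" of the text, and the two-negative fraction is about 7%.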


3.6. Determining the flavor of the quark 


As we have seen, there are differences in the distribution of charge, total charge, 
fastest hadron, and correlations between the two fastest charged hadrons depending 
Table 13a

Number of times various charge combinations occurred for the fastest (largest-z) two charged
hadrons h_1, h_2 (h_1 = fastest charged hadron, h_2 = second fastest charged hadron) that
fragmented from 10000 quark jets with P_q = 10 GeV

h_1     h_2              u         d         s
+       +             2594       965       845
+       −             3608      3024      2743
+       none           420       102        81
+       arbitrary     6622      4091      3669
−       +             2550      3515      3648
−       −              731      1967      2193
−       none            31       287       341
−       arbitrary     3312      5769      6182
none    none            66       140       149

Same as above but z_h1 > 0.5

h_1     h_2              u         d         s
+       +              395        61        48
+       −              588       368       233
+       none           151        11         4
+       arbitrary     1134       440       285
−       +              264       520       474
−       −               36       226       264
−       none             4        99        84
−       arbitrary      304       845       822
none    none          8562      8715      8893


on whether the jet is initiated by a u-, d- or s-quark. The differences in the hadron 
charge distributions can only be seen if one has a large statistical sample of jets of 
one particular quark type. It is interesting to see if a criterion exists that allows one 
to determine the flavor of the quark jet on an event-by-event basis. Clearly, any cri- 
terion for finite-momentum jets will not work perfectly, so in order to compare the 
usefulness of various criteria, we define a reliability by 


    \text{reliability} = \frac{N_T - N_F}{N_T + N_F},    (3.7)


where N_T refers to the number of times the criterion correctly identifies the flavor
of the jet from a sample containing an equal mixture of u- and d-quark jets and N_F
is the number of times it is incorrect. If the criterion succeeds as often as it fails,
the reliability is zero. The criterion may only apply to a sub-sample of the jets, so
we define an efficiency by


    \text{efficiency} = \frac{N_T + N_F}{\text{total number of jets}}.    (3.8)
Table 13b

Same as table 13a except that only hadrons with z > 0.1 are observed

h_1     h_2              u         d         s
+       +             1421       560       432
+       −             2006      1833      1610
+       none          2624      1205      1090
+       arbitrary     6051      3598      3132
−       +             1618      1891      1978
−       −              442      1060      1170
−       none           936      2120      2272
−       arbitrary     2996      5071      5420
none    none           953      1331      1448

Same as above but z_h1 > 0.5

h_1     h_2              u         d         s
+       +              158        33        12
+       −              253       225       140
+       none           723       182       133
+       arbitrary     1134       440       285
−       +              174       217       220
−       −               14        98       111
−       none           116       530       491
−       arbitrary      304       845       822
none    none          8562      8715      8893


For example, if we say that all jets where the fastest charged hadron is positive are
u-quark jets and all jets where it is negative are d-quark jets, then for P_q = 10 GeV
the reliability is about 0.25 and the efficiency is 99% (only 1% of the jets contain
no charged particles). One can increase the reliability by lowering the efficiency. Using
the same criterion but applying it only to those jets that have one charged particle
with z > 0.5 yields a reliability of 0.45 but an efficiency of only 14% *. In table 15,
we list other criteria together with the reliability and efficiency for an equal mixture
of u- and d-quark jets with P_q = 10 GeV.
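
The two examples above can be reproduced from the "arbitrary" rows of table 13a. The Python sketch below is an illustration of definitions (3.7) and (3.8), not part of the original analysis; the function and variable names are ours, and the equal mixture is taken to be 10000 u-quark jets plus 10000 d-quark jets.

```python
def reliability(n_true, n_false):
    """Eq. (3.7): (N_T - N_F) / (N_T + N_F)."""
    return (n_true - n_false) / (n_true + n_false)


def efficiency(n_true, n_false, total_jets):
    """Eq. (3.8): fraction of all jets to which the criterion applies."""
    return (n_true + n_false) / total_jets


TOTAL_JETS = 20000  # 10000 u-quark jets plus 10000 d-quark jets

# Each entry: (u-jets with h_1 positive, d-jets with h_1 negative,
#              u-jets with h_1 negative, d-jets with h_1 positive),
# read from the "arbitrary" rows of table 13a.
CRITERIA = {
    "sign of fastest charged hadron": (6622, 5769, 3312, 4091),
    "same, requiring z > 0.5": (1134, 845, 304, 440),
}

results = {}
for name, (u_ok, d_ok, u_bad, d_bad) in CRITERIA.items():
    n_true, n_false = u_ok + d_ok, u_bad + d_bad
    results[name] = (
        reliability(n_true, n_false),
        efficiency(n_true, n_false, TOTAL_JETS),
    )
    print(f"{name}: reliability {results[name][0]:.2f}, "
          f"efficiency {results[name][1]:.0%}")
```

The first criterion gives a reliability of about 0.25 at 99% efficiency, and the second about 0.45 at 14%, in agreement with the values quoted in the text.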


* One disadvantage of a criterion which can be applied only to a small fraction of jets (low
efficiency) is, of course, a lower counting rate. But a much more serious difficulty is the bias
introduced by not knowing experimentally the difference between a u-jet and a d-jet in the
fraction of times the criterion can be applied. We could, of course, calculate these efficiency
differences for our “standard” jet, but we would not know whether they were right
experimentally. The uncertainties are least if the efficiency is high.
Table 14

Number of times various values of the weighted quark charge Q_w(p) occurred for 10 000 quark
jets with energy P_q = 10 GeV, where Q_w(p) = Σ_{i=1}^{N} z_i^p q_i, with q_i and z_i the charge
and z value of the ith hadron in a quark jet with N charged hadrons. Results are given for
p = 0.2 and 0.5, and σ_rms is the root-mean-square deviation from the mean.

Weighted charge               u         d         s
p = 0.2    Q_w > 0         7197      3499      3322
           Q_w < 0         2737      6361      6529
           ⟨Q_w⟩           0.39     −0.25     −0.27
           σ_rms           0.68      0.66      0.67
p = 0.5    Q_w > 0         7172      3601      3320
           Q_w < 0         2762      6259      6531
           ⟨Q_w⟩           0.26     −0.15     −0.18
           σ_rms           0.43      0.42      0.41

Same as above but observe only hadrons with z > 0.1

p = 0.2    Q_w > 0         6172      3482      3045
           Q_w < 0         2875      5187      5507
           ⟨Q_w⟩           0.34     −0.19     −0.26
           σ_rms           0.76      0.77      0.75
p = 0.5    Q_w > 0         6168      3486      3039
           Q_w < 0         2879      5183      5513
           ⟨Q_w⟩           0.24     −0.14     −0.18
           σ_rms           0.52      0.53      0.51


From table 15, one sees that none of the first fourteen criteria listed gives a very 
high reliability while still maintaining a reasonable efficiency. This is because the rank 
in hierarchy and the order in momentum are considerably mixed up as discussed 
earlier and witnessed by table 11. The highest reliability with a large efficiency is 
obtained by assigning jets with total charge Q > 0 as u-quark jets and jets with Q < 0
as d-quark jets. The reliability is 0.45 with a 66% efficiency. Unfortunately, the total 
charge on a quark jet is not easily measured since it depends on including all the for- 
ward moving low-momentum hadrons which are difficult to measure without also 
including background from other sources. A quantity easy to measure experimentally 
is the total charge of all those hadrons in the jet which have z > 0.1. The reliability, 
however, now decreases to 0.38. If one wants to apply this criterion to all the events 
with Q = 0 jets assigned as d type, then the reliability decreases even further to 0.26 
(number 8 in table 15). 

The mean total charge of a large momentum u-jet is 0.60 and a d-jet is —0.40 
differing by 1.0. This would seem to permit a clear separation of these jets. However, 
even with a jet of extreme momentum, the total charge Q of all hadrons of p_z > 0
Table 15

Various criteria for determining the flavor of the quark initiating the jet for an equal mixture
of P_q = 10 GeV u- and d-quark jets, together with the reliability = (true − false)/(true + false),
where “true” refers to the number of times the criterion succeeded in selecting the correct
flavor and “false” to the number of times the criterion failed. The efficiency is the number of
jets (true + false) to which the criterion can be applied divided by the total number of jets
generated.

     Criterion                                                   Reliability   Efficiency
 1.  fastest hadron positive → u-jet                                 0.27          60%
     fastest hadron negative → d-jet
 2.  fastest charged hadron positive → u-jet                         0.25          99%
     fastest charged hadron negative → d-jet
 3.  charged hadron with z > 0.5 is positive → u-jet                 0.45          14%
     charged hadron with z > 0.5 is negative → d-jet
 4.  fastest two charged hadrons are both positive → u-jet           0.46          31%
     fastest two charged hadrons are both negative → d-jet
 5.  total charge of jet Q > 0 → u-jet                               0.45          66%
     total charge of jet Q < 0 → d-jet
 6.  same as 5 but only observe hadrons with z > 0.1                 0.38          62%
 7.  total charge of jet Q > 0 → u-jet                               0.33         100%
     total charge of jet Q ≤ 0 → d-jet
 8.  same as 7 but only observe those hadrons with z > 0.1           0.26         100%
 9.  for N_ch = 1 jets Q > 0 → u-jet                                 0.68           4%
     for N_ch = 1 jets Q < 0 → d-jet
10.  for N_ch = 3 jets Q > 0 → u-jet                                 0.47          13%
     for N_ch = 3 jets Q < 0 → d-jet
11.  for N_ch = odd jets Q > 0 → u-jet                               0.40          50%
     for N_ch = odd jets Q < 0 → d-jet
12.  same as 9 but only observe hadrons with z > 0.1                 0.38          34%
13.  same as 10 but only observe hadrons with z > 0.1                0.31          15%
14.  same as 11 but only observe hadrons with z > 0.1                0.36          50%
15.  “weighted” charge Q_w(p = 0.2) > 0 → u-jet                      0.37          99%
     “weighted” charge Q_w(p = 0.2) < 0 → d-jet
     (see eq. (3.9))
16.  “weighted” charge Q_w(p = 0.5) > 0 → u-jet                      0.36          99%
     “weighted” charge Q_w(p = 0.5) < 0 → d-jet
17.  same as 15 but only observe hadrons with z > 0.1                0.28          89%
18.  same as 16 but only observe hadrons with z > 0.1                0.28          89%
19.  “weighted” charge Q_w(p = 0.2) > 0.4 → u-jet                    0.51          63%
     “weighted” charge Q_w(p = 0.2) < −0.3 → d-jet
20.  “weighted” charge Q_w(p = 0.5) > 0.4 → u-jet                    0.58          46%
     “weighted” charge Q_w(p = 0.5) < −0.3 → d-jet
21.  same as 19 but only observe hadrons with z > 0.1                0.38          62%
22.  same as 20 but only observe hadrons with z > 0.1                0.40          55%
must be an integer and thus a random variable. There is an unavoidable noise depending
on whether a particular charged particle in the plateau happens to have p_z greater
or less than zero. Even though the plateau is neutral and all the difference of u- and
d-quark jets lies far away at higher z, one is trying to sum a long series like +1 −1 +1 +1 −1
+1 −1 −1 ... not knowing where to stop, but knowing only that +1 and −1 become
more and more equally likely to occur as we go further down the series (to lower z).
The proper thing to do is, of course, the analogue of Abel summation: weight the
terms with a gradually decreasing weight as we go down the series. If the weight falls
gradually enough from unity at the beginning, the excess charge there will be accurately
picked up. However, the random ±1 far down where the weight has fallen
toward zero will produce no fluctuations. That is, if particle i has “z rapidity” Y_{z_i}
and charge q_i, we form the “weighted” charge


    Q_w(p) = \sum_i q_i \exp(-p Y_{z_i}) = \sum_i z_i^p q_i,    (3.9)


where p is a small number. This quantity will have a mean (close to ⟨Q⟩ as p → 0)
distinct for u- and d-quark jets. Furthermore, the “noise” or fluctuations expected
from having to stop the sum below some finite z_min is ±z_min^p, which can be made
small as long as z_min can be made small enough.

For a given experimental circumstance, however, z_min is fixed and the criteria
that p be small and that z_min^p also be small are opposed. For sufficiently small z_min
there is no problem, but because of the wide fluctuations in rapidity that the particles
in our model suffer, we have found that in practice the method does not work
as well as we hoped. For z_min = 0.1, with p = 0.5, for example, the fluctuating
uncertainty z_min^p is about 0.3 of that for the gross sum Q = Σ q_i; but such a large p means
that Q_w(p) does not average as large as ⟨Q⟩. Even worse is that for such a large p the
contributions of high-z particles depend so strongly on the precise z value they
actually have.
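
The weighted charge of eq. (3.9) and the size of the truncation noise are easy to evaluate numerically. The Python sketch below is only an illustration: the toy jet is made up, the function names are ours, and the z rapidity is taken to be Y_z = −ln z, which is what the equality of the two forms in eq. (3.9) implies.

```python
import math


def weighted_charge(z_values, charges, p):
    """Q_w(p) = sum_i z_i**p * q_i, the second form of eq. (3.9)."""
    return sum(q * z ** p for z, q in zip(z_values, charges))


def weighted_charge_from_rapidity(z_values, charges, p):
    """Same quantity via exp(-p * Y_z), assuming Y_z = -ln z."""
    return sum(q * math.exp(-p * (-math.log(z)))
               for z, q in zip(z_values, charges))


def truncation_noise(z_min, p):
    """Size z_min**p of the fluctuation from stopping the sum at z_min."""
    return z_min ** p


# A made-up jet: z values and charges of five charged hadrons.
z = [0.45, 0.20, 0.12, 0.05, 0.02]
q = [+1, -1, +1, -1, +1]

for p in (0.0, 0.2, 0.5):
    print(f"p = {p}: Q_w = {weighted_charge(z, q, p):.3f}, "
          f"noise for z_min = 0.1: {truncation_noise(0.1, p):.2f}")
```

For p = 0 the weighted charge reduces to the total charge, and for z_min = 0.1 with p = 0.5 the noise estimate is about 0.3, as quoted above. For p = 0.2 it is still about 0.63, which illustrates why small p and small z_min^p are opposed at fixed z_min.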

Figs. 20 and 21 show the distribution of Q_w(p) with p = 0.2 and 0.5, respectively,
for a u- and d-quark jet of energy P_q = 10 GeV (including all hadrons with
p_z > 0). The p = 0.2 distributions are considerably broader than the p = 0.5 case;
however, the former has mean values ⟨Q_w⟩ that are more widely separated
(⟨Q_w⟩_u − ⟨Q_w⟩_d = 0.64 for p = 0.2 and only 0.41 for p = 0.5). In both cases, there
is a clear separation of the u- and d-jets. By the use of table 14, we find a reliability
of 0.37 if we assign jets with Q_w > 0 as u-quark type and those jets with Q_w < 0 as
d-quark type with p = 0.2. The efficiency of this criterion is excellent (99%, since we
include only those jets with at least one charged hadron). One can obtain a higher
reliability (but lower efficiency) by excluding from consideration those jets with Q_w
values occurring in the overlap region of the u- and d-quark jet distributions. For
example, table 15 shows that if we assign jets with Q_w > 0.4 as u-type and those
with Q_w < −0.3 as d-type, then for p = 0.5 we get a 58% reliability with 46% efficiency.
This “weighted” charge technique gives us better reliability factors than the
Fig. 20. Predicted distribution of weighted quark charge Q_w(p) for a u- (+) and d-quark (×) jet
with P_q = 10 GeV, where Q_w(p) = Σ_i z_i^p q_i with z_i and q_i being the z value and charge of the
ith hadron and where the sum is over all hadrons in the jet with p_z > 0. The value of the power
p was taken to be 0.2. a = 0.77, α_ps = α_v = 0.5. d-quark, ⟨Q_w⟩ = −0.25; u-quark, ⟨Q_w⟩ = 0.39.


other criteria in table 15. Nevertheless, even a reliability of 58% (about 4 out of 5
correct) is not too impressive.
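
The parenthetical "about 4 out of 5 correct" follows from the definition of reliability: if r = (N_T − N_F)/(N_T + N_F), the fraction of correct assignments among the jets to which the criterion applies is N_T/(N_T + N_F) = (1 + r)/2. A minimal Python sketch (the function name is ours):

```python
def fraction_correct(reliability):
    """Fraction of correct assignments, N_T / (N_T + N_F) = (1 + r) / 2."""
    return (1.0 + reliability) / 2.0


# Reliability 0.58 (criterion 20 of table 15).
print(f"{fraction_correct(0.58):.2f}")
```

A reliability of 0.58 therefore corresponds to 79% correct assignments, and a reliability of zero to a coin toss.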

In practice, for hadron-induced jets, it might be possible to go to lower z_min,
perhaps even to p_z somewhat negative, allowing a smaller p value. The many “background”
particles from the longitudinal jets may perhaps not contribute too much
noise. But data from lepton jets should ultimately determine the best parameter p,
z_min and criterion to use.

In view of the small reliability factors in table 15, we must conclude that our
efforts to find a method of determining the quark flavor of a jet on an event-by-event
basis have failed for jets of energy P_q ≲ 10 GeV. The criteria can be
made more reliable only with much higher momentum jets. (The method of
“weighted” charge becomes an exact separation at very high momenta.) Nevertheless,
table 15 is useful in summarizing the differences between u- and d-quark jets.
Fig. 21. Same as fig. 20 but where the power p is taken to be 0.5. d-quark, ⟨Q_w⟩ = −0.15;
u-quark, ⟨Q_w⟩ = 0.26.


4. Properties of the quark rapidity plateau 
4.1. Rapidity correlations 


4.1.1. Correlations between adjacent-rank mesons 

There are two sources of correlations in our model. Naturally, there is the correlation
among secondary particles that are the decay products of the same primary
meson. In addition, however, the primary mesons are not formed at random in
rapidity. Primary mesons adjacent in rank are correlated in both flavor and rapidity
since they each contain a quark (or antiquark) that came from the same qq̄ pair.
The two primary mesons of adjacent rank tend to occur near each other in rapidity,
Y_z, as shown in fig. 22. The mean |ΔY_z| between mesons adjacent in rank is about
1.8 units, where all the decay products of a particular primary meson are assigned
the rank of that meson (see fig. 1). Fig. 22 also shows the distribution of |ΔY_z|
between mesons with the same rank (⟨|ΔY_z|⟩ = 0.9). All flavor correlations in the
quark jets occur between primary mesons of adjacent rank. The flavor of a meson
Fig. 22. The distribution of distances ΔY_z between the hadrons coming from one primary and
those coming from another primary next in rank: ×, ⟨|ΔY_z|⟩ = 1.8. The distribution of ΔY_z
among the secondary hadrons which come from the decay of a single primary: ○, ⟨|ΔY_z|⟩ = 0.9.
a = 0.77, α_ps = α_v = 0.5.


of rank r + 2 is independent of the flavor of the meson of rank r. Rank is not a 
physical observable; however, the correlations shown in fig. 22 do produce effects 
in certain physical observables which we now proceed to examine. 


4.1.2. Rapidity-gap distributions [16]

Figs. 23 and 24 show the distribution of Y_z gaps between all particles, charged
particles, and positive particles (equals negative particles) before and after decay in
the rapidity plateau *. Because two primary mesons of adjacent rank have net
charge either ±1 or zero, small ΔY_z gaps occur more frequently between charged


* In order to avoid biasing against large Y_z gaps, the distributions in figs. 23, 24 and 25 are
calculated by first ordering all particles in Y_z with Y_{z,i+1} ≥ Y_{z,i} and then considering all Y_{z,i} in
the range 3.0 < Y_{z,i} < 6.0, but when forming the gap length ΔY_{z,i} = (Y_{z,i+1} − Y_{z,i}), Y_{z,i+1} is
allowed the range 3.0 < Y_{z,i+1} < 9.0. Then only gaps ΔY_{z,i} < 3.0 are displayed. This procedure
requires a rapidity interval twice as large as the maximum gap one decides to plot.
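
The selection described in this footnote can be written out as a short routine. The Python sketch below reflects our reading of the footnote (gap start in 3.0 < Y_z < 6.0, gap end allowed up to 9.0, only gaps shorter than 3.0 kept); the function name and the sample rapidities are hypothetical.

```python
def select_gaps(rapidities, start_range=(3.0, 6.0), end_range=(3.0, 9.0),
                max_gap=3.0):
    """Return the rapidity gaps between neighbouring particles.

    A gap is kept if its lower edge lies in `start_range`, its upper edge
    lies in `end_range`, and its length is below `max_gap`. Allowing the
    upper edge to extend one full `max_gap` beyond the start window avoids
    losing long gaps that begin near the end of that window.
    """
    ordered = sorted(rapidities)
    gaps = []
    for lower, upper in zip(ordered, ordered[1:]):
        if not (start_range[0] < lower < start_range[1]):
            continue
        if not (end_range[0] < upper < end_range[1]):
            continue
        gap = upper - lower
        if gap < max_gap:
            gaps.append(gap)
    return gaps


# Made-up rapidities of the particles in one jet.
sample = [0.5, 2.9, 3.2, 4.0, 5.9, 8.5, 9.5]
print(select_gaps(sample))
```

In this example the gap from 5.9 to 8.5 is retained even though its upper edge lies outside the 3.0 to 6.0 window, which is the point of the wider end range.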
Fig. 23. Predicted number of times various Y_z gaps occur between all primary mesons, between
charged primaries and between positive primaries (the same as for negative primaries) in the
rapidity plateau of a u-quark jet before the primary mesons are allowed to decay. a = 0.77,
α_ps = α_v = 0.5. ○ charged, + positives, × negatives; the remaining symbol marks all primaries.


mesons (due to +−) than between negative mesons. For large Y_z gaps, the situation
is reversed and there are more large Y_z gaps between the negatives than between the
charged mesons. (Note that a gap between negatives can contain positives or neutrals
whereas a gap between charged particles can only contain neutrals.)

The distribution of Y_z gaps carrying a specific charge is shown in fig. 25, where


    Q_k = \sum_{i=1}^{k} q_i - q_0    (4.1)


is the charge carried by the kth Y_z gap and q_i is the charge of the ith meson (ordered
in Y_z). The quantity q_0 is the charge of the quark that initiated the jet (u-quark in
fig. 25). If the mesons were ordered in hierarchy, only the net charge, Σ_{i=1}^{k} q_i, equal
to one or zero could be obtained for a u-quark jet; Q_k must be the charge of an
antiquark. For ordering in Y_z, other values of Σ_{i=1}^{k} q_i can occur, but only by mixing up
hierarchy order and Y_z order. Because of this, for a u-jet, gaps of charge −2/3 and 1/3
Fig. 24. The same as fig. 23 but after the vector mesons (and the η and η′) have decayed
according to the particle tables.


dominate at large gaps over the other gap charges *. (The distributions of Q = −2/3 and
Q = 1/3 gaps are equal, as are gaps of charge Q = 4/3 and Q = −5/3, etc.)
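
The gap charge of eq. (4.1) is a running sum of hadron charges minus the charge of the initiating quark. The following Python sketch illustrates, for a made-up u-quark jet, that hierarchy ordering yields only antiquark charges while a reordering in rapidity can produce other values. The charge sequences and the function name are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate


def gap_charges(charges, q0):
    """Q_k = sum_{i<=k} q_i - q_0 for k = 1..N, eq. (4.1)."""
    return [running - q0 for running in accumulate(charges)]


U_QUARK = Fraction(2, 3)

# Made-up chain in hierarchy order: running sums stay at 0 or 1.
hierarchy_order = [1, 0, -1, 1, 0]
# The same hadrons with the first three reshuffled, as ordering in Y_z may do.
rapidity_order = [-1, 1, 0, 1, 0]

print([str(c) for c in gap_charges(hierarchy_order, U_QUARK)])
print([str(c) for c in gap_charges(rapidity_order, U_QUARK)])
```

In hierarchy order every gap carries −2/3 or 1/3, the charge of an antiquark; in the reshuffled order the first gap carries −5/3, a value that arises only from mixing the two orderings.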


4.1.3. Charged particle correlations 


In analogy with the statistical-mechanical description of density fluctuations in a
liquid, one defines a correlation function (see, for example, ref. [17])


    C(Y_1, Y_2) = \frac{1}{\sigma} \frac{d^2\sigma}{dY_1\, dY_2} - \frac{1}{\sigma^2} \frac{d\sigma}{dY_1} \frac{d\sigma}{dY_2}.    (4.2)


The “correlation moment”, f_2, is related to it by


    f_2 = \int dY_1\, dY_2\, C(Y_1, Y_2) = \langle N(N-1) \rangle - \langle N \rangle^2.    (4.3)
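
The right-hand side of eq. (4.3) can be estimated directly from a sample of multiplicities. The Python sketch below is illustrative; the multiplicity lists are made up and the function name is ours. Since ⟨N(N−1)⟩ − ⟨N⟩² equals the variance of N minus its mean, f_2 vanishes for a Poisson distribution and is negative for a narrower one.

```python
def correlation_moment(multiplicities):
    """f_2 = <N(N-1)> - <N>**2 estimated from per-jet multiplicities."""
    n_jets = len(multiplicities)
    mean_n = sum(multiplicities) / n_jets
    mean_pairs = sum(n * (n - 1) for n in multiplicities) / n_jets
    return mean_pairs - mean_n ** 2


# Made-up samples with the same mean multiplicity of 2.
narrow = [2, 2, 2, 2]   # no fluctuation at all
broad = [0, 4, 0, 4]    # large fluctuations

print(correlation_moment(narrow), correlation_moment(broad))
```

The narrow sample gives f_2 = −2 and the broad one f_2 = +2.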


* In an exchange picture like that discussed by Pirilä, Thomas and Quigg in ref. [16], large
gaps with charge −2/3 and 1/3 correspond to the (allowed) exchange of a u- and d-quark,
respectively, whereas the other gap charges correspond to lower-lying (exotic-type) exchanges. Hence
the former, with a higher intercept (4), is expected to dominate at large gaps as it does. 
Fig. 25. Predicted number of times various charged Y_z gaps occur for a u-quark jet (after decay),
where the charge of the kth gap is given by Q_k = Σ_{i=1}^{k} q_i − q_0, where q_i is the charge of the ith
hadron (ordered in Y_z) and q_0 is the charge of the initiating quark (in this case q_0 = 2/3). a = 0.77,
α_ps = α_v = 0.5. The gap charges shown include Q = −2/3, 1/3, 4/3 and −5/3.


In addition, one can define a “normalized” correlation function by


    R(Y_1, Y_2) = C(Y_1, Y_2) \Big/ \left( \frac{1}{\sigma^2} \frac{d\sigma}{dY_1} \frac{d\sigma}{dY_2} \right).    (4.4)


In the absence of any dynamical correlations, the probability to observe in a single
jet one hadron at rapidity Y_1 and a second hadron at rapidity Y_2, together with
anything else, would be equal to the product of the probabilities to find hadrons at
rapidity Y_1 and Y_2 in different jets, and (4.2) would vanish. Figs. 26 and 27 show
the value of C(Y_z1, Y_z2) and R(Y_z1, Y_z2), respectively, for a negative hadron at
Y_z1 = 4.0 and a positive (upper) and negative (lower) hadron at Y_z2 versus ΔY_z
= Y_z1 − Y_z2. The region −2 < ΔY_z < 2 is the plateau region and 2 < ΔY_z < 4 is
the “end of the quark” region, as can be seen from fig. 8. Also shown are the analytic
results for the case where no particles decay (in this case, the analytic method is
exact). Resonance decay clearly plays an important role in correlations; however, in
Fig. 26. Predicted behavior of the two-particle correlation function C(Y_z1, Y_z2) defined by (4.2),
where Y_z1 = 4.0 and we have plotted versus ΔY_z = Y_z1 − Y_z2. Results are given for h_1 negative
and h_2 positive (upper) and h_1 negative and h_2 negative (lower) and are generated using Monte
Carlo (points). The dashed curves are the results for C(Y_z1, Y_z2) before the primary mesons are
allowed to decay. The dotted curve (to guide the eye) is 0.36 exp(−½(ΔY_z)²). a = 0.77, α_ps = α_v
= 0.5, Y_z1 = (4.0, 4.4).


our model there is some clustering of the primary mesons before decay. The final
correlation function (after decay) C(Y_z1, Y_z2) for oppositely charged hadrons shows
the characteristic short-range correlation behavior [16-19] roughly like
exp(−(ΔY_z)²/4δ²) with a correlation length, 2δ, of about 1.4. The correlation
between two negatives is, on the other hand, quite small. The “correlation moment”
f_2 in (4.3) after decay is given in table 3 and for ++ and −− it is negative.
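
To see what a correlation length of about 1.4 means numerically, the Gaussian form can be evaluated directly. The Python sketch below assumes the shape A exp(−(ΔY_z)²/4δ²) with 2δ = 1.4, and borrows the amplitude 0.36 from the guide-the-eye curve of fig. 26; the function name and default arguments are ours.

```python
import math


def short_range_correlation(delta_y, amplitude=0.36, correlation_length=1.4):
    """Gaussian short-range form A * exp(-dY**2 / (4 * delta**2)).

    `correlation_length` is 2 * delta, following the convention in the text.
    """
    delta = correlation_length / 2.0
    return amplitude * math.exp(-delta_y ** 2 / (4.0 * delta ** 2))


# Coefficient multiplying dY**2 in the exponent for 2*delta = 1.4.
exponent_coefficient = 1.0 / (4.0 * (1.4 / 2.0) ** 2)
print(f"coefficient = {exponent_coefficient:.2f}")

for dy in (0.0, 1.0, 2.0):
    print(f"dY = {dy}: C = {short_range_correlation(dy):.3f}")
```

With 2δ = 1.4 the coefficient in the exponent is about 0.51, so the form is close to exp(−(ΔY_z)²/2), and the correlation has fallen to roughly one seventh of its peak value two units of rapidity away.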


4.1.4. Mass distribution 

The distribution of two-particle mass for u-quark jets of P_q = 10 GeV resulting
from the model is shown in figs. 28 and 29. One can see the ρ⁰ and K*⁺ in the π⁺π⁻
and K⁺π⁰ combinations (our resonances have zero mass width). If our estimate of

58 R.D. Field, R.P. Feynman / A parameterization of the properties of quark jets 


Fig. 27. Same as fig. 26 but for the “normalized” correlation function R(Y_z1, Y_z2) defined in (4.4),
plotted versus ΔY_z = Y_z1 − Y_z2. The dotted curve (to guide the eye) is 0.34 exp(−½(ΔY_z)²).


the resonance contributions is roughly correct, then one would expect to see such 
resonances in the jets produced in lepton- or hadron-initiated processes. 


4.2. Comparison with pp collisions 


Many of the features predicted by our model for the behavior of the quark rapidity
plateau are similar to those observed for the low-p_⊥ rapidity plateau generated
in proton-proton collisions. The rapidity gap distributions, +− charged correlations,
mass distributions, and charged multiplicities are quite similar to those seen in pp
interactions. Similarities between the plateau behavior in lepton and low-p_⊥ hadron
experiments have often been noted [20]. It will be interesting to see experimentally
if they are really very much alike and to study theoretically why this may (or may
not) be so.

In making this comparison, one must be careful of the contributions from diffrac- 
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Fig. 28. Plot of the two-particle mass of (a) m+a~, (b) Ktn®, and (c) a+? from a u-quark jet with 


Pg = 10 GeV from our jet model with a = 0.77, aps = ay = 0.5. (Note that our resonances have zero 
width.) 


tive events in pp collisions which have no counterpart in the quark jets. The compar- 
ison should properly be made only with the inelastic non-diffractive part of pp col- 
lisions (if that can actually be defined and separated out). One place where this dif- 
fractive component plays an important role is in the behavior of f, . In pp collisions, 
f_2 becomes positive as the negative particle multiplicity ⟨N_−⟩ increases (i.e., as the
energy of the collision is increased). On the other hand, our model for quark jets
predicts a negative f_2 for P_q ≲ 500 GeV (see table 3), and recent data on νp interactions,
where the model should apply, indicate that it is indeed negative [21]. For
those pp̄ annihilation events which contain no final baryons, so that diffraction plays no
role, f_2 is also negative.
Fig. 29. Same as fig. 28 but for (a) h⁺h⁻, (b) h⁺h⁺, and (c) h⁻h⁻, where h = π + K.


4.3. Problems with the model 


There is, of course, the obvious problem that the model contains no protons or 
other baryons and is thus incomplete. They might be included in the same framework
if we were to consider that the field which makes the quark-antiquark pairs
could, from time to time, make pairs of diquarks (two quarks whose color is in a 3̄
representation) and anti-diquarks. To implement this idea, however, requires inventing
a few new functions and parameters which we have too little information at
present to guess. Since baryons are probably not produced with great frequency in 
quark jets, neglecting them probably causes no serious modifications of the meson 
distributions. Of course, we cannot predict baryon distributions at all. 

In addition, we have not included quantum effects. We have dealt solely with 
probabilities and not amplitudes. It would be interesting to imagine the recursive 
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principles to apply to amplitudes instead of probabilities. Instead of (2.3), one 
would write 


    \sqrt{\gamma_a}\, \sqrt{\gamma_b}\, \sqrt{\gamma_c} \cdots \prod_i \phi(\eta_i)    (4.5)


as the amplitude to find primary mesons with a set of flavors and momenta, where
φ(η) is a (possibly complex) amplitude whose absolute square is f(η). By adding
amplitudes in the appropriate manner, allowance would then be made for the Bose
nature of indistinguishable primary mesons.

The most serious problem with the model is that it suffers from a defect in principle.
As Casher, Kogut and Susskind have pointed out [10], the correct way to
look at the development of these jets is from the center out, not from each end in
toward the center. That is, as the original quarks’ rapidities separate in, say, an e⁺e⁻
experiment, the first new quark pairs are made at relatively small momenta, then
the quarks leaving the new pair generate a new pair between them, etc. We have not
been able to develop a simple ansatz for both flavor and momentum distributions in
accord with this idea. Instead, our ansatz thinks of the first pair forming near one
end of the momentum chain, and then further ones follow generally down the momentum
scale. This mechanism has been well refuted by Bjorken and Kogut and
Susskind. Our first surprise from the “chain-decay” point of view is that the function,
zf(1 − z), in (2.21) giving the momentum distribution of the first primary
meson must be so peaked toward low momentum, z, to agree with experiment. It
is not clear how the leading quark manages to find itself so far down in momentum.

We can also see that something is wrong in principle with the chain-decay ansatz
by considering the nature of the plateau region for an energetic qq̄ pair produced in
an e⁺e⁻ colliding beam experiment. We have argued that this plateau region could
be analyzed by going far down in rapidity in the jet from the quark q, even so far
down that the hadrons are, in fact, moving slightly backwards (i.e., in the direction
of q̄). If this is true, then all properties of this region would have to be the same if
analyzed (in reverse order) as a property of the q̄-jet. Measure rapidities in the c.m.s.
so the quark q has rapidity Y_0 and the antiquark −Y_0. We first ask in the q-system,
what is the probability that three primary mesons adjacent in hierarchy are located
at rapidities Y_1, Y_2, and Y_3 (per dY_1, dY_2, dY_3). This function K(Y_0 − Y_1,
Y_0 − Y_2, Y_0 − Y_3) deep in the plateau region depends only on rapidity differences,
and not on Y_0 (i.e., it equals a function L(Y_2 − Y_1, Y_3 − Y_2)). Now seen in the
other direction as a q̄ jet, the three primary mesons are in rank 3, 2, 1 and the rapidity
distances from the end of q̄ are Y_0 + Y_3, Y_0 + Y_2, and Y_0 + Y_1. Calculating in
this way, we have K(Y_0 + Y_3, Y_0 + Y_2, Y_0 + Y_1) and, therefore, L(Y_3 − Y_2,
Y_2 − Y_1). Choosing Y_2 = 0, Y_1 = ln ξ, and Y_3 = −ln η so that the three particles
are in the E + p_z ratio ξ, 1, 1/η (or seen in the reverse order E − p_z ratio η, 1, 1/ξ),
we should have


    L(-\ln \xi, -\ln \eta) = L(-\ln \eta, -\ln \xi),    (4.6)

if the theory were truly symmetrical.
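
Condition (4.6) is simply the statement that L is unchanged when its two arguments are interchanged, and this is easy to test numerically for any candidate function. The Python sketch below uses two made-up toy functions, neither of which is the L of our model; it only illustrates how such a test could be set up. All names are hypothetical.

```python
import math


def asymmetry(L, xi, eta):
    """L(-ln xi, -ln eta) - L(-ln eta, -ln xi); zero when eq. (4.6) holds."""
    a, b = -math.log(xi), -math.log(eta)
    return L(a, b) - L(b, a)


def symmetric_toy(a, b):
    """Toy function that treats both rapidity differences alike."""
    return math.exp(-(a * a + b * b))


def lopsided_toy(a, b):
    """Toy function that weights the second difference more heavily."""
    return math.exp(-(a * a + 2.0 * b * b))


print(asymmetry(symmetric_toy, 2.0, 3.0))
print(asymmetry(lopsided_toy, 2.0, 3.0))
```

The first toy function satisfies (4.6) exactly; the second does not, and the size of the difference is the kind of quantity plotted in fig. 30.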
Fig. 30. Illustration of the asymmetry in our model of the function R(ξ, η) describing correlations
among three primaries adjacent in rank in the plateau (discussed in subsect. 4.3). The curves are
R(1, ξ) (solid) and R(ξ, 1) (dashed) versus ln ξ. They were calculated using f(η) = 3η². Theoretically,
R(ξ, η) should be a symmetric function.


Direct calculation of the function L(−ln ξ, −ln η), which we write as ξηR(ξ, η)
so R is the number of particles per dξ dη, shows it is not symmetrical. The difference
between R(ξ, η) and R(η, ξ) is not great, yet that there is any difference at all
shows our physical view is not entirely sound. The difference is illustrated in fig. 30,
where we choose η = 1 and compare R(ξ, 1) and R(1, ξ) calculated for the simple
case f(η) = 3η². The differences for other values of η are similar or not as large.
Since rank in hierarchy is not physically observable, neither is R(ξ, η). However,
any amount of asymmetry of R(ξ, η) will result in an observable (in principle)
asymmetry in the rapidity plateau for three hadrons whose flavors are such that
they could occur adjacent in rank (e.g., K⁺, K⁻, π⁺). These asymmetries are quite
small. Nonetheless, they represent a defect of the model.

We have found the “chain-decay” ansatz so simple to formulate and to analyze
arithmetically and analytically that we wanted to study it as a possible approxima- 
tion by which we might get a rough idea of the general behavior of quark jets. It is 
important to have such a scheme to make some suggestions for the planning and 
design of “jet” experiments. It is delightful that so few assumptions will yield a jet 
model which is so complete in being able to describe a jet. (Only the transverse 
momentum correlation questions are unanswered.) 
5. Transverse-momentum results

5.1. Transverse-momentum distributions versus z, Y, and M2 


The mean distribution of transverse momentum depends in no way on our 
assumed correlations between primary mesons. None of the details of our recursive 
scheme are of much concern; just the distribution in z, F(z), of primary mesons and 
the assumption that all the primary mesons are distributed in transverse momentum 
in the same way (which we take to be a Gaussian as eq. (2.46) independent of z, 
flavor or spin). In our previous work (FFF), we assumed that all the final mesons 
(direct plus indirect) were distributed the same, independent of z. Our newer view, 
that it is the primary mesons which have the universal distribution in transverse mo- 
mentum, produces a number of interesting effects which we will now discuss. 

The transverse momentum of the final mesons has two components: the original
momentum p_⊥ of the primary mesons plus the additional Q_⊥ received by the prod-
ucts of those primary mesons that decay. Assuming that half of the primary mesons
are vectors that decay and letting the η and η′ also decay, we can easily calculate the
mean transverse momentum expected for various particles in the jet. As discussed in
subsect. 2.7.2, we find that using σ = 350 MeV in (2.48) produces a mean transverse
momentum of charged pions of ⟨k_⊥⟩_π = 323 MeV. The mean transverse momentum
of the primaries (2.48) is then 439 MeV, considerably larger than that of the secon-
daries *.

The reason for this can be seen in the coordinate system boosted so that the z
component of the momentum of the primary meson is zero. It does have a p_⊥, however.
If it is of mass M and disintegrates into two mesons each of mass ½M (so that the Q
value is zero), then these decay products have only ½p_⊥. Thus disintegrations
considerably reduce the effect of p_⊥. The mean value of Q_⊥ generated by a spherical
distribution of maximum momentum Q if p_⊥ = 0 is (π/4)Q, but the values of Q from the
particle tables are not so large as 400 MeV on the average. Of course, there are mutual
effects when both p_⊥ and Q are non-zero but the qualitative effect is clear; the large
⟨p_⊥⟩ of the primaries is decreased by observing the mean ⟨k_⊥⟩ of the final mesons.
Even though some primaries do not decay and thus contribute directly to ⟨k_⊥⟩, in
averaging over all particles the indirect mesons (decay products) have more weight for
they occur more frequently. In addition, because more of the decay products of the
vector mesons (and η + η′) are pions than kaons, the transverse-momentum distribution of
kaons resembles more closely the distribution of primaries than does the pion distri-
bution. The latter distribution is more sharply peaked at small k_⊥, as shown in fig.
31. The mean k_⊥ for final pions is


⟨k_⊥⟩_π± = 323 MeV, (5.1a)

whereas for final kaons, it is

⟨k_⊥⟩_K± = 384 MeV. (5.1b)

Almost all the rho mesons are direct so that

⟨k_⊥⟩_ρ = 439 MeV, (5.1c)

which is the same as the distribution of primary mesons. The mean k_⊥ of all the
final particles is very close to that for pions since 75% of the final particles are
pions. The exact shapes of the transverse-momentum distributions shown in fig. 31
depend on our arbitrary assumption of a Gaussian distribution for the primaries
(2.46). The tendency for higher-mass particles like the K or ρ to have a flatter
distribution in k_⊥ and a larger ⟨k_⊥⟩ than the lighter π mesons is observed in the
plateau region of ordinary pp collisions. In our model, the reason is not directly
due to the K and ρ being more massive than the π, but because a smaller fraction
of them are due to resonance decays than for the π.

* This has been pointed out by Seiden in ref. [13].

Fig. 31. Distribution of transverse momentum, dN/d²k_⊥, versus k_⊥ for π± and K± resulting from
a u-quark jet in our model with a = 0.77, α_ps = α_v = 0.5 and σ = 350 MeV. The dotted straight
line is exp(−6k_⊥).

Fig. 32. The mean value of the transverse momentum of all hadrons, ⟨k_⊥⟩, versus Y_end − Y,
where Y_end is as in fig. 11.

Fig. 33. The mean value of the transverse momentum of all hadrons, ⟨k_⊥⟩, versus z. Also shown
by dashed lines are the total mean k_⊥ (summing over all z) of the primary mesons before decay
(439 MeV) and after decay (323 MeV).
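
The mean transverse component of an isotropically distributed decay momentum of fixed magnitude Q is (π/4)Q, since the average of sin θ over a sphere is π/4. The following is a minimal numerical check of that factor, assuming an isotropic decay with fixed |Q| (the function name and sample size are illustrative choices, not part of the original analysis).

```python
import math
import random


def mean_transverse_fraction(n_samples, seed=1):
    """Estimate <Q_perp>/Q for directions drawn uniformly on a sphere.

    Assumption: the decay momentum has fixed magnitude Q and an isotropic
    direction, so Q_perp / Q = sin(theta) with cos(theta) uniform in [-1, 1].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        cos_theta = rng.uniform(-1.0, 1.0)
        total += math.sqrt(1.0 - cos_theta * cos_theta)
    return total / n_samples


if __name__ == "__main__":
    estimate = mean_transverse_fraction(200_000)
    print(f"Monte Carlo <Q_perp>/Q = {estimate:.4f}, pi/4 = {math.pi / 4:.4f}")
    # For Q below 400 MeV this bounds the mean decay kick at roughly 314 MeV.
    print(f"(pi/4) * 400 MeV = {math.pi / 4 * 400.0:.0f} MeV")
```

With Q values below 400 MeV, the decay kick alone therefore stays below the 439 MeV mean of the primaries, in line with the qualitative argument of the text.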

How is the transverse momentum distributed in rapidity Y and in z? We see
from fig. 32 that particles of a given rapidity Y, in the plateau region and
well toward the forward direction where the number of particles begins to fall off,
have the mean k_⊥ of 323 MeV as in (5.1a). Only the few particles in the large-Y
region (near Y_end) have ⟨k_⊥⟩ smaller than 323 MeV. On the other hand, particles
with a given z have a mean k_⊥ which depends on z as shown in fig. 33. For z ≳ 0.1,
⟨k_⊥⟩ is larger than 323 MeV and approaches the mean of the primary mesons, 439
MeV, as z approaches one.

This latter behavior is obvious since only primary mesons can obtain a z value
very near one. However, the reason ⟨k_⊥⟩ for particles with z in the range 0.2-0.6
is so large requires further discussion. Consider a primary meson generated at some
z_0 (and true rapidity Y_0), but look at it in the frame for which it has zero longitu-
dinal momentum p_z. First, suppose its original perpendicular momentum, p_⊥, is
zero and suppose it decays into two mesons. Then the secondaries acquire with
equal likelihood both positive and negative z component of momentum Q_z; that is,
both positive and negative rapidity. Thus, back in the lab system (just add a con-
stant Y), the secondaries are found symmetrically spread in rapidity about the
original rapidity Y_0 of the parent. It is otherwise with z: the secondaries are widely
spread in z but only over values below z_0 (uniformly from 0 to z_0 for massless
products, for example).

Now to see what happens if the parent meson has non-zero p_⊥, imagine this p_⊥
to arise from a very small rotation of the original large p_z (by an angle p_⊥/p_z). It is
clear that if a product ends up at a momentum p_z1, its share of the parent p_⊥ is
only (p_z1/p_z)p_⊥, or fractionally z/z_0 of the original. That is, daughter particles
carry a fraction z/z_0 of the originally large parent perpendicular momentum. Those
that obtain a z near z_0 have a larger share than those which obtain a smaller z. It is
this effect (combined with the fact that higher-z particles are more likely themselves
to be primaries) which means that the higher-z particles (z ≳ 0.1) have a mean k_⊥
considerably larger than the average. (It is true that decay products with the very
highest z/z_0 have received so much Q_z that the mean transverse decay momentum
Q_⊥ must be smaller. But this depends on Q_z (or z/z_0) via a curve with a vertical
tangent, an ellipse in fact, and the effect is not great and is overwhelmed by the
(z/z_0)p_⊥ effect.)
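
The rotation argument can be illustrated with a small Monte Carlo sketch. It is not the authors' calculation: it assumes a single parent of mass 0.77 GeV with a large p_z and a small p_⊥ (both values chosen here for illustration), a two-body decay into massless daughters that is isotropic in the parent rest frame, and it uses k_z/p_z as a stand-in for z/z_0. The fitted slope of the daughter's transverse momentum (along the parent's p_⊥ direction) against that fraction should then come out close to the parent p_⊥.

```python
import math
import random


def simulate_daughters(p_perp, p_z, mass, n_events, seed=0):
    """Return (k_z / p_z, k_x) for one massless daughter per decay.

    Hypothetical setup: the parent momentum is (p_perp, 0, p_z) and the decay
    is isotropic in the parent rest frame with daughter momentum mass / 2.
    """
    rng = random.Random(seed)
    p = math.hypot(p_perp, p_z)          # parent momentum magnitude
    energy = math.hypot(p, mass)         # parent energy
    gamma, beta_gamma = energy / mass, p / mass
    q = mass / 2.0                       # rest-frame daughter momentum
    pairs = []
    for _ in range(n_events):
        cos_t = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        # Boost along the parent flight direction (massless daughter: E* = q).
        k_par = gamma * q * cos_t + beta_gamma * q
        # Component transverse to the flight direction, in the x-z plane.
        k_t = q * sin_t * math.cos(phi)
        # Rotate from the flight-direction basis back to the lab x and z axes.
        k_x = k_par * p_perp / p + k_t * p_z / p
        k_z = k_par * p_z / p - k_t * p_perp / p
        pairs.append((k_z / p_z, k_x))
    return pairs


def fitted_slope(pairs):
    """Least-squares slope of k_x versus the momentum fraction."""
    n = len(pairs)
    mean_f = sum(f for f, _ in pairs) / n
    mean_k = sum(k for _, k in pairs) / n
    cov = sum((f - mean_f) * (k - mean_k) for f, k in pairs)
    var = sum((f - mean_f) ** 2 for f, _ in pairs)
    return cov / var


if __name__ == "__main__":
    parent_p_perp = 0.439  # GeV, illustrative value
    sample = simulate_daughters(parent_p_perp, 20.0, 0.77, 50_000)
    print("fitted slope (GeV):", round(fitted_slope(sample), 3))
    print("mean k_x (GeV):", round(sum(k for _, k in sample) / len(sample), 3))
```

In this toy setup the mean k_x is also about half the parent p_⊥, matching the remark that two-body decay products each carry only ½p_⊥ on average.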

The mean k_⊥ of all final particles is not much affected by the region z ≳ 0.1
because there are so many more particles of low z. Hence, the region z ≳ 0.1 can
have a considerably higher ⟨k_⊥⟩ than the ⟨k_⊥⟩ of all particles. Far in the plateau of
Y_z, one ultimately gets the same low mean of all particles. But, where the curve of
the number of particles versus Y_z first begins to fall, the ⟨k_⊥⟩ begins to rise rapidly
since each decay particle is sent to larger Y_z. For the true rapidity Y, however,
decay sends particles equally up and down in Y so even a linear fall-off of the Y
distribution would not distort the mean k_⊥. More radical variations, as those near the
high Y_end, do, however. The very highest Y values come from particles having abnor-
mally low m_⊥, so the mean k_⊥ here is lower (see fig. 32).

Fig. 34. The mean value of the transverse momentum, ⟨k_⊥⟩, of two particles versus the two-
particle mass M. The results are for a u-quark of energy P_q = 50 GeV. Also shown, by
dashed lines, is √2 times the ⟨k_⊥⟩ of primaries (621 MeV) and √2 times the ⟨k_⊥⟩ of the final
mesons (457 MeV).

Finally, for completeness, we show in fig. 34 the mean total k_⊥ of two particles
whose mass M is given by

M² = (E_1 + E_2)² − (p_z1 + p_z2)² − k_⊥², (5.2a)

with

k_⊥² = (p_x1 + p_x2)² + (p_y1 + p_y2)², (5.2b)

where E_1 and E_2 are the energies of the two particles, respectively. As a function of
the mass M, the two-particle ⟨k_⊥⟩ increases from a value of about 350 MeV at small
mass M to a value equal to √2 times ⟨k_⊥⟩_primary or about 621 MeV for M greater
than about 5.0 GeV. This sharp rise of ⟨k_⊥⟩ with mass M has also been observed in
the plateau region of ordinary pp collisions [22].
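
The numbers 439 MeV and 621 MeV can be cross-checked with a short calculation. The sketch assumes that the Gaussian of eq. (2.46), which is not reproduced in this section, is a two-dimensional distribution proportional to exp(−k_⊥²/2σ²), and that the two primaries in a high-mass pair are uncorrelated; under those assumptions |k_⊥| follows a Rayleigh distribution with mean σ√(π/2), and the vector sum of two primaries has width √2σ.

```python
import math

SIGMA_MEV = 350.0  # sigma quoted in the text for the primary mesons


def rayleigh_mean(sigma):
    """Mean |k_perp| for a 2D Gaussian exp(-k^2 / (2 sigma^2)) (assumed form)."""
    return sigma * math.sqrt(math.pi / 2.0)


# Single primary meson: should be close to the quoted 439 MeV.
mean_primary = rayleigh_mean(SIGMA_MEV)

# Sum of two independent primaries: a 2D Gaussian of width sqrt(2) * sigma,
# so the mean is sqrt(2) times larger (quoted as about 621 MeV).
mean_pair = rayleigh_mean(math.sqrt(2.0) * SIGMA_MEV)

# The same sqrt(2) factor applied to the final-meson mean of 323 MeV.
mean_pair_final = math.sqrt(2.0) * 323.0

if __name__ == "__main__":
    print(f"<k_perp> primary      = {mean_primary:.1f} MeV")
    print(f"<k_perp> primary pair = {mean_pair:.1f} MeV")
    print(f"sqrt(2) * 323 MeV     = {mean_pair_final:.1f} MeV")
```

That the assumed form reproduces the quoted 439 MeV from σ = 350 MeV suggests, but does not prove, that it matches the convention of (2.46).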

We feel that the description of transverse momentum presented here is more
satisfactory than the one used by us in FFF, in which all the final mesons in a quark
jet had the same ⟨k_⊥⟩_q→h = 330 MeV independent of z. We must change this to read
that all primary mesons have the same ⟨k_⊥⟩ = 439 MeV independent of z. Because
the large-p_⊥ hadron experiments which we analyzed in FFF are sensitive predomi-
nantly to the large-z region of the quark decay functions so that ⟨k_⊥⟩_q→h ≈ 439 MeV,
a number of our predictions in FFF will be changed *. For example, the mean values
of p_out appearing in table 3 of FFF are all increased, coming closer to the experimen-
tal findings. Also, predictions made for experiments with small aperture acceptance
will have to be reduced because of the wider spread in angle of the large-z particles.
We are now computing corrections to FFF based on our new jet model. The results
will be discussed elsewhere.


5.2. Correlations in transverse momentum 


Our assumptions in subsect. 2.7.2 that each produced quark-antiquark pair con-
serves transverse momentum with no net k_⊥ and that the transverse momentum of
a primary meson is the sum of the k_⊥ of two quarks lead to correlations in trans-
verse momentum of the hadrons in the quark jet. As discussed in that subsection,
primary mesons of adjacent rank tend to go oppositely with ⟨k_⊥1 · k_⊥2⟩ = −σ². How-
ever, as we have already learned from fig. 22, adjacent-rank mesons can find them-
selves considerably spread apart in rapidity Y, so that these correlations are of long
range. Fig. 35 shows the behavior of the asymmetry


D(Y_1, Y_2) = [N_D(Y_2) − N_U(Y_2)]/[N_U(Y_2) + N_D(Y_2)], (5.3)

where N_U(Y_2) and N_D(Y_2) are the number of hadrons at Y_2 with |φ_12| < ½π and
|φ_12| > ½π, respectively, and where φ_12 is the angle between k_⊥1 and k_⊥2. This
figure shows D(Y_1, Y_2) for like and unlike charge combinations both before and
after decay where particle one is at Y_z1 = 4.0 versus ΔY_z = Y_z1 − Y_z2. (With this
definition, D is positive for k_⊥1 · k_⊥2 negative.) Before decay, we see no correla-
tion between same-sign mesons (since they cannot occur adjacent in rank) and a
strong wide-range correlation between oppositely charged hadrons. Resonance decay
plays a large role in determining the behavior of D. After decay, D is reduced for
oppositely charged pairs and becomes non-zero and positive for same-sign hadrons.
Similar transverse-momentum correlations have been observed in low-p_⊥ hadron-
hadron collisions [24].

* After the completion of FFF, in which we assumed ⟨k_⊥⟩_q→h = 330 MeV independent of z,
several people pointed out to us that there were indications that ⟨k_⊥⟩ was, in fact, larger than
330 MeV for large z both from hadron and lepton experiments. They further noted that we
would improve our predictions of large-p_⊥ correlations in pp collisions if we took a larger
⟨k_⊥⟩_q→h. In particular, we would predict a larger p_out in better agreement with experiment.
It is now clear from the analysis of Seiden [13] and from data on νp collisions [21,23]
that ⟨k_⊥⟩_q→h is considerably larger than 330 MeV at large z. We have previously been
reluctant to change (stubborn) because we did not want to “fiddle” our high-p_⊥ model in order
to agree with the very experiments we were trying to predict. Now that we have added the phys-
ically natural assumption that some of the pions are secondary to resonances, we feel it is
natural to replace the naive simple rule that the pions have a constant (with z) mean transverse
momentum by the equally naive and simple rule that it is the primaries that have the constant
transverse momenta. This, without further complication, leads to a natural explanation of many
of the experimental observations. We shall adopt it hereafter. We are grateful to J. Vander Velde
and Knud Hansen for discussions concerning this point.

Fig. 35. Predicted behavior of the asymmetry D(Y_1, Y_2) = (N_D(Y_2) − N_U(Y_2))/(N_U(Y_2) + N_D(Y_2))
before (upper) and after (lower) decay of the primary mesons. The crosses (diamonds) are for
the unlike (like) charge combinations, and N_U and N_D are the number of hadrons at Y_2
with |φ_12| < ½π and |φ_12| > ½π, respectively, where φ_12 is the angle between the transverse
momentum vectors k_⊥1 and k_⊥2. The results are for particle h_1 at Y_z1 = 4.0 and are plotted
versus ΔY_z = Y_z1 − Y_z2. a = 0.77, α_ps = α_v = 0.5, Y_z1 = (4.0, 4.4).
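
The asymmetry of (5.3) is a simple counting ratio, which can be sketched as follows. The function name and the sample angles are hypothetical; the only input taken from the text is the definition itself, with hadrons counted on the same side (|φ_12| < ½π) or the opposite side (|φ_12| > ½π) of particle one in azimuth.

```python
import math


def azimuthal_asymmetry(phi12_values):
    """Return (N_opposite - N_same) / (N_same + N_opposite).

    phi12_values: azimuthal angles (radians, in [-pi, pi]) between the
    transverse momenta of particle one and each hadron found at Y_2.
    Angles exactly equal to pi/2 in magnitude are not counted.
    """
    n_same = sum(1 for phi in phi12_values if abs(phi) < math.pi / 2)
    n_opposite = sum(1 for phi in phi12_values if abs(phi) > math.pi / 2)
    total = n_same + n_opposite
    if total == 0:
        raise ValueError("no hadrons to count")
    return (n_opposite - n_same) / total


if __name__ == "__main__":
    # Illustrative sample: three hadrons opposite to particle one, one alongside.
    sample_angles = [0.1, 3.0, 2.5, -3.0]
    print(azimuthal_asymmetry(sample_angles))
```

A positive value means the second hadron tends to recoil against the first in the transverse plane, which is the sign expected when the product k_⊥1 · k_⊥2 is negative on average.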
6. Some applications in large-p_⊥ hadron-hadron collisions

6.1. Large-p_⊥ particle ratios: comparison with FF1


Fig. 36 shows particle ratios for θ_cm = 90° pp collisions at large p_⊥ predicted
using the quark-scattering model presented in FF1 but with our new quark-decay
functions (analytic approximation). The quark-scattering model predicts that,
except for the small smearing effects discussed in FFF, these ratios are functions
only of x_⊥ = 2p_⊥/√s at fixed θ_cm. The new results are almost identical to those in
FF1 except for the K⁺/K⁻ ratio which is now somewhat larger (a factor of 1.8
at x_⊥ = 0.6) due in part to (3.3). The new model, however, also allows us to inves-
tigate resonance production at large p_⊥. In fig. 37 predictions for ρ⁰ (equal to ω⁰),
K*⁰, K̄*⁰ and φ production are presented. (The ρ⁰/π⁰ ratio was constructed to
approach one at large x_⊥ by our assumption that α_ps = α_v.) In addition, we calcu-
late the fraction of the total large-p_⊥ signal that arises from the decay of various
resonances. For example, fig. 38 shows that at ISR (x_⊥ = 0.1) 27% of the total π±
and 11% of the total K± signal comes from resonance decays. As p_⊥ increases, the
contributions from resonance decays decrease. Table 16 gives a comparison of the
expected values of various ratios with existing experimental data.

[Figure 36: PARTICLE RATIOS, θ_cm = 90°. Curves: pp → (K⁺/K⁻) + X, pp → (π⁺/π⁻) + X,
pp → (K⁺/π⁺) + X, pp → (η/π⁰) + X. Horizontal axis: x_⊥ from 0.0 to 0.7.]

Fig. 36. Particle ratios versus x_⊥ = 2p_⊥/√s for θ_cm = 90° pp collisions at large p_⊥ predicted from
the quark-scattering model of FF1 but using our new quark-decay functions.

[Figure 37: RESONANCE PRODUCTION, θ_cm = 90°. Curves include pp → (ρ⁰/π⁰) + X,
pp → (K*⁰/π⁰) + X, pp → (K̄*⁰/π⁰) + X.]

Fig. 37. Predicted ratios of resonance to π⁰ production for θ_cm = 90° pp collisions versus
x_⊥ = 2p_⊥/√s from the quark-scattering model of FF1 but using our new quark-decay functions.

Table 16

Comparison of large-p_⊥ particle ratios in pp collisions at θ_cm = 90° with the quark-scattering
model of FF1 [4] using our new quark-decay functions, D_q^h(z). Also shown are the contribu-
tions to the total π signal from ρ production and the contributions to the total K⁺ signal from
K*⁰ and φ production

Ratio           Experimental group   x_⊥ = 2p_⊥/√s   Data     Prediction
π⁺/π⁻           C-P [28]             0.6             ~3.0     3.2
K⁺/π⁺           C-P                  0.55            ~0.5     0.46
η/π⁰            CCRS [29]            0.15            ≈0.5     0.40
K⁺/K⁻           C-P                  0.51            ~18      10.3
ρ⁰/π⁰           R-412 [30]           0.2             ~1.0     0.7
φ/π⁰            C-F [31]             0.2             <0.06    0.06
ρ⁰ → π±/π±      R-410/13 [2,32]      0.1             ≈0.05    0.08
ρ± → π⁰/π⁰      R-412                0.1-0.2         0.16     0.16
K*⁰ → K⁺/K⁺     R-410/13             0.1-0.2         ≈0.05    0.06
φ → K⁺/K⁺       R-410/13             0.1-0.2         0.005    0.01

[Figure 38: RESONANCE CONTRIBUTIONS, θ_cm = 90°. Curves: pp → (V → π±/π±) + X,
pp → (ρ± → π⁰/π⁰) + X, pp → (V → K±/K±) + X, pp → (K*⁰ → K⁺/K⁺) + X,
pp → (φ → K⁺/K⁺) + X. Vertical axis: logarithmic, 0.001 to 1.0. Horizontal axis: x_⊥ from 0.0 to 0.7.]

Fig. 38. Predicted contribution to the total large-p_⊥ meson signals for θ_cm = 90° pp collisions
from resonance decays. The symbol V refers to the sum over all nine vector mesons and K*⁰ →
K⁺/K⁺ means the ratio of K⁺'s due to K*⁰ decay to the total K⁺ signal, etc.


6.2. Same-side two-particle correlations: comparison with FFF ansatz

A distinctive feature of the quark-scattering model of FF1 is that high-p_⊥ par-
ticles in hadron-hadron collisions are not isolated but members of a cluster (jet) of
particles representing the fragmentation of the quark. In FFF, we estimated the
number of particles to be found in the same direction as a large-p_⊥ trigger (“same-
side” correlations) by assuming that the two-particle decay functions could be
approximated in terms of the single-particle decay function by (FFF ansatz)

D_q^{h1 h2}(z_1, z_2) = D_q^{h1}(z_1) D_{q2}^{h2}(z_2/(1 − z_1))/(1 − z_1), (6.1)

where h_1 is the trigger hadron so that z_1 >> z_2. The flavor of the quark q_2 was
determined in the following manner. For what we called the “unambiguous” case
where h_1 = c ā contains the quark q, say q = c, then q_2 was taken to have the flavor
“a”. For example, if q is a u-quark and h_1 = π⁺ or h_1 = K⁺, then q_2 = d or q_2 = s,
respectively. For this case, we find the FFF ansatz to agree reasonably well with the
results of our new jet model (a = 0.77, α_ps = α_v = 0.5).
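
The structure of the FFF ansatz (6.1) can be made concrete with a short sketch: the second hadron is drawn from the decay function of the leftover quark q_2, evaluated at the rescaled fraction z_2/(1 − z_1), with a Jacobian factor 1/(1 − z_1). The single-particle decay functions are passed in as callables; the `toy_decay_function` used in the example is a placeholder chosen only so the arithmetic is easy to follow, not one of the decay functions of the paper.

```python
def fff_two_particle(d_h1_from_q, d_h2_from_q2, z1, z2):
    """Evaluate the FFF ansatz D(z1, z2) = D1(z1) * D2(z2 / (1 - z1)) / (1 - z1).

    d_h1_from_q:  single-particle decay function for trigger hadron h1 from quark q
    d_h2_from_q2: single-particle decay function for hadron h2 from leftover quark q2
    Both are hypothetical callables supplied by the caller.
    """
    if not 0.0 < z1 < 1.0:
        raise ValueError("z1 must lie strictly between 0 and 1")
    if not 0.0 <= z2 <= 1.0 - z1:
        raise ValueError("z2 must lie between 0 and 1 - z1")
    remaining = 1.0 - z1  # momentum fraction left after the trigger hadron
    return d_h1_from_q(z1) * d_h2_from_q2(z2 / remaining) / remaining


def toy_decay_function(z):
    """Placeholder decay function, linear in (1 - z); for illustration only."""
    return 1.0 - z


if __name__ == "__main__":
    # Trigger at z1 = 0.5, second hadron at z2 = 0.2 (rescaled fraction 0.4).
    print(fff_two_particle(toy_decay_function, toy_decay_function, 0.5, 0.2))
```

Passing the two decay functions separately keeps the flavor assignment of q_2, which the text discusses for the “unambiguous” case, outside the formula itself.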

On the other hand, we stated in FFF that (6.1) was not appropriate for the 
“ambiguous” case where h, does not contain the quark q as a valence quark. We 
find that indeed our new jet model disagrees quite substantially with the FFF 
ansatz for the “ambiguous” case. Since the predictions in FFF are dominated by 
the unambiguous case, we feel that they are reliable. 

However, we can now use the new jet model to estimate the number of K⁺'s
produced in association with a large-p_⊥ K⁻ trigger in pp collisions, which we did not
attempt in FFF because this is dominated by the “ambiguous” case, and to improve
the other same-side correlation estimates in FFF. Before this is done, however, we
must also change our handling of the transverse momentum of the particles within 
the quark jet to agree with the results of subsect. 5.1. We expect to discuss this 
elsewhere. 


7. Summary and conclusions 


In summary, we have found that the recursive model gives a convenient and easy
way to compute properties of a quark jet in terms of only a few parameters. Although
the model does not include the possibility of baryon emission and cannot represent a
true physical theory of jets, the resulting structure seems very reasonable and gener-
ally consistent with observations available so far. We recommend it as a “standard”
jet to help design experiments and to which experiments may be compared and con-
trasted.

Aside from the mean number of particles (versus z), to which the model is fitted
anyway, the main interesting feature is the distribution of charge, D_u^{h+}(z) − D_u^{h−}(z),
or of any other property by which jets originating from u-quarks can be distinguished
from those from d-quarks. We can describe this as the distribution of the hadron con-
taining the original quark. In the model, this is widely distributed. It is far from true
that in an average (unbiased) jet the hadron of largest momentum always contains the
original quark (see table 11). For jets of limited momentum, individual d-quark jet
events can often look just like u-quark events and a reliable way to distinguish them
event-by-event seems unavailable. They can, of course, be distinguished by their aver-
age properties, but our model, just like experiment, finds large statistical variations
from event to event so that good averages require many events. (This can be seen
most clearly on many of the Monte Carlo plots. Even for 40 000 jets, there are un-
comfortably wide fluctuations for many quantities of interest.) One of the first
things to determine experimentally is whether D_u^{h+}(z) − D_u^{h−}(z) is really as widely
distributed (see figs. 5, 8, and 12) as our model (or the previous parametrization of
FF1) suggests. Preliminary experimental indications are that it is [15,21,23].

The hadrons observed in quark jets can be secondaries from the decay of higher 
resonances (we have included only the 1~ vector resonances). Although the model 
makes interesting special predictions for how these primaries may be correlated, we 
find the correlations are spread widely in rapidity, and are further considerably 
obscured by the correlations induced by resonance decay. This makes it hard to 
check the specifics of the recursive idea, but it helps us in our original purpose: to
generate a standard jet that ought to look much like nature.

From our experience, we think that different jet models which have the right 
mean distributions of the various hadrons and which include the production of 
resonances in a way roughly consistent with experiment (our precise choice α_v = α_ps
= 0.5 may require later modification) will be very difficult to distinguish experi-
mentally. The recursive scheme is as good as any for our purpose *.

This is well illustrated by transverse-momentum distributions. Many interesting 
variations which have been observed (versus z or the two-particle mass M) in lepton 
and hadron experiments appear to be merely consequences of the fact that hadrons 
are often secondaries of primaries with much simpler properties. In fact, the prima- 
ries might all have a uniform transverse-momentum distribution. 

For these reasons, we think of our jet model, not as an interesting theory to be 
checked by experiment, but rather as a possibly reliable guide as to what general 
properties might be expected experimentally. In particular, it can assist in the pro-
gram of comparing hadron high-p_⊥ jets to lepton-generated jets.

On the other hand, quark jets are often investigated not just to be used as a tool 
to investigate hadron collisions, but rather as a subject of interest in itself. How are 
quark jets actually generated? From what we have learned, we think it will tax the 
ingenuity of experimenters to see behind those properties which are overshadowed 
by the effects of resonance decay in order to study effects more intrinsically related 
to the process of jet formation. 

In this connection, the most fundamental experimental question is whether lep-
ton-induced jets really have a quark origin at all. Even here we find difficulties. Prop-
erties averaged over many jets are unconvincing, for a mean charge, say, of ⅔ can be
achieved by averaging over events each of which is associated with an integral charge.
The most promising method would seem to be the momentum-weighted charge
Q_W(p) in (3.9). In principle, at least, for a jet of sufficiently large momentum, this
is a characteristic of the object generating the jet and one which can be checked for
each event separately. Unfortunately, in practice, it will be difficult to use for even
10 GeV jets (see fig. 21), if the charge is spread as widely as in our model. The
method improves, but only slowly, as the jet energy increases.

* For examples of other jet models see refs. [13,25-27].


We are grateful to C. Bromberg, G. Fox, and C. Quigg for numerous informative 
discussions. Also, we thank T. Ferbel, K. Hanson and J. Vander Velde for useful 
correspondence. 


Note added in proof 


It has been suggested to us that perhaps the simple form for f(η) in (2.21) that
we have used results in a distribution of quark charge, D_u^{h+}(z) − D_u^{h−}(z), that is
broader than actually implied by the fit to D_u^{h+}(z) + D_u^{h−}(z) in fig. 3. For example,
a form like f(η) = A(1 − η)^n (1 − a + 3aη²) (normalized properly) would produce
a quark charge that is isolated more toward the end of the quark (high z). Such a
form produces a zF(z) that tends to dip at small z, but this possibility cannot be
ruled out since the data in fig. 3 cannot be used for z ≲ 0.3. Hence one can view
our jet model as a bit pessimistic. It results in a widely distributed quark charge and
an order in rapidity and hierarchy that is quite mixed up. We must ultimately resort
to experiment to see if D_u^{h+}(z) − D_u^{h−}(z) is as widely distributed as our model sug-
gests.


References 


[1] D. Linglin, Observation of large-p_⊥ jets at the ISR and the problem of parton transverse
momentum, Invited talk at 12th Rencontre de Moriond, March 1977.
[2] R. Moller, Jets and quantum numbers in the high-p_⊥ hadronic reactions at the CERN ISR,
Invited talk at 12th Rencontre de Moriond, March 1977.
[3] C. Bromberg et al., Phys. Rev. Lett. 38 (1977) 1447;
E. Malamud, Comparison of hadron jets produced by π⁻ and p beams on hydrogen and
aluminum targets, Invited talk presented at 8th Int. Symp. on multiparticle dynamics,
Kaysersberg, France, June 1977.
[4] (FF1) R.D. Field and R.P. Feynman, Phys. Rev. D15 (1977) 2590. 
[5] (FFF) R.P. Feynman, R.D. Field and G.C. Fox, Nucl. Phys. B128 (1977) 1.
[6] A. Krzywicki and B. Petersson, Phys. Rev. D6 (1972) 924. 
[7] J. Finkelstein and R.D. Peccei, Phys. Rev. D6 (1972) 2606. 
[8] J.D. Bjorken, High-p_⊥ dynamics, Lecture given at SLAC Summer Institute on particle
physics, November 1975. 
[9] S.D. Ellis, M. Jacob and P.V. Landshoff, Nucl. Phys. B108 (1976) 93. 
[10] A. Casher, J. Kogut and L. Susskind, Phys. Rev. Lett. 31 (1973) 792. 
[11] G.R. Farrar and J.L. Rosner, Phys. Rev. D9 (1974) 2226. 
[12] R.N. Cahn and E.W. Colglazier, Phys. Rev. D9 (1974) 2658. 
[13] A. Seiden, Phys. Lett. 68B (1977) 157; Hadron distributions in a quark-parton model, 
SLAC-PUB-1962, Nucl. Phys. B, submitted. 




16 R.D. Field, R.P. Feynman / A parameterization of the properties of quark jets 


[14] R.P. Feynman, Photon-hadron interactions (Benjamin, Reading, Mass., 1972). 

[15] J.C. Vander Velde, Berkeley-Fermilab-Hawaii-Michigan Collaboration, Neutrino-proton 
interactions in the 15-Foot Bubble Chamber and properties of hadron jets, Invited talk 
presented at 4th Int. Winter Meeting on fundamental physics, Salardu, Spain, 1976. 

[16] P. Pirila, G.H. Thomas and C. Quigg, Phys. Rev. D12 (1975) 92;

C. Quigg, P. Pirila and G.H. Thomas, Phys. Rev. Lett. 34 (1974) 290; 
A. Krzywicki and D. Weingarten, Phys. Lett. 50B (1974) 265. 

[17] C. Quigg, What have we learned from high-energy experiments?, Invited talk presented at 
the 1973 APS/DPF Meeting in Berkeley, AIP Proc. No. 14, subseries no. 6. 

[18] E.L. Berger, Phys. Lett. 49B (1974) 369. 

[19] T. Ferbel, Recent results from bubble-chamber experiments at Fermilab, invited talk at 
the SLAC Summer Institute on particle physics, SLAC report No. 179 (1974). 

[20] S.J. Brodsky and J.F. Gunion, Hadronic fragmentation as a probe of the underlying dy- 
namics of hadron collisions, UCD-77-9, SLAC-PUB-1939, Phys. Rev., to be published. 

[21] M. Derrick et al., Properties of the hadronic system resulting from νp interactions, ANL-
HEP-PR-77-39, Phys. Rev. D, submitted. 

[22] P. Stix et al., Transverse-momentum properties of multi-pion systems produced in high- 
energy collisions, Univ. of Rochester preprint UR611. 

[23] J.C. Vander Velde, Berkeley-Fermilab-Hawaii-Michigan Collaboration, Neutrino interac- 
tions in the 15-Foot Hydrogen Bubble Chamber and search for charmed particles, Invited 
talk presented at 12th Rencontre de Moriond, Flane, France 1977. 

[24] C. Bromberg et al., Phys. Rev. D10 (1974) 3100. 

[25] H. Fukuda and C. Iso, Prog. Theor. Phys. 57 (1977) 483.

[26] O. Sawada, Quark-cascade model; hadronization of quarks after a hard collision, Tohoku 
University preprint TU/77/166, 1977. 

[27] U.P. Sukhatme, Quark jets: a quantitative description, University of Cambridge preprint, 
DAMTP 77/25. 

[28] H. Frisch, A review of high transverse momentum processes, Invited talk at APS meeting, 
Brookhaven, October 1976; 

D. Antreasyan et al., Chicago-Princeton preprint CP76-2. 

[29] F.W. Busser et al., Phys. Lett. 55B (1975) 232. 

[30] D. Darriulat et al., Nucl. Phys. B107 (1976) 429. 

[31] J.A. Appel et al., Phys. Rev. Lett. 35 (1975) 9.

[32] H. Bøggild, Event structures in large-p_⊥ hadron-hadron collisions, Invited talk given at 8th
Int. Symp. on multiparticle dynamics, Kaysersberg, France, 1977. 




VI. Quantum Gravity 


Feynman worked seriously on the quantum theory of gravity for about a decade, beginning in
the early 1950s.¹ In the introduction to paper [57] he wrote that his interest was “primarily
in the relation of one part of nature to another,” rather than in explaining phenomena or
fitting data. In working out (absurdly small) one-loop quantum radiative corrections, he
hoped that he would not be criticized “for the fact that there is no possible, practical reason
for making these calculations.”²

In fact, his work led to two sets of very useful results. The first, purely pedagogi- 
cal, is embodied in the Feynman Lectures on Gravitation (publication [123]). In those lec- 
tures, Feynman develops the quantum field theory of a neutral massless spin 2 particle (the 
graviton), emphasizing the special features that arise, in comparison to theories of spin 0 and 
spin 1 particles, as well as the complications that result for a zero-mass particle in trying 
to create a self-consistent theory. As in the case of spin 1, masslessness results in redun- 
dant degrees of freedom, since Lorentz invariance requires that a massless particle can spin 
only along or opposite to its direction of momentum (positive or negative chirality), while a 
massive spin 2 particle may take up five different orientations relative to any arbitrary quan- 
tization direction. Eliminating the unwanted degrees of freedom is achieved by imposing 
certain “gauge conditions,” which in the gravitational case brings about nonlinearity in the 
form of graviton-graviton interaction. Feynman shows that the classical limit of a properly 
gauged massless spin 2 theory is described by the Einstein gravitational field equations. 

The second result of Feynman’s labors on gravitation is also pedagogical in part, with 
Feynman himself being one of the main students that he had in mind. For besides the elim- 
ination of unwanted degrees of freedom via the gauge conditions, a proof of self-consistency 
requires the quantum field theory to predict finite physical effects and satisfy certain conser- 
vation laws. For an interaction as weak as gravity, quantum predictions should be obtainable 
by a perturbation method, i.e. by expanding amplitudes for physical processes in a power se- 
ries in the gravitational coupling constant (essentially the square root of Newton’s constant). 
Then one must show that the successive terms in the series are finite, using the method of 
renormalization as required.4

The papers included here, namely [57], [85], and [86], address the question of renormal- 
izing quantum gravity. Feynman begins in [57] by considering the quantum field theory 
of a massless spin 2 field in the “tree diagram” approximation, i.e. with neglect of graphs 
containing one or more closed loops. This is the theory that leads in the classical limit to the 
Einstein equations. Next he considers the class of diagrams with only one closed loop and 
develops a method for renormalizing such graphs (or, rather, the set of such graphs corre- 
sponding to a physical process, since individual graphs are not, in general, gauge-invariant.) 


1Murray Gell-Mann recalled discussing the subject with Feynman in November 1954, and later suggesting to 
him that “he try the analogous problem in Yang-Mills theory.” See Murray Gell-Mann, “Dick Feynman — 
the guy in the office down the hall,” in Most of the Good Stuff, edited by L.M. Brown and J.S. Rigden (New 
York: American Institute of Physics, 1993), pp. 82-83. 

2Paper [57], p. 697. 

3This development is contained in publication [123]. A more complete description of Feynman’s program,
including citation of historical precedents, is contained in the preface to [123], written by John Preskill and 
Kip S. Thorne. 

4See, e.g., Renormalization, edited by L.M. Brown (New York: Springer-Verlag, 1993).

His procedure involves two innovations. One is to show how loop diagrams can be written as
the sum (integral) of tree diagrams. The second innovation is the use of a “ghost particle” 
(a fictitious vector particle) that must appear in a diagram added in the renormalization 
process in order to ensure that the physical amplitude satisfies the principle of unitarity (i.e.
conservation of probability). These two new procedures are described in paper [57], pub- 
lished in 1964, and are fully derived in Feynman’s two contributions in the 1972 Festschrift 
to John Wheeler’s sixtieth birthday. 

Feynman’s two innovations are probably even more important methodologically than as 
contributions to quantum gravity itself, insofar as they have been applied in treating the 
gauge theory of vector particles, the so-called Yang-Mills theory.5 Feynman described how
this came about: 


The algebraic complexity of the gravitational field equations is so great that it is not 
easy to do exploratory mathematical investigations and checks. Gell-Mann suggested 
to me that the Yang-Mills theory of vector particles with zero mass also is a non- 
linear theory with a gauge group and might show the same difficulties, and yet be 
easier to handle algebraically. This proved to be the case, and thereafter, all the work 
was done first with the Yang-Mills theory and then the corresponding expressions 
for gravitation were worked out. The connection is exceedingly close. Each difficulty 
and its resolution in one theory has its corresponding difficulty and resolution in the 
other. It becomes obvious that to find a completely satisfactory quantization of the 
zero-mass Yang-Mills field, is to find a completely satisfactory quantization of 
the general theory of relativity.6


The last sentence is probably an exaggeration, because while a satisfactory renormalizable 
theory of the Yang-Mills field does now exist, the same is not true for quantum gravity. It 
is also somewhat ironic that the Yang-Mills theory, regarded as a model theory for learning
about how to quantize gravity, has achieved a status as a physical theory on the same level 
as general relativity itself. That is because the current Standard Model of strong, weak, and 
electromagnetic elementary particle interactions consists of leptons and quarks exchanging 
Yang-Mills fields, and full renormalizability has been proven. While Feynman did not succeed
in going beyond the level of one-loop diagrams, his methods, including “ghost-particle” loops 
and path integrals, were the indispensable keys to the eventual success of this endeavor.

QUANTUM THEORY OF GRAVITATION*
By R. P. Feynman 
(Received July 3, 1963) 


My subject is the quantum theory of gravitation. My interest in it is primarily in the 
relation of one part of nature to another. There’s a certain irrationality to any work in gravi- 
tation, so it’s hard to explain why you do any of it; for example, as far as quantum effects 
are concerned let us consider the effect of the gravitational attraction between an electron 
and a proton in a hydrogen atom; it changes the energy a little bit. Changing the energy 
of a quantum system means that the phase of the wave function is slowly shifted relative 
to what it would have been were no perturbation present. The effect of gravitation on the 
hydrogen atom is to shift the phase by 43 seconds of phase in every hundred times the 
lifetime of the universe! An atom made purely by gravitation, let us say two neutrons held 
together by gravitation, has a Bohr orbit of $10^{8}$ light years. The energy of this system is
$10^{-70}$ rydbergs. I wish to discuss here the possibility of calculating the Lamb correction to
this thing, an energy, of the order $10^{-120}$. This irrationality is shown also in the strange
gadgets of Prof. Weber, in the absurd creations of Prof. Wheeler and other such things, 
because the dimensions are so peculiar. It is therefore clear that the problem we are working 
on is not the correct problem; the correct problem is what determines the size of gravita- 
tion? But since I am among equally irrational men I won’t be criticized I hope for the fact 
that there is no possible, practical reason for making these calculations. 
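
The orders of magnitude quoted above can be checked with a few lines of arithmetic. The following is a rough sketch and not Feynman's own calculation: it assumes the Newtonian energy shift $G m_e m_p / a_0$ for hydrogen, a universe lifetime of $10^{10}$ years (an assumed round figure), and a hydrogen-like Bohr model with $e^2$ replaced by $G m_n^2$ for the two-neutron atom. All function names are illustrative.

```python
import math

# Rounded physical constants in SI units.
G = 6.674e-11          # Newton's constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34      # reduced Planck constant, J s
M_E = 9.109e-31        # electron mass, kg
M_P = 1.6726e-27       # proton mass, kg
M_N = 1.6749e-27       # neutron mass, kg
A0 = 5.292e-11         # hydrogen Bohr radius, m
RYDBERG_J = 2.180e-18  # one rydberg in joules
LIGHT_YEAR = 9.461e15  # metres
YEAR = 3.156e7         # seconds


def hydrogen_phase_shift_arcsec(age_years=1.0e10, multiples=100):
    """Phase (in seconds of arc) accumulated over `multiples` lifetimes of
    the universe from the gravitational energy shift G m_e m_p / a0.
    The lifetime of 1e10 years is an assumption of this sketch."""
    delta_e = G * M_E * M_P / A0
    phase_rad = delta_e / HBAR * age_years * YEAR * multiples
    return math.degrees(phase_rad) * 3600.0


def gravitational_atom(mass=M_N):
    """Bohr radius (light years) and binding energy (rydbergs) of two equal
    masses bound only by gravity, using hydrogen-like formulas with the
    reduced mass and with e^2/(4 pi eps0) replaced by G m^2."""
    mu = mass / 2.0
    coupling = G * mass * mass
    radius = HBAR**2 / (mu * coupling)
    energy = mu * coupling**2 / (2.0 * HBAR**2)
    return radius / LIGHT_YEAR, energy / RYDBERG_J


if __name__ == "__main__":
    print("phase shift (arcsec):", hydrogen_phase_shift_arcsec())
    radius_ly, energy_ry = gravitational_atom()
    print("Bohr orbit (light years):", radius_ly)
    print("binding energy (rydbergs):", energy_ry)
```

Under these assumptions the phase shift comes out at roughly a hundred seconds of arc, the orbit at several million light years, and the binding energy at a few times $10^{-70}$ rydbergs. The prefactors depend on the conventions assumed, but the hopeless orders of magnitude are the point.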

I am limiting myself to not discussing the questions of quantum geometry nor what 
happens when the fields are of very short wave length. I am not trying to discuss any prob- 
lems which we don’t already have in present quantum field theory of other fields, not that 
I believe that gravitation is incapable of solving the problems that we have in the present 
theory, but because I wish to limit my subject. I suppose that no wave lengths are shorter 
than one-millionth of the Compton wave length of a proton, and therefore it is legitimate to 
analyze everything in perturbation approximation; and I will carry out the perturbation 
approximation as far as I can in every direction, so that we can have as many terms as we 
want, which means that we can go to ten to the minus two-hundred and something ryd- 
bergs. 

I am investigating this subject despite the real difficulty that there are no experiments. 
Therefore there is no real challenge to compute true, physical situations. And so I made


* Based on a tape-recording of Professor Feynman’s lecture at the Conference on Relativistic Theories 
of Gravitation, Jabłonna, July, 1962. (Ed.)

believe that there were experiments; I imagined that there were a lot of experiments and
that the gravitational constant was more like the electrical constant and that they were coming 
up with data on the various gravitating atoms, and so forth; and that it was a challenge to 
calculate whether the theory agreed with the data. So that in each case I gave myself a specific 
physical problem; not a question, what happens in a quantized geometry, how do you define 
an energy tensor etc., unless that question was necessary to the solution of the physical 
problem, so please appreciate that the plan of the attack is a succession of increasingly 
complex physical problems; if I could do one, then I was finished, and I went to a harder 
one imagining the experimenters were getting into more and more complicated situations. 
Also I decided not to investigate what I would call familiar difficulties. The quantum electro- 
dynamics diverges; if this theory diverges, it’s not something to be investigated unless it 
produces any specific difficulties associated with gravitation. In short, I was looking entirely 
for unfamiliar (that is, unfamiliar to meson physics) difficulties. For example, it’s imme- 
diately remarked that the theory is non-linear. This is not at all an unfamiliar difficulty; 
the theory, for example, of the spin 1/2 particles interacting with the electromagnetic field 
has a coupling term $\bar\psi A\psi$ which involves three fields and is therefore non-linear; that’s
not a new thing at all. Now, I thought that this would be very easy and I’d just go ahead 
and do it, and here’s what I planned. I started with the Lagrangian of Einstein for the inter- 
acting field of gravity and I had to make some definition for the matter since I’m dealing 
with real bodies and make up my mind what the matter was made of; and then later I would 
check whether the results that I have depend on the specific choice or they are more powerful.
I can only do one example at a time; I took spin zero matter; then, since I’m going to make 
a perturbation theory, just as we do in quantum electrodynamics, where it is allowed (it is 
especially more allowed in gravity where the coupling constant is smaller), $g_{\mu\nu}$ is written
as flat space as if there were no gravity plus $\kappa$ times $h_{\mu\nu}$, where $\kappa$ is the square root of the
gravitational constant. Then, if this is substituted in the Lagrangian, one gets a big mess,
which is outlined here. 


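$$L = \frac{1}{2\kappa^2}\int R\sqrt{g}\,d\tau + \frac{1}{2}\int\left(\sqrt{g}\,g^{\mu\nu}\varphi_{,\mu}\varphi_{,\nu} - m^2\sqrt{g}\,\varphi^2\right)d\tau \qquad (1)$$


$$g_{\mu\nu} = \delta_{\mu\nu} + \kappa h_{\mu\nu}$$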


Substituting and expanding, and simplifying the results by a notation (a bar over a tensor 
means 


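$$\bar x_{\mu\nu} = \frac{1}{2}\left(x_{\mu\nu} + x_{\nu\mu} - \delta_{\mu\nu}x_{\sigma\sigma}\right);$$


notice that if $x_{\mu\nu}$ is symmetric, $\bar{\bar x}_{\mu\nu} = x_{\mu\nu}$) we get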


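$$\begin{aligned}
L ={}& \int\left(h_{\mu\nu,\sigma}\bar h_{\mu\nu,\sigma} - 2\bar h_{\mu\sigma,\sigma}\bar h_{\mu\nu,\nu}\right)d\tau + \frac{1}{2}\int\left(\varphi_{,\mu}^2 - m^2\varphi^2\right)d\tau \\
&+ \kappa\int\left(\bar h_{\mu\nu}\varphi_{,\mu}\varphi_{,\nu} - m^2\,\tfrac{1}{2}h_{\sigma\sigma}\varphi^2\right)d\tau + \kappa\int(hhh)\,d\tau + \kappa^2\int(hh\varphi\varphi)\,d\tau + \text{etc.} \qquad (2)
\end{aligned}$$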


First, there are terms which are quadratic in $h$; then there are terms which are quadratic
in $\varphi$, the spin zero meson field variable; then there are terms which are more complicated
than quadratic; for example, here is a term with two $\varphi$’s and one $h$, which I will write $h\varphi\varphi$
(I have written that one out, in particular); there are terms with three $h$’s; then there are
terms which involve two $h$’s and two $\varphi$’s; and so on and so on with more and more complicated
terms. The first two terms are considered as the free Lagrangian of the gravitational
field and of the matter.
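
The bar operation introduced above is its own inverse on symmetric tensors, which is why barred and unbarred quantities can be traded freely in what follows. A short check, assuming four space-time dimensions so that $\delta_{\sigma\sigma} = 4$:

```latex
\begin{align*}
\bar x_{\mu\nu} &= x_{\mu\nu} - \tfrac{1}{2}\,\delta_{\mu\nu}\,x_{\sigma\sigma}
  \qquad (x_{\mu\nu}\ \text{symmetric}),\\
\bar x_{\sigma\sigma} &= x_{\sigma\sigma} - \tfrac{1}{2}\cdot 4\,x_{\sigma\sigma} = -\,x_{\sigma\sigma},\\
\bar{\bar x}_{\mu\nu} &= \bar x_{\mu\nu} - \tfrac{1}{2}\,\delta_{\mu\nu}\,\bar x_{\sigma\sigma}
  = x_{\mu\nu} - \tfrac{1}{2}\,\delta_{\mu\nu}\,x_{\sigma\sigma} + \tfrac{1}{2}\,\delta_{\mu\nu}\,x_{\sigma\sigma}
  = x_{\mu\nu}.
\end{align*}
```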


Now we look first at what we would want in order to solve the problem classically. We take the
variation of this with respect to $h$; from the first term we produce a certain combination of
second derivatives, and on the other side a mess involving higher orders than first. And the
same with the $\varphi$, of course.


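$$h_{\mu\nu,\sigma\sigma} - \bar h_{\sigma\nu,\sigma\mu} - \bar h_{\sigma\mu,\sigma\nu} = \bar S_{\mu\nu}(h, \varphi) \qquad (3)$$
$$\varphi_{,\sigma\sigma} - m^2\varphi = \chi(\varphi, h). \qquad (4)$$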


We will speak in the following way: (3) is a wave equation, of which $S_{\mu\nu}$ is the source, just
like (4) is the wave equation of which $\chi$ is the source. The problem is to solve those equations
in succession, and to use the usual methods of calculation of the quantum theory.
Inasmuch as I wanted to get into the minimum of difficulties, I just took a guess that I use 
the same plan as I do in electricity; and the plan in electricity leads to the following sug- 
gestion here: that if you have a source, you divide by the operator on the left side of (3) 
in momentum space to get the propagator field. So I have to solve this equation (3). But 
as you all know it is singular; the entire Lagrangian in the beginning was invariant under 
a complicated transformation of $g$, which in the form of $h$ is the following; if you add to $h$
a gradient plus more, the entire system is invariant:


$$h'_{\mu\nu} = h_{\mu\nu} + 2\xi_{\mu,\nu} + 2h_{\mu\sigma}\xi_{\sigma,\nu} + \xi_{\sigma}h_{\mu\nu,\sigma} \qquad (5)$$


where $\xi_\mu$ is arbitrary, and $\mu$ and $\nu$ should be made symmetric in all these equations. As
a consequence of this same invariance in the complete Lagrangian one can show that the
source $S_{\mu\nu}$ must have zero divergence, $S_{\mu\nu,\nu} = 0$. In fact equations (3) would not be consistent
without this condition, as can be seen by barring both sides and taking the divergence: the
left side vanishes identically. Now, because of the invariance of the equations, in the same
way that the Maxwell equations cannot be solved to get a unique vector potential, so
these can’t be solved and we can’t get a unique propagator. But because of the invariance
under the transformation some arbitrary choice of a condition on $h_{\mu\nu}$ can be made, analogous
to the Lorentz condition $A_{\mu,\mu} = 0$ in quantum electrodynamics. Making the simplest choice
which I know, I make the choice $\bar h_{\mu\sigma,\sigma} = 0$. This is four conditions and I have free the four
variables $\xi_\mu$ that I can adjust to make the condition satisfied by $\bar h_{\mu\sigma}$. Then this equation (3)
is very simple, because two terms in (3) fall away and all we have is that the d’Alembertian
of $h$ is equal to $S$. Therefore the generating field from a source $S_{\mu\nu}$ will equal the $S_{\mu\nu}$ times
$1/k^2$ in Fourier series, where $k^2$ is the square of the frequency, wave vector; the time part
might be called the frequency $\omega$, the space part $\mathbf{k}$. This is the analogue of the equation in
electricity that says that the field is $1/k^2$ times the current. In the method of quantum field
theory, you have a source which generates something, and that may interact later with something
else; the interaction, of course, is $S_{\mu\nu}h_{\mu\nu}$; so that, I say, one source may create a potential
which acts on another source. So, to take the very simplest example of two interacting systems,
let’s say $S$ and $S'$, the result would be the following: $h$ would be generated by $S$
and then it would interact with $S'$; so we would get for the interaction of two systems, of
two particles, the fundamental interaction that we investigate


$$\bar S'_{\mu\nu}\,\frac{1}{k^2}\,S_{\mu\nu} \qquad (6)$$


This represents the law of gravitational interaction expressed by means of an interchange 
of a virtual graviton. To understand the theory better and to see how far we already arrived 
we expand it out in components. Let index 4 represent the time, and 3 the direction of $\mathbf{k}$,
so that 1 and 2 are transverse. The condition $k_\mu S_{\mu\nu} = 0$ becomes $\omega S_{4\nu} = kS_{3\nu}$, where $k$ is
the magnitude of $\mathbf{k}$. Using this, many of the terms involving number 3 component of $S$ can
be replaced by terms in number 4 components. After some rearranging there results 


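$$\begin{aligned}
-2\bar S'_{\mu\nu}\,\frac{1}{k^2}\,S_{\mu\nu} ={}& \frac{1}{\mathbf{k}^2}\left[S'_{44}S_{44}\right] + \frac{1}{\mathbf{k}^2}\left[S'_{44}(S_{11} + S_{22}) + S_{44}(S'_{11} + S'_{22}) + S'_{43}S_{43} - 4S'_{41}S_{41} - 4S'_{42}S_{42}\right] \\
&+ \frac{1}{\mathbf{k}^2 - \omega^2 + i\epsilon}\left[(S'_{11} - S'_{22})(S_{11} - S_{22}) + 4S'_{12}S_{12}\right]. \qquad (7)
\end{aligned}$$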


There is a singular point in the last term when $\omega = k$, and to be precise we put in the $+i\epsilon$
as is well-known from electrodynamics. You note that in the first two terms instead of one
over a four-dimensional $\omega^2 - \mathbf{k}^2$ we have here just $1/\mathbf{k}^2$, the momentum itself. $S_{44}$ is the
energy density, so this first term represents the two energy densities interacting with no $\omega$
dependence which means, in the Fourier transform, an interaction instantaneous in time;
and $1/\mathbf{k}^2$ means $1/r$ in space, so there’s an instantaneous $1/r$ interaction between masses,
Newton’s law. In the next term there’s another instantaneous term which says that Newton’s
mass law should be corrected by some other components analogous to a kind of magnetic
interaction (not quite analogous because the magnetic interaction in electricity already
involves a $\mathbf{k}^2 - \omega^2 + i\epsilon$ propagator rather than just $\mathbf{k}^2$. But the $\mathbf{k}^2 - \omega^2 + i\epsilon$ in gravitation
comes even later and is a much smaller term which involves velocities to the fourth). So
if we really wanted to do problems with atoms that were held together gravitationally it
would be very easy; we would take the first term, and possibly even the second as the interaction.
Being instantaneous, it can be put directly into a Schrödinger equation, analogous
to the $e^2/r$ term for electrical interaction. And that takes care of gravitation to a very high
accuracy, without a quantized field theory at all. However, for still higher accuracy we
have to do the radiative corrections, which come from the last term.
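
The statement that a $1/\mathbf{k}^2$ factor with no frequency dependence means an instantaneous $1/r$ interaction can be made explicit with the standard three-dimensional Fourier transform. The normalization below (factors of $2\pi$) is a conventional choice assumed for this sketch, not taken from the lecture; independence of $\omega$ gives a factor $\delta(t)$ in time.

```latex
\begin{align*}
\int\frac{d^3k}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\mathbf{k}^2}
 &= \frac{1}{(2\pi)^2}\int_0^\infty dk\int_{-1}^{1}d(\cos\vartheta)\;e^{ikr\cos\vartheta}\\
 &= \frac{1}{2\pi^2 r}\int_0^\infty\frac{\sin kr}{k}\,dk
  = \frac{1}{2\pi^2 r}\cdot\frac{\pi}{2}
  = \frac{1}{4\pi r}.
\end{align*}
```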

Radiation of free gravitons corresponds to the situation that there is a pole in the propagator.
There is a pole in the last term when $\omega = k$, of course, which means that the wave
number and the frequency are related as for a mass zero particle. The residue of the pole,
we see, is the product of two terms; which means that there are two kinds of waves, one
generated by $S_{11} - S_{22}$ and the other generated by $S_{12}$, and so we have two kinds of transverse
polarized waves, that is there are two polarization states for the graviton. The linear
combinations $S_{11} - S_{22} \pm 2iS_{12}$ vary with angle $\theta$ of rotation in the 1-2 plane as $e^{\pm 2i\theta}$ so the
graviton has spin 2, component $\pm 2$ along direction of polarization. Everything is clear
directly from the expression (7); I just wanted to illustrate that the propagator (6) of quantum
mechanics and all that we know about the classical situation are in evident coincidence.
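
The spin-2 behaviour of the two transverse combinations can be checked numerically. The sketch below rotates an arbitrary symmetric 2x2 block $(S_{11}, S_{12}, S_{22})$ by an angle $\theta$ in the 1-2 plane and confirms that $S_{11} - S_{22} \pm 2iS_{12}$ only picks up the phase $e^{\pm 2i\theta}$. The function names and sample numbers are illustrative; the sign of the phase depends on whether the rotation is taken as active or passive, and an active rotation $S \to RSR^{T}$ is assumed here.

```python
import cmath
import math


def rotate_12(s11, s12, s22, theta):
    """Components of R S R^T for the rotation R = [[c, -s], [s, c]]
    acting on the symmetric tensor S in the 1-2 plane."""
    c, s = math.cos(theta), math.sin(theta)
    r11 = c * c * s11 - 2.0 * c * s * s12 + s * s * s22
    r22 = s * s * s11 + 2.0 * c * s * s12 + c * c * s22
    r12 = c * s * (s11 - s22) + (c * c - s * s) * s12
    return r11, r12, r22


def helicity_combinations(s11, s12, s22):
    """The two combinations S11 - S22 +/- 2i S12."""
    diff = s11 - s22
    return diff + 2j * s12, diff - 2j * s12


if __name__ == "__main__":
    sample = (0.7, -0.4, 1.9)   # arbitrary symmetric components
    theta = 0.6                 # arbitrary rotation angle in radians
    plus0, minus0 = helicity_combinations(*sample)
    plus1, minus1 = helicity_combinations(*rotate_12(*sample, theta))
    # Ratios should equal exp(+2i theta) and exp(-2i theta) respectively.
    print(plus1 / plus0, cmath.exp(2j * theta))
    print(minus1 / minus0, cmath.exp(-2j * theta))
```

A vector component would pick up $e^{\pm i\theta}$ under the same rotation; the doubled angle is what identifies helicity two.
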
In order to proceed to make specific calculations by means of diagrams, beside the 
propagator we need to know just what the junctions are, in other words just what the S’s 
are for a particular problem; and I shall just illustrate how that’s done in one example. 
It is done by looking at the non-quadratic terms in the Lagrangian; I’ve written one out
completely. This one has an $h$ and two $\varphi$’s in the Lagrangian (2). The rules of the quantum
mechanics for writing this thing are to look at the $h$ and two $\varphi$’s: one $\varphi$ each refers to the
in and out particle, and the one $h$ corresponds to the graviton; so we immediately see in
that term a two particle interaction through a graviton (see Fig. 1). And we can immediately 


Fig. 1 (diagram: a particle line, labeled with its two momenta, coupled to one graviton)


read off the answer for the interaction this way: if $p^1$ and $p^2$ are the momenta of the
particles and $q$ the momentum of the graviton; and $e_{\mu\nu}$ is the polarization tensor of the
plane wave representing the graviton, that is $h_{\mu\nu} = e_{\mu\nu}e^{iq\cdot x}$, the Fourier expansion of this
term gives the amplitude for the coupling of two particles to a graviton


$$p^1_\mu p^2_\nu\,\bar e_{\mu\nu} - \frac{1}{2}m^2 e_{\sigma\sigma} \qquad (8)$$


So this is a coupling of matter to gravity; it is first order, and then there are higher terms; 
but the point I’m trying to make is that there is no mystery about what to write down — 
everything is perfectly clear, from the Lagrangian. We have the propagator, we have the 
couplings, we can write everything. A term like hhh implies a definite formula for the 
interaction of three gravitons; it is very complicated, and I won’t write it down, but you 
can read it right off directly by substituting momenta for the gradients. That such a term 
exists is, of course, natural, because gravity interacts with any kind of energy, including 
its own, so if it interacts with an object-particles it will interact with gravitons; so this is 
the scattering of a graviton in a gravitational field, which must exist. So that everything 
is directly readable and all we have to do is proceed to find out if we get a sensible physics. 
I’ve already indicated that the physics of direct interactions is sensible; and I go ahead 
now to compute a number of other things. 

To take just one example, we compute the Compton effect, or the analogue rather, 
of the Compton effect, in which a graviton comes in and out on a particle. The amplitude
for this is a sum of terms corresponding to the diagrams of Fig. 2. The amplitude for
the first diagram of Fig. 2 is the coupling (8) times the propagator for the intermediate
meson which reads $(p^2 - m^2)^{-1}$, which is the Fourier transform of the equation (4) which
is the propagation of the spin zero particle. Then there is another coupling of the same 
form as (8). We multiply these together, to get the amplitude for that diagram 


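$$\left(p^2_\mu p_\nu\,\bar e^{\,b}_{\mu\nu} - \frac{1}{2}m^2 e^{b}_{\sigma\sigma}\right)\frac{1}{p^2 - m^2}\left(p_\mu p^1_\nu\,\bar e^{\,a}_{\mu\nu} - \frac{1}{2}m^2 e^{a}_{\sigma\sigma}\right),$$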


where we should substitute $p = p^2 + q^b = p^1 + q^a$ (the superscripts on $p^1$, $p^2$, $q^a$, $q^b$ are labels, not powers). Then you must add similar contributions
from the other diagrams.

Fig. 2


The third one comes in because there are terms with two $h$’s and two $\varphi$’s in the Lagrangian.
One adds the four diagrams together and gets an answer for the Compton effect. It is rather 
simple, and quite interesting; that it is simple is what is interesting, because the labour is 
fantastic in all these things. 

But the thing I would like to emphasize is this; in this problem we used a certain wave
$e^a_{\mu\nu}$ for the incoming graviton number “a” say; the question is could we use a different one?
According to the theory, it should really be invariant under coordinate transformations
and so on, but what it corresponds to here is the analogue of gauge invariance, that you can
add to the potential a gradient (see (5)). And therefore it should be that if I changed $e_{\mu\nu}$ of
a particular graviton to $e_{\mu\nu} + q_\mu\xi_\nu$, where $\xi_\nu$ is arbitrary, and $q_\mu$ is the momentum of the graviton,
there should be no change in the physics. In short, the amplitude should be unchanged;
and it is. The amplitude for this particular process is what I call gauge-invariant, or
coordinate-transforming invariant. At first sight this is somewhat puzzling, because you would
have expected that the invariance law of the whole thing is more complicated, including
the last two terms in (5), which I seem to have omitted. But those terms have been included:
you see asymptotically all you have to do is worry about the second term, the last two in
$h$’s times $\xi$’s are in fact generated by the last diagram, Fig. 2D; when I put a gradient in here
for this one, what this means is if I put for the incoming wave a pure gradient, I should get
zero. If I put the gradient $q^a_\mu\xi_\nu$ in for $e^a_{\mu\nu}$, on this term D, I get a coupling between $\xi$
and the other field $e^b_{\mu\nu}$, because of the three graviton coupling. The result, as far as the
matter line is concerned is that it is acted on in first order by a resultant field
$\xi_\sigma e^b_{\mu\nu}q^b_\sigma + 2e^b_{\mu\sigma}\xi_\sigma q^a_\nu$, which is just the last two terms in (5). The rule is that the field which acts on the
matter itself must be invariant the way described by (5); but here in Fig. 2 I’ve already cal-
culated all the corrections, the generator and all the necessary non-linear modifications if I 
take all the diagrams into account. In short, asymptotically far away if I include all kinds 
of diagrams such as D, the invariance need be checked only for a pure gradient added to 
an incoming wave. It takes care of the non-linearities by calculating them through the in- 
teraction. 

I would like, now, to emphasize one more point that is very important for our later 
discussion. If I add a gradient, I said, the result was zero. Let’s call $a$ the one graviton coming
in and $b$ the other one in every diagram. The result is zero if I use a gradient for $a$, only
if $b$ is a free graviton with no source; that is if it is either really an honest graviton
with $(q^b)^2 = 0$, or a pure potential, which is a solution of the free wave equation. That is
unlike electrodynamics, where the field b could have been any potential at all and adding 
a gradient to a would have made no difference. But in gravity, it must be that b is a pure 
wave; the reason is very simple. There is no way to avoid this by changing any propagators; 
this is not a disease — there is a physical reason. The reason can be seen as follows: If this 
b had a source let me modify my diagrams to show the source of b, suppose some other 
matter particle made the $b$, so we add onto each $b$ line a matter line at the end, like Fig. 3a.
(E.g. Fig. 2a becomes Fig. 3b etc.) 


Fig. 3 (panels a, b, c; the added matter lines are labeled “source of b”)


Now, if $b$ isn’t a free wave, but it had a source, the situation is this. If this “a” field is taken
as a gradient field which operates everywhere on everything in the diagram
it should give zero. But we forgot something; there’s another type of diagram, if the “a”
is supposed to act on everything, one of which looks like Fig. 3c, in which the “a” itself
acts on the source of $b$ and then $b$ comes over to interact with the original matter. In other
words, among all the diagrams where there is a source, there’s also these of type 3c. The
sum of all diagrams is zero; but the sum of those like Fig. 2 without those of type 3c is not
zero, and therefore if I were to just calculate the diagrams of Fig. 2 and forget about the
source of $b$ and then put a gradient in for “a” the result cannot be zero, but must be getting
ready to cancel the terms from the likes of 3c when I do it right. That will turn out
to be an important point to emphasize. I have done a lot of problems like this, without 
closed loops but I won’t bore you with all the problems and answers; there’s nothing new, 
I mean nothing interesting, in the sense that no apparent difficulties arise. 
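
The contrast drawn above with electrodynamics can be illustrated numerically. The sketch below is not the gravitational calculation; it swaps in the standard tree-level Compton amplitude of scalar electrodynamics (s-channel, u-channel and seagull terms, overall coupling dropped) and checks that replacing the polarization of photon $a$ by its own momentum gives zero even when photon $b$ is off shell and has an arbitrary polarization. All names and sample numbers are illustrative assumptions.

```python
import math


def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(c, a):
    return tuple(c * x for x in a)


def on_shell(mass, p3):
    """Four-momentum of a particle of given mass and three-momentum."""
    energy = math.sqrt(mass ** 2 + sum(x * x for x in p3))
    return (energy,) + tuple(p3)


def compton_scalar_qed(p1, p2, qa, qb, ea, eb, mass):
    """Scalar p1 absorbs photon a (momentum qa, polarization ea) and emits
    photon b (qb, eb), leaving with p2 = p1 + qa - qb."""
    s_den = mdot(add(p1, qa), add(p1, qa)) - mass ** 2
    u_den = mdot(sub(p1, qb), sub(p1, qb)) - mass ** 2
    s_term = (mdot(ea, add(scale(2, p1), qa))
              * mdot(eb, add(scale(2, p2), qb)) / s_den)
    u_term = (mdot(eb, sub(scale(2, p1), qb))
              * mdot(ea, sub(scale(2, p2), qa)) / u_den)
    seagull = -2.0 * mdot(ea, eb)
    return s_term + u_term + seagull


def gradient_amplitude():
    """Amplitude with ea replaced by the gradient qa, for fixed kinematics."""
    mass = 1.0
    p1 = on_shell(mass, (0.3, -0.2, 0.5))
    p2 = on_shell(mass, (-0.4, 0.1, 0.2))
    qa = (0.7, 0.2, 0.9, -0.3)        # arbitrary, off shell
    qb = sub(add(p1, qa), p2)         # fixed by momentum conservation, off shell
    eb = (0.5, -1.1, 0.4, 0.8)        # arbitrary potential for b
    return compton_scalar_qed(p1, p2, qa, qb, qa, eb, mass)


if __name__ == "__main__":
    print("amplitude with a pure gradient for a:", gradient_amplitude())
```

Only the matter momenta need to be on shell for the cancellation in this electrodynamic analogue. According to the lecture, the gravitational case is different: the cancellation there also requires $b$ to be a free wave, because the gradient must act on the source of $b$ as well.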

However, the next step is to take situations in which we have what we call closed loops, 
or rings, or circuits, in which not all momenta of the problem are defined. Let me just mention
something. I’ve analyzed this method both by doing a number of problems, and by
a mathematical high-class elegant technique — I can do high class mathematics too, but 
I don’t believe in it, that’s the difference. I have to check it in a problem. I can prove that 
no matter how complicated the problem is, if you take it in the order in which there are 
no rings, in which every momentum is determined, the invariance is satisfied, the system 
is independent of what choice I made of gauge and of the propagator I made in the begin- 
ning; and everything is all right, there are no difficulties. I emphasize that this contains all 
the classical cases, and so I’m really saying there are no difficulties in the classical gravita- 
tion theory. This is not meant as a grand discovery, because after all, you’ve been worrying 
about all these difficulties that I say don’t exist, but only for you to get an idea of the cali- 
bration — what I mean by difficulties! If we take the next case, let’s say the interaction of 
two particles in a higher order, then you get diagrams of which I’ll only begin to write 
a few of them. One that looks like this in which two gravitons are exchanged,

Fig. 4


or, for instance, a graviton gets split into two gravitons and then come back — these are 
only the beginning of a whole series of frightening-looking pictures, which correspond to 
the problem of calculating the Lamb shift, or the radiative corrections to the hydrogen 
atom. When I tried to do this, I did it in a straightforward way, following all the rules, putting
in the propagator $1/k^2$, and so on. I had some difficulties, the thing didn’t look gauge invariant
but that had to do with the way I was making the cutoffs, because the stuff is infinite.
Shortage of time doesn’t permit me to explain the way I got around all those things, because
in spite of getting around all those things the result is nevertheless definitely incorrect. 
It’s gauge-invariant, it’s perfectly O.K. looking, but it is definitely incorrect. The reason 
I knew it was incorrect is the following. In order to get it gauge-invariant, I had to do a lot 
of pushing and pulling, and I got the feeling that the thing might not be unique. I figured 
that maybe somebody else could do it another way or something, and I was rather suspicious, 
so I tried to get more tests for it; and a student of mine, by the name of Yura, tested to see 
if it was unitary; and what that means is the following: Let me take instead of this scattering 
problem, a problem of Fig. 4 in which time runs vertically, a problem which gives the same 
diagrams but in which time is running horizontally, which is the annihilation of a pair, to 
produce another pair, and we are calculating second order corrections to that problem. 
Let’s suppose for simplicity that in the final state the pair is in the same state as before.
Then, adding all these diagrams gives the amplitude that if you have a pair, particle and
antiparticle, they annihilate and recreate themselves; in other words it’s the amplitude
that the pair is still in the same state as a function of time. The amplitude to remain in the
same state for a time $T$ in general is of the form


$$e^{-i\left(E_0 - i\frac{\gamma}{2}\right)T};$$


you see that the imaginary part of the phase goes as $e^{-\frac{\gamma}{2}T}$; which means that the probability
of being in a state must decrease with time. Why does the probability decrease in time?
Because there’s another possibility, namely, these two objects could come together, annihilate, 
and produce a real pair of gravitons. Therefore, it is necessary that this decay rate of the 
closed loop diagrams in Fig. 4 that I obtain by directly finding the imaginary part of the sum 
agrees with another thing I can calculate independently, without looking at the closed loop 
diagrams. Namely, what is the rate at which a particle and antiparticle annihilate into two 
gravitons? And this is very easy to calculate (same set of diagrams as Fig. 2, only turned on 
its side). I calculated this rate from Fig. 2, checked whether this rate agrees with the rate 
at which the probability of the two particles staying the same decreases (imaginary part of 
Fig. 4), and it does not check. Something’s the matter.
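
The link between the imaginary part of the loop amplitude and the annihilation rate used in this test can be written out in two lines. The symbols $E_0$ (the real energy) and $\gamma$ (the width) are the names assumed in this sketch for the real and imaginary parts of the complex energy in the persistence amplitude:

```latex
\begin{align*}
A(T) &= e^{-i\left(E_0 - i\gamma/2\right)T} = e^{-iE_0T}\,e^{-\gamma T/2},\\
P(T) &= |A(T)|^2 = e^{-\gamma T},
\qquad -\frac{1}{P}\frac{dP}{dT} = \gamma .
\end{align*}
```

So $\gamma$, read off from the imaginary part of the closed-loop diagrams, must equal the total rate for the pair to go into anything else, here two real gravitons, computed from the tree diagrams alone.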

This made me investigate the entire subject in great detail to find out what the trouble 
is. I discovered in the process two things. First, I discovered a number of theorems, which as 
far as I know are new, which relate closed loop diagrams and diagrams without closed
loops (I shall call the latter diagrams “trees”). The unitarity relation which I have just
been describing, is one connection between a closed loop diagram and a tree; but I found
a whole lot of other ones, and this gives me more tests on my machinery. So let me just tell
you a little bit about this theorem, which gives other rules. It is rather interesting. As a matter
of fact, I proved that if you have a diagram with rings in it there are enough theorems 
altogether, so that you can express any diagram with circuits completely in terms of diagrams 
with trees and with all momenta for tree diagrams in physically attainable regions and on 
the mass shell. The demonstration is remarkably easy. There are several ways of demonstrating
it; I’ll only choose one. Things propagate from one place to another, as I said, with
amplitude $1/k^2$. When translated into space, that’s a certain propagation function which
you might call $K_+(1, 2)$, a function of two positions, 1, 2, in space-time. It represents, in the
past, incoming waves and in the future, it represents outgoing waves; so you have
waves come in and out; and that’s the conventional propagator, with the $i\epsilon$ and so on,
as usually represented. However, this is only a solution of the propagator’s equation,
the wave equation I mean; it is a special solution, as you all know. There are other solutions;
for instance there is a solution which is purely retarded, which I’ll call $K_{\rm ret}$, and which exists
only inside the future light-cone. Now, if you have two Green’s functions for the same
equation they must differ by some solution of the homogeneous equation, say $K_x$. That
means $K_x$ is a solution of the free wave equation and $K_+ = K_{\rm ret} + K_x$. In a ring like Fig. 4a
we have a whole product of these $K_+$’s. For example, for four points 1, 2, 3, 4 in a ring
we have a product like this: $K_+(1, 2)K_+(2, 3)K_+(3, 4)K_+(4, 1)$ (all $K$’s are not the same,
some of them belong to the gravitons and some are propagators for the particles and so on).
But now let us see what happens if we were to replace one (or more) of these $K_+$ by $K_x$,
say $K_+(1, 2)$ is $K_x(1, 2)$? Then between 1, 2 we have just free particles, you’ve broken the
ring; you’ve got an open diagram, because $K_x$ is free wave solution, and this means it’s
an integral over all real momenta of free particles, on the mass shell and perfectly honest.
Therefore if we replace one of $K_+$ by $K_x$ then that particular line is opened; and the process
is changed to one in which there is a forward scattering of an extra particle; there’s a fake 
particle that belongs to this propagator that has to be integrated over, but it’s a free diagram — 
it is now a tree, and therefore perfectly definite and unique to calculate. But I said that I 


could open every diagram; the reason is this. First I note that if I put K,,, for every K in 
a ring, I get zero 


Kye(1, 2)Kyer(2, 3) Kya(3, 4) Kyee(4, 1) = 0 (9) 


for to be non zero ¢,; must be greater than fg, tg > ty, tg >> t, and t, > t, which is impossible. 
Now make the substitution K,,, = K,—K,, in (9). You get either all Ky, in each factor, 
which is the closed loop we want; or at least one K,,, which are represented by tree diagrams. 
Since the sum is zero, closed loops can be represented, as integrals over tree diagrams. I was 
surprised I had never noticed this thing before. 
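
The algebra of this argument can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the propagators are replaced by plain numbers, `k_ret` is a made-up stand-in that vanishes unless its first point is later than its second, and the `k_x` values are arbitrary random numbers rather than solutions of any wave equation. It checks that the product of retarded factors around a ring is zero and that the closed-loop product of $K_+$ factors equals the alternating sum of terms in which one or more lines carry $K_x$.

```python
import itertools
import math
import random


def k_ret(t_a, t_b, strength):
    """Toy retarded factor (hypothetical): nonzero only if point a is later than point b."""
    return strength if t_a > t_b else 0.0


def ring_products(times, strengths, k_x):
    """Return (closed loop, all-retarded product, alternating sum of opened terms).

    Line i joins point i to point i+1 around the ring (cyclically).
    """
    n = len(times)
    ret = [k_ret(times[i], times[(i + 1) % n], strengths[i]) for i in range(n)]
    plus = [ret[i] + k_x[i] for i in range(n)]  # K_+ = K_ret + K_x on each line

    loop = math.prod(plus)     # the closed ring built from K_+
    all_ret = math.prod(ret)   # vanishes: the times cannot decrease all the way round

    # Expand prod(K_+ - K_x) = 0 and solve for the closed loop:
    # loop = sum over non-empty sets of opened lines of (-1)^(r+1) * (opened term).
    opened_sum = 0.0
    for r in range(1, n + 1):
        for opened in itertools.combinations(range(n), r):
            term = math.prod(k_x[i] if i in opened else plus[i] for i in range(n))
            opened_sum += (-1) ** (r + 1) * term
    return loop, all_ret, opened_sum


if __name__ == "__main__":
    random.seed(1)
    times = [random.random() for _ in range(4)]
    strengths = [random.uniform(0.5, 2.0) for _ in range(4)]
    k_x = [random.uniform(-1.0, 1.0) for _ in range(4)]
    loop, all_ret, opened_sum = ring_products(times, strengths, k_x)
    print("all-retarded ring:", all_ret)
    print("closed loop:", loop, "sum of opened terms:", opened_sum)
```

The identity holds for any number of points in the ring, because it uses only the vanishing of the all-retarded product and the relation $K_+ = K_{ret} + K_x$ on each line.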

Well, then I checked whether these diagrams of Fig. 4 when opened into trees agreed 
with the theorem. I mean I hoped that the theorem proved for other meson theories would 
agree in principle for the gravity case, such that on opening a virtual graviton line the 
tree would correspond to forward scattering of free graviton waves. And it does not work 
in the gravity case. But, you say, how could it fail, after you just demonstrated that it ought 
to work? The reason it fails is the following: This argument has to do with the position of 
the poles in the propagators; a typical propagator is a factor $1/(k^2 - m^2 + i\varepsilon)$, the $+i\varepsilon$ due 
to the poles, and all I’m doing here is changing the rule about the poles and picking up an 
extra delta function $\delta(k^2 - m^2)$ as a consequence, which is the free wave coming in and 
out. What I want these free waves to represent in the gravity case are physical gravitons 
and not something wrong. They do represent waves of $q^2 = 0$ of course, but, as it turns 
out, not with the correct polarization to be free gravitons. I’d like to show it. It has to do 
with the numerator, not the denominator. You see the propagator that I wrote before, which 
was $S_{\mu\nu}$ times $1/(q^2 + i\varepsilon)$ times $S'_{\mu\nu}$, is being replaced by $S_{\mu\nu}\,\delta(q^2)\,S'_{\mu\nu}$. Now when I make 
$q^2 = 0$ I have a free wave instead of arbitrary momentum. This should be a real graviton 
or else there’s going to be physical trouble. It isn’t; although it is of zero momentum squared, it 
is not transverse. It does not make any difference in understanding the point, so forget one 
index in $S_{\mu\nu}$ (it’s a lot of extra work to carry the other index) and just imagine there’s one 
index: $S_\mu S'_\mu\,\delta(q^2)$. This combination $S_\mu S'_\mu$ is $S_4S'_4 - S_3S'_3 - S_1S'_1 - S_2S'_2$, where 4 is the 
time and 3 is the direction, say, of momentum of the four-vector q. Then 1 and 2 are transverse, 
and those are the only two we want. (Please appreciate I removed one index; I can 
make it more elaborate, but it is the same idea.) That is, we want only $-S_1S'_1 - S_2S'_2$ instead 
of the sum over four. Now what about this extra term $S_4S'_4 - S_3S'_3$? Well, up to a factor of one half, it is $S_4 - S_3$ times 
$S'_4 + S'_3$ plus $S_4 + S_3$ times $S'_4 - S'_3$. But $S_4 - S_3$ is proportional to $q_\mu S_\mu$ (suppressing one index) 
because $q_4$ in this notation is the frequency and equals $q_3$, if we assume the 3-direction is 
the direction of the momentum. So $S_4 - S_3$ is the response of the system to a gradient 
potential, which we proved was zero in our invariance discussion. Therefore, we have shown 
$(S_4 - S_3)(S'_4 + S'_3) = 0$ and this should be accounted for by purely transverse wave contributions. 
But it isn’t, and it isn’t because the proof that the response to a gradient potential 
is zero required that the other particle that was interacting was an honest free graviton. 
And four plus three in $S'_4 + S'_3$ is not honest; it’s not transverse, it is not a correct kind 
of graviton. You see, the only way you can get a polarization 4+3 going in the 4-3 direction is 
to have what I call longitudinal response; it’s not a transverse wave. Such a wave could only 
be generated by an artificial source here of some silly kind; it is not a free wave. When 
there’s an artificial source for one graviton, even though the other is a pure gradient, the sum 
of all the diagrams does not give zero. If the beam is not exactly that of a free wave, perfectly 
transverse and everything, the argument that the gradient has to be zero must fail, for the 
reason outlined previously. 

Although this gradient for $S_4 - S_3$ is what I want and I hoped it was going to be zero, 
I forgot that the other end of it, $S'_4 + S'_3$, is a funny wave which is not a gradient, and 
which is not a free wave; and therefore you do not get zero and should not get zero, and 
something is fundamentally wrong. 
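
As a worked check of the decomposition used in this argument, the identities can be written out explicitly. The metric convention (time component 4 entering with a plus sign, space components with a minus sign), the primes distinguishing the two ends of the opened line, and the explicit factor 1/2 are assumptions made here for illustration; only the split into transverse and non-transverse pieces is taken from the text.

```latex
% Assumed conventions: signature (+,-,-,-) with index 4 as time,
% q along the 3-axis with q_4 = q_3 (so q^2 = 0); primes mark the other end of the line.
\begin{align}
S_\mu S'_\mu &= S_4 S'_4 - S_3 S'_3 - S_1 S'_1 - S_2 S'_2, \\
S_4 S'_4 - S_3 S'_3 &= \tfrac{1}{2}\Big[(S_4 - S_3)(S'_4 + S'_3) + (S_4 + S_3)(S'_4 - S'_3)\Big], \\
q_\mu S_\mu &= q_4 S_4 - q_3 S_3 = q_4\,(S_4 - S_3).
\end{align}
```

The last line shows why $S_4 - S_3$ is the response to a pure gradient, while the partner factor $S'_4 + S'_3$ is neither a gradient nor a transverse wave.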

Incidentally I investigated further and discovered another very interesting point. 
There is another theory, more well-known to meson physicists, called the Yang-Mills theory, 
and I take the one with zero mass; it is a special theory that has never been investigated 
in great detail. It is very analogous to gravitation; instead of the coordinate transformation 
group being the source of everything, it’s the isotopic spin rotation group that’s the source 
of everything. It is a non-linear theory, that’s like the gravitation theory, and so forth. At 
the suggestion of Gell-Mann I looked at the theory of Yang-Mills with zero mass, which has 
a kind of gauge group and everything the same; and found exactly the same difficulty. And 
therefore in meson theory it was not strictly an unknown difficulty, because it should have 
been noticed by meson physicists who had been fooling around with the Yang-Mills theory. They 
had not noticed it because they’re practical, and the Yang-Mills theory with zero mass 
obviously does not exist, because a zero mass field would be obvious; it would come out 
of nuclei right away. So they didn’t take the case of zero mass and investigate it carefully. 
But this disease which I discovered here is a disease which exists in other theories. So at 
least there is one good thing: gravity isn’t alone in this difficulty. This observation that 
Yang-Mills was also in trouble was of very great advantage to me; it made everything much 
easier in trying to straighten out the troubles of the preceding paragraph, for several reasons. 
The main reason is if you have two examples of the same disease, then there are many things 
you don’t worry about. You see, if there is something different in the two theories it is not 
caused by that. For example, for gravity, in front of the second derivatives of g,, in the 
Lagrangian there are other g’s, the field itself. I kept worrying something was going to happen 
from that. In the Yang-Mills theory this is not so, that’s not the cause of the trouble, and so 
on. That’s one advantage — it limits the number of possibilities. And the second great 
advantage was that the Yang-Mills theory is enormously easier to compute with than the 
gravity theory, and therefore I continued most of my investigations on the Yang-Mills 
theory, with the idea, if I ever cure that one, I'll turn around and cure the other. Because 
I can demonstrate one thing; line for line it’s a translation like music transcribed to a different 
score; everything has its analogue precisely, so it is a very good example to work with. 
Incidentally, to give you some idea of the difference: to calculate this diagram, Fig. 4b, in 
the Yang-Mills case took me about a day; to calculate the diagram in the case of gravitation 
I tried again and again and was never able to do it; and it was finally put on a computing 
machine —I don’t mean the arithmetic, I mean the algebra of all the terms coming in, just the 
algebra; I did the integrals myself later, but the algebra of the thing was done on a machine 
by John Matthews, so I couldn’t have done it by hand. In fact, I think it’s historically 
interesting that it’s the first problem in algebra that I know of that was done on a machine 
that has not been done by hand. 

Well, what then, now you have the difficulty; how do you cure it? Well I tried the 
following idea: I assumed the tree theorem to be true, and used it in reverse. If every closed 
ring diagram can be expressed as trees, and if trees produce no trouble and can be computed, 
then all you have to do is to say that the closed loop diagram is the sum of the corresponding 
tree diagrams, that it should be. Finally in each tree diagram for which a graviton line has 
been opened, take only real transverse gravitons to represent that term. This then serves 
as the definition of how to calculate closed-loop diagrams; the old rules, involving a propagator 
$1/(k^2 + i\varepsilon)$ etc., being superseded. The advantage of this is, first, that it will be gauge invariant, 
second, it will be unitary, because unitarity is a relation between a closed diagram and an 
open one, and is one of the class of relations I was talking about, so there’s no difficulty. 
And third, it’s completely unique as to what the answer is; there’s no arbitrary fiddling 
around with different gauges and so forth, in the inside ring as there was before. So that’s 
the plan. 

Now, the plan requires, however, one more point. It’s true that we proved here that 
every ring diagram can be broken up into a whole lot of trees; but, a given tree is not 
gauge invariant. For instance the tree diagram of Fig. 2A is not. Each one of the four 
diagrams of Fig. 2 is not gauge-invariant, nor is any combination of them except the sum of 
all four. So the thing is the following. Suppose I take all the processes, all of them that 
belong together in a given order; for example, all the diagrams of fourth order, of which 
Fig. 4 illustrates three; I break the whole mess into trees, lots of trees. Then I must gather 
the trees into baskets again, so that each basket contains the total of all of the diagrams of 
some specific process (for example the four diagrams of Fig. 2), you see, not just some 
particular tree diagram but the complete set for some process. The business of gathering the 
tree diagrams together in bunches representing all diagrams for complete processes is important, 
for only such a complete set is gauge invariant. The question is: Will any odd tree diagrams 
be left out or can they all be gathered into processes? The question is: Can we express 
the closed ring diagrams for some process as a sum over various other processes of tree 
diagrams for these processes? 

Well, in the case with one ring only, I am sure it can be done, I proved it can be done 
and I have done it and it’s all fine. And therefore the problem with one ring is fundamentally 
solved; because we say, you express it in terms of open parts, you find the processes that 
they correspond to, compute each process and add them together. 

You might be interested in what the rule is for one ring; it’s the sum of several pieces: 
first it is the sum of all the processes which you get in the lower order, in which you scatter 
one extra particle from the system. For instance, in Fig. 4 we have the rings for two particles 
scattering. There is no external graviton but there are two internal ones; now we compute 
in the same order a new problem in which there are two particles scattering, but while 
that’s happening another particle, for example a graviton scatters forward. Some of the 
diagrams for this are illustrated in Fig. 5. State f is the same state as g; so another graviton 
comes in and is scattered forward. In other words we do the forward scattering of an extra 
graviton. In addition, from breaking matter lines we have terms for the forward scattering 
of an extra positron, plus the forward scattering of an extra electron, and so on; one adds 
the forward scattering of every possible extra particle together. That is the first contribution. 
But when you break up the trees, you also sometimes break two lines, and then you get 
diagrams like Fig. 6 with two extra particles scattering (here a graviton and electron) so it 
turns out you must now subtract all the diagrams with two extra particles of all kinds 
scattering. Then add all diagrams with 3 extra particles scattering and so on. It’s a nice rule, 
it’s quite beautiful; it took me quite a while to find; I have other proofs for other cases 
that are easy to understand. 
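
The sign pattern of this rule (add terms with one extra particle, subtract those with two, add those with three) can be illustrated with a small bookkeeping sketch. It is a hypothetical illustration only: the function name `one_ring_rule` and the line labels are invented here, and the code counts which lines of a ring have been opened without computing any amplitude or checking that each basket forms a complete gauge-invariant process.

```python
from collections import Counter
from itertools import combinations


def one_ring_rule(ring_lines):
    """Group opened-ring terms by the extra particles that scatter forward.

    ring_lines: labels of the internal lines of a single ring.
    Returns {sorted tuple of opened-line labels: signed number of terms}.
    """
    baskets = Counter()
    for r in range(1, len(ring_lines) + 1):
        sign = 1 if r % 2 == 1 else -1  # + for 1 extra particle, - for 2, + for 3, ...
        for opened in combinations(range(len(ring_lines)), r):
            extras = tuple(sorted(ring_lines[i] for i in opened))
            baskets[extras] += sign
    return dict(baskets)


if __name__ == "__main__":
    # Assumed example: a ring with two graviton lines and two matter lines.
    rule = one_ring_rule(["graviton", "graviton", "matter", "matter"])
    for extras, signed_count in sorted(rule.items(), key=lambda kv: (len(kv[0]), kv[0])):
        print(f"{signed_count:+d}  forward scattering of extra {', '.join(extras)}")
```

The signed counts always add up to one, which is just the statement that the alternating sum reproduces a single closed ring.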

Now, the next thing that anybody would ask which is a natural, interesting thing to 
ask, is this. Is it possible to go back and to find the rule by which you could have integrated 
the closed rings directly? In other words, change the rule for integrating the closed rings, 
so that when you integrate them in a more natural fashion, with the new method, it will 
give the same answer as this unique, absolute, definite thing of the trees. It’s not necessary 
to do this, because, of course, I’ve defined everything; but it’s of great interest to do this, 
because maybe I’ll understand what I did wrong before. So I investigated that in detail. 
It turns out there are two changes that have to be made (it’s a little hard to explain in 
terms of the gravitation), of which I’ll only tell about one. Well, I’ll try to explain the other, 
but it might cause some confusion. Because I have to explain in general what I’m doing 
when I do a ring. Mostly what it corresponds to is this: first you subtract from the Lagrangian 
this 

$\int \sqrt{g}\; H\, H\; d\tau$. 


In that way the equation of motion that results is not singular any more. Let me write 
what it really is so that there’s no trouble. You say to me what is this, there’s a g in it and 
an H in it? Yes. In doing a ring, there’s a field variation over which you’re integrating, 
which I call H; and there’s a g, which is the representative of all the outside disturbances 
which can be summarized as being an effective external field g. And so you add to the 
complicated Lagrangian that you get in the ordinary way an extra term, which makes 
it no longer singular. That’s the first thing; I found it out by trial and error before, 
when I made it gauge invariant. But then secondly, you must subtract from the answer 
the result that you get by imagining that in the ring which involves only a graviton 
going around, instead you calculate with a different particle going around, an artificial, 
dopey particle coupled to it. It’s a vector particle, artificially coupled to the external 
field, so designed as to correct the error in this one. The forms are evidently invariant, 
as far as your g-space is concerned; these are like tensors in the g world; and therefore 
it’s clear that my answers are gauge invariant or coordinate transformable, and all that’s 
necessary. But they are also quantum-mechanically satisfactory in the sense that they are unitary. 

Now, the next question is, what happens when there are two or more loops? Since 
I only got this completely straightened out a week before I came here, I haven’t had time 
to investigate the case of 2 or more loops to my own satisfaction. The preliminary 
investigations that I have made do not indicate that it’s going to be possible so easily 
to gather the things into the right barrels. It’s surprising, I can’t understand it; when you 
gather the trees into processes, there seems to be some loose trees, extra trees. I don’t 
understand them at the moment, and I therefore do not claim that this method of 
quantization can be obviously and evidently carried on to the next order. In short, 
therefore, we are still not sure of the radiative corrections to the radiative corrections to 
the Lamb shift, the uncertainty lies in energies of the order of magnitude of 10-*° 
rydbergs. I can therefore relax from the problem, and say: for all practical purposes 
everything is all right. In the meantime, unfortunately, although I could retire from 
the field and leave you experts who are used to working in gravitation to worry about 
this matter, I can’t retire on the claim that the number is so small and that the thing is 
now really irrational, if it was not irrational before. Because, unfortunately, I also discovered 
in the process that the trouble is present in the Yang-Mills theory; and secondly 
I have incidentally discovered a tree-ring connection which is of very great interest and 
importance in the meson theories and so on. And so I’m stuck to have to continue this 
investigation, and of course you all appreciate that this is the secret reason for doing any 
work, no matter how absurd and irrational and academic it looks; we all realize that no 
matter how small a thing is, if it has physical interest and is thought about carefully enough, 
you’re bound to think of something that’s good for something else. 

Møller: May I, as a non-expert, ask you a very simple and perhaps foolish question. 
Is this theory really Einstein’s theory of gravitation in the sense that if you would have 
here many gravitons the equations would go over into the usual field equations of Einstein? 


Feynman: Absolutely. 

Møller: You are quite sure about it? 

Feynman: Yes, in fact when I work out the fields and I don’t say in what order I’m 
working, I have to do it in an abstract manner which includes any number of gravitons; 
and then the formulas are definitely related to the general theory’s formulas; and the invariance 
is the same; things like this that you see labelled as loops are very typical quantum-mechanical 
things; but even here you see a tendency to write things with the right derivatives, 
gauge invariant and everything. No, there’s no question that the thing is the Einsteinian 
theory. The classical limit of this theory that I’m working on now is a non-linear 
theory exactly the same as the Einsteinian equations. One thing is to prove it by equations; 
the other is to check it by calculations. I have mathematically proven to myself so many 
things that aren’t true. I’m lousy at proving things; I always make a mistake. I don’t 
notice when I’m doing a path integral over an infinite number of variables that the Lagrangian 
does not depend upon one of them, the integral is infinite and I’ve got a ratio of two 
infinities and I could get a different answer. And I don’t notice in the morass of things that 
something, a little limit or sign, goes wrong. So I always have to check with calculations; 
and I’m very poor at calculations; I always get the wrong answer. So it’s a lot of work 
in these things. But I’ve done two things. I checked it by the mathematics, that the forms 
of the mathematical equations are the same; and then I checked it by doing a considerable 
number of problems in quantum mechanics, such as the rate of radiation 
from a double star held together by quantum-mechanical force, in several orders and 
so on, and it gives the same answer in the limit as the corresponding classical problem. 
Or the gravitational radiation when two stars (excuse me, two particles) go by each 
other, to any order you want (not for stars, then they have to be particles of specified properties; 
because obviously the rate of radiation of the gravity depends on the give of the 
stars; tides are produced). If you do a real problem with real physical things in it then I’m 
sure we have the right method that belongs to the gravity theory. There’s no question 
about that. It can’t take care of the cosmological problem, in which you have matter out 
to infinity, or that the space is curved at infinity. It could be done I’m sure, but I haven’t 
investigated it. I used as a background a flat one way out at infinity. 

Møller: But you say you are not sure it is renormalizable. 

Feynman: I’m not sure, no. 

Møller: In the limit of large number of gravitons this would not matter? 

Feynman: Well, no; you see, there is still a classical electrodynamics; and it’s not 
got to do with the renormalizability of quantum electrodynamics. The infinities come in 
different places. It’s not a related problem. 

Rosen: I’m not sure of this, not being one of the experts; but I have the impression 
that because of the non-linearity of the Einstein equations there exists a difficulty of the 
following kind. If the linear equations have a solution in the form of an infinite plane monochromatic 
wave, there does not seem to correspond to that a more exact solution; because 
you get piling up of energies in space and the solution then diverges at infinity. Could that 
have any bearing on the accuracy of this kind of calculation? 

Feynman: No, I take that into account by a series of corrections. A single graviton 
is not the same thing as an infinite gravitational wave, because there’s a limited energy 
in it. There’s only one $\hbar\omega$. 

Rosen: But you’re using a momentum expansion which involves infinite waves. 

Feynman: Yes, there are corrections. You see what happens if one calculates the corrections. 
If you have here a graviton coming in this way, then there are corrections for such 
a ring as this and so on. And these produce first, a divergence as usual; but second, a 
term in the logarithm of $q^2$; which means that if this thing is absolutely a free plane 
wave, there’s no meaning to the correction. So it must be understood in this way, that 
the thing was emitted some time far in the past, and is going to be absorbed some time 
in the future; and has not absolutely been going on forever. Then there’s a very small 
coefficient in front of the logarithm and then for any reasonable $q^2$, like the diameter of 
the universe or something, I can still get a sensible answer; this is the shadow of the 
phenomenon you’re talking about, that the corrections to the propagation of a graviton are 
dependent on the logarithm of the momentum squared carried by the graviton and 
would be infinite if it were really a graviton of exactly zero momentum squared. And so a free 
graviton just like that does not quite exist. And this is the correction for that. Strictly we 
would have to work with wave packets, but they can be of very large extent compared to 
the wave length of the gravitons. 

Anderson: I’d like to ask if you get the same difficulty in the electromagnetic case 
that you did in the Yang-Mills and gravitational cases? 

Feynman: No, sir, you do not. Gauge invariance of diagrams such as Fig. 2 (there 
is no 2D) is satisfied whether b is a free wave or not. That is because photons are not the 
source of photons; they are uncharged. 

Anderson: The other thing I would like to suggest is that in putting things into 
baskets, you might be able to get by easily by always starting out only with vacuum diagrams 
and opening those successively. 

Feynman: I tried that and it didn’t go successfully. 


Ivanenko: If I understood you correctly, you had used in the initial presentation the 
transmutation of two particles into gravitons. Yes? 


Feynman: It was one of the examples. 

Ivanenko: Yes. This process was considered, perhaps in a preliminary manner, by 
ourselves and by Prof. Weber and Brill. I ask you two questions. Do you possess the effective 
cross-section? Can you indicate the effects for which high-energy processes play an important 
role? 

Feynman: I never went to energies more than one billion-billion BeV. And then the 
cross-sections of any of these processes are infinitesimal. 

Ivanenko: They increase very, very sharply with energy. Yes, because the radiation 
is quadrupole, so it increases sharply in contrast to the electromagnetic transmutation of 
an electron-positron pair. 

Feynman: It increases very sharply indeed. On the other hand, it starts out so low that 
one has to go pretty far to get anywhere. And the distance that you have to go is involved 
in this thing: the thing that’s the analogue of $e^2/\hbar c$ in electricity, which is 1/137, is non-existent 
in gravitation; it depends on the problem; this is so because of the dimensions of 
G. So if E is the energy of some process, then if you take $GE^2/\hbar c$ you get an equivalent to 
this $e^2/\hbar c$. It may be less than that, but at least it can’t be any bigger than this. So in order 
to make this thing be of the order of 1%, in which case the rate is similar to the rate of 
photon annihilation at ordinary energies, we need the $GE^2$ to be of the order of $\hbar c$, and 
as has been pointed out many times, that’s an energy of the order $10^{-5}$ grams, which is 
$10^{18}$ BeV. You can figure out the answer right away; just take the energy that you are interested 
in, square, multiply by G and divide by $\hbar c$; if that becomes something, then you’re 
getting somewhere. You still might not get somewhere, because the cross-section might 
not go up that fast, but at least it can’t get up any worse than that. So I think that in order 
to get an appreciable effect, you’ve got to go to ridiculous energies. So you either have 
a ridiculously small effect or a ridiculous energy. 
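
The recipe "square the energy, multiply by G and divide by $\hbar c$" can be tried out numerically. The sketch below is an illustration under assumptions not stated in the discussion: it works in SI units, restores the factors of c so that the combination $GE^2/(\hbar c^5)$ is dimensionless, treats 1 BeV as 1 GeV, and uses rounded values of the constants. The function names are invented for this example.

```python
import math

# Rounded SI values (assumed for this sketch).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34  # reduced Planck constant, J s
C = 2.998e8        # speed of light, m/s
GEV = 1.602e-10    # joules per GeV (1 BeV = 1 GeV)


def gravitational_coupling(energy_gev):
    """Dimensionless analogue of e^2/(hbar c): G E^2 / (hbar c^5)."""
    energy_joule = energy_gev * GEV
    return G * energy_joule**2 / (HBAR * C**5)


def energy_for_coupling(coupling):
    """Energy in GeV at which G E^2 / (hbar c^5) reaches the given value."""
    return math.sqrt(coupling * HBAR * C**5 / G) / GEV


if __name__ == "__main__":
    for energy in (1.0, 1.0e3, 1.0e18):
        print(f"E = {energy:.1e} GeV -> coupling = {gravitational_coupling(energy):.2e}")
    print(f"coupling of 1 at E = {energy_for_coupling(1.0):.2e} GeV")
    print(f"coupling of 1/137 at E = {energy_for_coupling(1 / 137):.2e} GeV")
```

With these assumptions the combination is of order $10^{-38}$ at 1 GeV and only approaches the electromagnetic value near $10^{18}$ GeV, which is the sense in which one gets either a ridiculously small effect or a ridiculous energy.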

Weber: I have a cross-section which may be a partial answer to Ivanenko’s question. 
Could I write it on the board? We have carried out a canonical quantization, which is not 
as fancy as the one you have just heard about; but considering the interaction of photons 
and gravitons; and it turns out that even in the linear approximation that one has the 
possibility of the graviton production by scattering of photons in a Coulomb field. And the 
scattering cross-section for this case turns out to be $8\pi^2$ times the constant of gravitation 
times the energy of the scatterer times the thickness of the scatterer in the direction of 
propagation of the photon through it divided by $c^4$. This assumes that all of the dimensions 
of the scatterer are large in comparison with the wave length of the photon. We obtained 
this result by quantization, and noticed that it didn’t have Planck’s constant in it, so we 
turned around and calculated it classically. Now, if one puts numbers in this, one finds 
that the scattering cross-section of a galaxy due to a uniform magnetic field through it is 
108 cm$^2$, a much larger number than the object that you talked about. This represents 
a conversion of photons into gravitons of about 1 part in $10^{18}$. This is of course too small 
to measure. Also, we considered the possibility of using this cross-section for a laboratory 
experiment in which one had a scatterer consisting, say of a million gauss magnetic field 
over something like a cubic meter. This turned out to be entirely impossible, a result in 
total contradiction to what has appeared in the Russian literature. In fact, the theory of 
fluctuations shows that for a laboratory experiment involving the production of gravitons 
by scattering of photons in a Coulomb field, the scattered power has to be greater than twice 
the square root of kT times the photon power divided by the averaging time of the experiment. 
I believe that the incorrect results that have appeared in the literature have been 
due to the statement that $\Delta P$ has to be greater than kT over t; dimensionally these things 
are the same, but order of magnitude-wise this kind of experiment for the scatterer of which 
I spoke requires something like 10°° watts. Maybe I can say something about this this 
afternoon; I don’t want to take any more time. 
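
The two formulas quoted verbally here can be evaluated for the laboratory example. The sketch below is a reading of the spoken statement, not Weber's own calculation: it assumes the cross-section is $8\pi^2 G U \ell / c^4$ in Gaussian units, takes the "energy of the scatterer" U to be the magnetic field energy $B^2/8\pi$ times the volume, and reads the noise criterion as $P > 2\sqrt{kT\,P_{photon}/\tau}$ in SI units. All function names and the room-temperature example are invented for illustration.

```python
import math

G_CGS = 6.674e-8      # gravitational constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10      # speed of light, cm/s
K_BOLTZ = 1.380649e-23  # Boltzmann constant, J/K


def field_energy_erg(b_gauss, volume_cm3):
    """Energy of a uniform magnetic field (Gaussian units): B^2/(8 pi) times volume."""
    return b_gauss**2 / (8.0 * math.pi) * volume_cm3


def conversion_cross_section_cm2(energy_erg, thickness_cm):
    """Assumed reading of the quoted formula: 8 pi^2 G U l / c^4, in cm^2."""
    return 8.0 * math.pi**2 * G_CGS * energy_erg * thickness_cm / C_CGS**4


def minimum_scattered_power_watt(temperature_k, photon_power_watt, averaging_time_s):
    """Assumed reading of the noise criterion: 2 sqrt(k T P_photon / tau), in watts."""
    return 2.0 * math.sqrt(K_BOLTZ * temperature_k * photon_power_watt / averaging_time_s)


if __name__ == "__main__":
    # Laboratory example from the text: a million gauss over about a cubic meter.
    energy = field_energy_erg(1.0e6, 1.0e6)               # 1 m^3 = 1e6 cm^3
    sigma = conversion_cross_section_cm2(energy, 100.0)   # 1 m = 100 cm thick
    print(f"field energy = {energy:.2e} erg, cross-section = {sigma:.2e} cm^2")
    print(f"noise floor = {minimum_scattered_power_watt(300.0, 1.0, 1.0):.2e} W")
```

Under these assumptions the laboratory cross-section comes out many orders of magnitude below any measurable size, in line with the conclusion that the experiment is entirely impossible.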

De Witt: I should like to ask Prof. Feynman the following questions. First, to give us 
a careful statement of the tree theorem; and then outline, if he can to a brief extent, the 
nature of the proof of the theorem for the one-loop case, which I understand does work. 
And then, to also show in a little bit more detail the structure and nature of the fictitious 
particle needed if you want to renormalize everything directly with the loops. And if you 
like, do it for the Yang-Mills, if things are prettier that way. 

Feynman: I usually don’t find that to go into the mathematical details of proofs 
in a large company is a very effective way to do anything; so, although that’s the question 
that you asked me — I'd be glad to do it — I could instead of that give a more physical 
explanation of why there is such a theorem; how I thought of the theorem in the first place, 
and things of this nature; although I do have a proof — I’m not trying to cover up. 

De Witt: May we have a statement of the theorem first? 

Feynman: That I do not have. I only have it for one loop, and for one loop the careful 
statement of the theorem is... — look, let me do it my way. First — let me tell you how I 
thought of this crazy thing. I was invited to Brussels to give a talk on electrodynamics — 
the 50th anniversary of the 1911 Solvay Conference on radiation. And I said I’d make 
believe I’m coming back, and I’m telling an imaginary audience of Einstein, Lorentz and 
so on what the answer was. In other words, there are going to be intelligent guys, and I'll 
tell them the answer. So I tried to explain quantum electrodynamics in a very elementary 
way, and started out to explain the self-energy, like the hydrogen Lamb shift. How can 
you explain the hydrogen Lamb shift easily? It turns out you can’t at all — they didn’t 
even know there was an atomic nucleus. But, never mind. I thought of the following. I would 
explain to Lorentz that his idea that he mentioned in the conference, that classically the 
electromagnetic field could be represented by a lot of oscillators was correct. And that 
Planck’s idea that the oscillators are quantized was correct, and that Lorentz’s suggestion, 
which is also in that thing, that Planck should quantize the oscillators that the field is equivalent 
to, was right. And it was really amusing to discover that all that was in 1911. And 
that the paper in which Planck concludes that the energy of each oscillator was not $n\hbar\omega$ 
but $(n + 1/2)\hbar\omega$, which was also in that, was also right; and that this produced a difficulty, 
because each of the harmonic oscillators of Lorentz in each of the modes had an energy 
of $\hbar\omega/2$, which is an infinite amount of energy, because there are an infinite number of modes. 
And that that’s a serious problem in quantum electrodynamics and the first one we have 
to remove. And the method we use to remove it is to simply redefine the energy so that 
we start from a different zero, because, of course, absolute energy doesn’t mean anything. 
(In this gravitational context, absolute energy does mean something, but it’s one of the 
technical points I can’t discuss, which did require a certain skill to get rid of, in making 
a gravity theory; but never mind.) Now look: I make a little hole in the box and I let in 
a little bit of hydrogen gas from a reservoir; such a small amount of hydrogen gas, that 
the density is low enough that the index of refraction in space differs from one by an amount 
proportional to N, the number of atoms. With the index being somewhat changed, the 
frequency of all the normal modes is altered. Each normal mode has the same wavelength 
as before, because it must fit into the box; but the frequencies are all altered. And therefore 
the $\hbar\omega$’s should all be shifted a trifle, because of the shift of index, and therefore there’s 
a slight shift of the energy. Although we subtract $\hbar\omega/2$ for the vacuum, there’s a correction 
when we put the gas in; and this correction is proportional to the number of atoms, and 
can be associated with an energy for each atom. If you say, yes, but you had that energy 
already when you had the gas in back in the reservoir, I say, but let us only compare the 
difference in energy between the 2S and 2P state. When we change the excitation of the 
hydrogen gas from 2S to 2P then it changes its index without removing anything; and the 
energy difference that is needed to change the energy from 2S to the 2P for all these atoms 
is not only the energy that you calculate with disregard of the zero point energy; but the 
fact is that the zero point energy is changed very slightly. And this very slight difference 
should be the Lamb effect. So I thought, it’s a nice argument; the only question is, is it 
true. In the first place it’s interesting, because as you well know the index differs from one 
by an amount which is proportional to the forward scattering for γ rays of momentum k 
and therefore that shift in energy is essentially the sum over all momentum states of the 
forward scattering for γ rays of momentum k. So I looked at the forward scattering and 
compared it with the right formula for the Lamb shift, and it was not true, of course; it’s 
too simple an argument. But then I said, wait, I forgot something. Dirac explained to us 
that there are negative energy states for the electron but that the whole sea of negative 
energy states is filled. And, of course, if I put the hydrogen atoms in here all those electrons 
in negative energy states are also scattering off the hydrogen atoms; and therefore their 
states are all shifted; and therefore the energy levels of all those are shifted a tiny bit. And 
therefore there’s a shift in the energy due to those. And so there must be an additional term 
which is the forward scattering of positrons, which is the same as scattering of negative 
energy electrons. Actually, for the symmetry of things it is better to take half the case where you 
make the positrons the holes and the other half where you make the electrons the holes; 
so it should be 1/2 forward scattering by electrons, 1/2 scattering by positrons and scattering 
by γ rays; the sum of all those forward scattering amplitudes ought to equal the 
self-energy of the hydrogen atom. And that’s right. And it’s simple, and it’s very peculiar. 
The reason it’s peculiar is that these forward scatterings are real processes. At last I had 
discovered a formula I had always wanted, which is a formula for energy differences (which 
are defined in terms of virtual fields) in terms of actual measurable quantities, no matter 
how difficult the experiment may be; I mean I have to be able to scatter these things. Many 
times in studying the energy difference due to electricity (I suppose) between the proton 
and the neutron, I had hoped for a theorem which would go something like this: this energy 
difference between proton and neutron must be equal to the following sum of a bunch of 
cross-sections for a number of processes, but all real physical processes, I don’t care how 
hard they are to measure. So this is the beginning of such a formula. It’s rather surprising. 
It’s not the same as the usual formula; it’s equal to it but it’s not the same. I have no 
formulation of the laws of quantum gravidynamics; I have a proposal on how to make the 
calculations. When I make the proposal on how to do the closed loops, the obvious proposal 
does not work; it gives non-unitarity and stuff like that. So the obvious proposal is no good; 
it works O.K. for trees; so how am I going to define the answer for what would correspond to 
a ring? The one I happen to have chosen is the following: I take the ring. In general, for any 
meson theory, one closed ring can be written as equivalent to a whole lot of processes each 
one of which is trees. I then define, as my belief as to what the ring ought to be in the grand 
theory, that it’s going to be also equal to the corresponding physical set of trees. When I said 
this is equal to this, I didn’t worry about gauge or anything else; what I meant was, if these 
weren’t gravitons but photons or any other neutral object (it doesn’t make any difference 
what they are) this theorem is right. So I suppose it’s right also for real gravitons, and 
I suppose also that what’s being scattered is only transverse and is only a real free graviton 
with $q^2 = 0$. Therefore, I say let this ring equal this set of trees. Every one of these terms 
can be completely computed — it’s a tree. And it’s gauge invariant; that is, if I added an 
extra potential on the whole thing, another outside disturbance of a type which is nothing 
but a coordinate transformation — in short a pure gradient wave — to the whole diagram 
then it comes on to all of these processes; but it makes no effect on any of them, and therefore 
makes no effect on the sum; and therefore I know my definition of this ring is gauge-invariant. 
Second, unitarity is a property of the breaking of this diagram; the imaginary part of this 
equals something; if you take the imaginary part of this side, it’s already broken up, in fact, 
and you can prove immediately that it’s the correct unitarity rule. Therefore it’s going to 
be unitary and so on and so on. And so I therefore define gravity with one ring in this 
way. Now what prevents me from doing it with two rings? The lack of a complete statement 
of what two rings is equal to in terms of processes; that is I can open the ring all right; but 
I can’t put the pieces — the broken diagrams — back together again into complete sets 
that each one is a complete physical process. In other words some of them correspond to 
the scattering of a graviton, but leaving out some diagrams. But the scattering of a graviton 
leaving out diagrams is no longer gauge invariant, I mean, not evidently gauge invariant, 
and so the power of the whole thing collapses. I don’t know what to do with it. So that’s 
the situation; that’s why it is crucial to the particular plan. There’s always, of course, another 
way out. And that’s the following (and that’s what I tried to describe at the end of the talk — 
maybe I talked too fast): After all now I’ve defined what this result is equal to, by definition; 
not that you should do a loop some way and get this, but that a loop is equal to this by 
definition, and I’m not going to do a loop any other way. But, of course, from a practical 
point of view or from the point of view purely of interest, the question is, can you come 
back now and calculate the ring directly by some particular mathematical shenanigans, 
and get the same answer as you get by adding the trees. And I found the way to do that. 
I have another way, in other words, to do the ring integral directly. I have to subtract 
something from a vector particle going around the ring instead of a graviton to get the answer 
right. So I know the rule, and I know why the rule is, and I have a proof of the rule for 
one loop. I have two ways of extending. I can either break this two loop diagram open and 
get it back into the processes, like I did with the one ring, where so far I’m stuck. Or, 
I can take the rule which I found here and try to guess the generalization for any number 
of rings. Also stuck. But I’ve only had a week, gentlemen; I’ve only been able to straighten 
out the difficulty of a single ring a week ago when I got everything cleaned up. It’s more 
than a week — I had to take a lot of time checking and checking; but I was only finished 
checking to make sure of everything for this conference. And of course you’re always 
asking me about the thing I haven’t had time to make sure about yet, and I’m sorry; I worked 
hard to be sure of something, and now you ask me about those things I haven’t had time. 
I hoped that I would be able to get it. I still have a few irons to try; I’m not completely stuck, 
maybe. 
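
The index-of-refraction argument given earlier in this answer can be put in symbols. What follows is only a sketch under assumptions the talk does not spell out: the standard dilute-medium relation between the index and the forward scattering amplitude, a box of volume V (a symbol introduced here), N atoms in the box, and a mode sum over k that is understood to include polarizations.

```latex
% Sketch only; V, f_k(0) and the mode label k are notation assumed for this note.
% Each mode keeps its wavelength (fixed k), so its frequency scales as 1/n:
\omega_k' = \frac{c\,k}{n_k} \approx c\,k\,(1 - \delta n_k), \qquad n_k = 1 + \delta n_k .
% Dilute gas: the index differs from one in proportion to forward scattering,
\delta n_k = \frac{2\pi}{k^2}\,\frac{N}{V}\,\mathrm{Re}\, f_k(0) .
% Shift of the zero-point energy once the vacuum value has been subtracted:
\Delta E_0 = \sum_k \tfrac{1}{2}\hbar\,(\omega_k' - c\,k)
           \approx -\,\frac{N}{V} \sum_k \frac{\pi \hbar c}{k}\,\mathrm{Re}\, f_k(0) .
```

The shift is proportional to N, so it can be assigned atom by atom, and only its change between the 2S and 2P states enters the comparison described above.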

DeWitt: Because of the interest of the tricky extra particle that you mentioned at the 
end, and its possible connection, perhaps, with some work of Dr Bialynicki-Birula, 
have you got far enough on that so that you could repeat it with just a little more detail? 
The structure of it and what sort of an equation it satisfies, and what is its propagator? 
These are technical points, but they have an interest. 

Feynman: Give me ten minutes. And let me show how the analysis of these tree 
diagrams, loop diagrams and all this other stuff is done in a mathematical way. Now I will show 
you that I too can write equations that nobody can understand. Before I do that I should 
like to say that there are a few properties that this result has that are interesting. First of 
all in the Yang-Mills case there also exists a theory which violates the original idea of symmetry 
of the isotopic spin (from which it was originally invented) by the simple assumption that 
the particle has a mass. That means to add to the Lagrangian a term $-\mu^2 a_\mu a_\mu$, where $a_\mu$ 
is an isotopic vector. You add this to the Lagrangian. This destroys the gauge invariance 
of the theory; it’s just like electrodynamics with a mass, it’s no longer gauge-invariant, 
it’s just a dirty theory. Knowing that there is no such field with zero mass people say: “let’s 
put the mass term on”. Now when you put a mass term on it is no longer gauge invariant. 
But then it is also no longer singular. The Lagrangian is no longer singular for the same 
reason that it is not invariant. And therefore everything can be solved precisely. The 
propagator instead of being $\delta_{\mu\nu}$ between two currents is 


$$\frac{\delta_{\mu\nu} - q_\mu q_\nu/\mu^2}{q^2 - \mu^2} \qquad (10)$$

where $q_\mu$ is the momentum of the propagating particle. The factor $1/(q^2 - \mu^2)$ is typical for mass $\mu$ 
but the part $-q_\mu q_\nu/\mu^2$ is an important term which can be taken to be zero in electrodynamics 
but it is not obvious whether it can be taken to be zero in the case of Yang-Mills theory. 
In fact it has been proved it cannot be taken to be zero; this propagator is used between two 
currents. I am using the Yang-Mills example instead of the gravity example. I really want 
only the case $\mu^2 = 0$, and am asking whether I can get there by first calculating finite $\mu^2$, 
then taking the limit $\mu^2 = 0$. 
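
The remark that the $q_\mu q_\nu/\mu^2$ part can be dropped in electrodynamics can be illustrated numerically. The sketch below assumes a metric signature (+, -, -, -) and uses made-up four-vectors; the helper names are hypothetical. It only shows the single-exchange case between conserved external currents and says nothing about Yang-Mills closed loops, where the talk states the term cannot be dropped.

```python
# Metric signature (+, -, -, -) is an assumption of this sketch.
METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(a, b):
    """Minkowski product of two four-vectors given as 4-tuples."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def make_conserved(j, q):
    """Remove the part of j along q so that q.j = 0 (needs q.q != 0)."""
    c = dot(q, j) / dot(q, q)
    return tuple(x - c * y for x, y in zip(j, q))


def exchange_terms(j1, j2, q, mu2):
    """Contract two currents with (delta - q q / mu^2) / (q^2 - mu^2).

    Returns the delta part and the q q / mu^2 part separately.
    """
    denom = dot(q, q) - mu2
    delta_part = dot(j1, j2) / denom
    qq_part = -dot(q, j1) * dot(q, j2) / (mu2 * denom)
    return delta_part, qq_part


# Made-up momentum and currents, for illustration only.
q = (2.0, 0.3, -0.5, 1.0)
raw1, raw2 = (1.0, 2.0, 3.0, 4.0), (0.5, -1.0, 2.0, 0.7)
cons1, cons2 = make_conserved(raw1, q), make_conserved(raw2, q)

for mu2 in (1.0, 1e-3, 1e-6):
    _, qq_conserved = exchange_terms(cons1, cons2, q, mu2)
    _, qq_raw = exchange_terms(raw1, raw2, q, mu2)
    print(mu2, qq_conserved, qq_raw)
```

With conserved currents the second part stays at rounding-error level for every mass, so the limit is harmless; with non-conserved currents it grows like $1/\mu^2$.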

Now, with $\mu^2 \neq 0$ this is a definite propagator and there are no ambiguities at the 
closed rings, the closed loops. I have no freedom, I must compute this propagator. I mean 
there is no reason for trouble, and there is no trouble. There is no gauge invariance either. 

And of course I checked. I broke the rings and I computed by the broken ring theorem 
method a closed loop problem of fair complexity (which in fact was the interaction of two 
electrons). I computed it by the open ring method and by the closed ring method, and 
of course it agreed, there is no reason that it shouldn’t. It turned out that for tree diagrams 
you don’t have to worry about this $q_\mu q_\nu/\mu^2$ term, you can drop it, but not for the closed 
ring, only for trees. Therefore the tree diagrams have a definite limit as $\mu^2$ goes to zero. 
And yet I have the closed ring diagram which is equal to the tree diagram when the mass 
is anything but zero, and therefore it ought to be true that the limit as $\mu^2$ goes to zero of the 
ring is equal to the case when $\mu = 0$. It sounds like a great idea; why don’t you define the 
desired $\mu^2 = 0$ theory that way? Answer: You can’t put $\mu^2$ equal to zero in the form (10). 
You can’t do it because of the $q_\mu q_\nu/\mu^2$. So it was necessary next to see if there is a way to 
re-express the ring diagrams, for the case with $\mu^2 \neq 0$, in a new form with a propagator 
different from (10), that didn’t have a $\mu^2$ in it, in such a form that you can take the limit 
as $\mu^2$ goes to zero. Then that would be a new way to do the $\mu$ equal zero case; and that’s 
the way I found the formula. I'll try to explain how to find that theory. 

We start with a definite theory, the Yang-Mills theory with a mass (the reason I do that 
is that there’s no ambiguity about what I am trying to do) and later on I take the mass to zero. 
Then the theory works something like this. You have the Lagrangian $L(A, \varphi)$ which involves 
the vector potential A of this field and the fields $\varphi$ representing the matter with which this 
object is interacting for zero mass, to which, for finite mass, we add the term $\mu^2 A_\mu A_\mu$. This 
is the Lagrangian that has to be integrated and the idea is that you integrate this over all 
fields A and $\varphi$; and that is the answer for the amplitude of the problem 


$$X = \int e^{\,i\int [L(A,\varphi) + \mu^2 A_\mu A_\mu]\,d\tau}\,\mathcal{D}A\,\mathcal{D}\varphi. \qquad (11)$$


But wait, what about the initial and final conditions? You have certain particles coming 
in and going out. To simplify things (this is not essential) I'll just study the case that 
corresponds only to gravitons in and out. I’ll call them gravitons and mesons even though they 
are vector particles. The question is first, what is the right answer if you have gravitons 
represented by plane waves, $A_1, A_2, A_3, \ldots$ going in (positive frequency in $A_i$) or out 
(negative frequency). You make the following field up. Let $A_{\text{asym}}$ be defined as $\alpha$ times the wave 
function $A_1$ that represents the first graviton coming in a plane wave, plus $\beta$ times $A_2$ plus $\gamma$ 
times $A_3$ and so on. 


$$A_{\text{asym}} = \alpha A_1 + \beta A_2 + \gamma A_3 + \ldots, \qquad A \to A_{\text{asym}} \qquad (12)$$


Then you calculate this integral (11) subject to the condition that A approaches $A_{\text{asym}}$ 
at infinity. The result of this is of course a function of $\alpha, \beta, \gamma, \ldots$ and so on. Then what you 
want for X is just the term first order in $\alpha, \beta, \gamma, \ldots$. That means just one of each of these 
gravitons coming in and out. That’s the right formula for a regular theory, for meson theory. 
You calculate the integral subject to the asymptotic condition, when you imagine all these 
waves, but you take the first order perturbation with respect to each one of the incoming 
waves. You never let the same photon operate twice; a photon operating twice is not a photon, 
it is a classical wave. So you take the derivative of this with respect to $\alpha, \beta, \gamma$ and so on, 
then setting them all equal to zero. That’s the problem. (In general there’s a $\varphi$ asymptotic 
too.) 

Now the way I happened to do this is the following: Let us call $A_0$ the A which satisfies 
the classical equations of motion, which in this particular case will be 


$$\frac{\delta L}{\delta A_\mu}\bigg|_{A = A_0} + 2\mu^2 A_{0\mu} = 0. \qquad (13)$$

I solve this subject to the condition that $A_0$ equals $A_{\text{asym}}$. In other words, I find what is the 
maximum or minimum, whatever it is, of the action in (11), subject to the asymptotic 
condition. That’s the beginning of analysing this. 
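
The prescription stated above (keep only the term that is first order in each of the source strengths, by differentiating once with respect to each and then setting them all to zero) can be sketched on an ordinary function. The generating function below is made up for illustration, and the mixed finite difference is just one convenient way to estimate the derivative; neither is taken from the talk.

```python
from itertools import product


def multilinear_coefficient(f, n_sources, h=1e-2):
    """Estimate d^n f / (d a_1 ... d a_n) at a = 0 by a mixed forward difference."""
    total = 0.0
    for bits in product((0, 1), repeat=n_sources):
        sign = (-1) ** (n_sources - sum(bits))
        total += sign * f(*[h * b for b in bits])
    return total / h ** n_sources


def toy_amplitude(a, b, c):
    """Hypothetical stand-in for the result as a function of alpha, beta, gamma.

    Only the 5*a*b*c term has each source acting exactly once.
    """
    return (1.0 + 2.0 * a + 3.0 * a * b + 5.0 * a * b * c
            + 7.0 * a * a * b + (a * b * c) ** 2)


print(multilinear_coefficient(toy_amplitude, 3))
```

Terms in which a source is missing or acts twice drop out (the latter only up to a correction that vanishes with the step h), which mirrors the statement that a photon operating twice is a classical wave rather than a single photon.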

The next thing is to make the simple substitution $A = A_0 + B$ and put it back in 
equation (11). Then if you take L of $A_0 + B$ (if B is negligible you get L of $A_0$ and so forth) 
so you get something like this 


$$X = e^{\,i\int [L(A_0) + \mu^2 A_0 A_0]\,d\tau} \int e^{\,i\int [L(A_0 + B) - L(A_0) + \mu^2 BB + 2\mu^2 A_0 B]\,d\tau}\,\mathcal{D}B. \qquad (14)$$

The integral is over all B, and B must go to zero asymptotically. This business can be expanded 
in powers of B. 


$$L(A_0 + B) - L(A_0) + \mu^2 BB + 2\mu^2 A_0 B = \mathrm{Quad}(B) + \mathrm{Cubic}(B) + \ldots + \mu^2 BB. \qquad (15)$$


The zeroth power of B is evidently zero. The first power of B is also zero because $A_0$ minimized 
the original thing. So this starts out quadratic in B plus cubic in B plus etc., that’s what 
this is here. These quadratic forms Quad (B) and so on of course depend on $A_0$, the cubic 
form involves $A_0$ in some complex, maybe very complicated, locked-up mess, but as far 
as B is concerned it is second power and higher powers. 
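
The statement that the first power of B drops out can be written in one line. This is a sketch with indices and the $d\tau$ integration suppressed; the functional-derivative notation is an assumption of this note rather than the speaker's.

```latex
% Expand the full exponent of (11) about the classical solution A_0 (indices suppressed).
L(A_0 + B) + \mu^2 (A_0 + B)(A_0 + B)
  = \bigl[ L(A_0) + \mu^2 A_0 A_0 \bigr]
  + \Bigl[ \frac{\delta L}{\delta A}\Big|_{A_0} + 2\mu^2 A_0 \Bigr] B
  + \bigl[ \mathrm{Quad}(B) + \mu^2 BB \bigr]
  + \mathrm{Cubic}(B) + \ldots
% The second bracket is the classical equation of motion evaluated at A_0, so it vanishes.
```

The first bracket is the piece that multiplies the integral as an overall factor, and what remains under the integral starts at second order in B.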

Now I would like to point something out. First — it turns out if you analyze it, that the 
contribution of the first factor here alone (if you had forgotten the integral and called it one) 
is exactly the contribution of all trees to the problem. So that’s like the classical theories 
related to trees. Next, if you drop the term cubic in B in the exponent completely and just 
integrated the result over DB, that corresponds to the contribution from one ring, or from 
two isolated rings, or three isolated rings, but not interlocked rings. If you start to include 
the cubic term it has to come in a second power to do anything, because of the evenness 
and oddness of function. And as soon as it comes in second power, the cubic term, having 
three of these things come together twice, makes a terrible thing like oo which is a double 
ring. So you don’t get to a double ring until you bring a cubic term down to the second 
order. So if I disregard that and just work with this second order term Quad (B) + $\mu^2 BB$, 
I’m studying the contribution from one ring. If I study this I am working from the trees. 
And now you see I have in my hands an expression for the contribution of a ring correct 
in all orders no matter how many lines come in. I also have expressions for the contributions 
from trees and so on. I can compare them in different mathematical circumstances, and 
it’s on this basis that I have been able to prove everything I have been able to prove relating 
one ring to trees. 
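
The parity argument above (a cubic term must appear twice before it can contribute) has a one-variable analog that is easy to check. The sketch below uses a real Gaussian weight as a stand-in for the oscillatory functional integral, which is an assumption made purely for illustration.

```python
import math


def gaussian_moment(power, steps=20_000, cutoff=12.0):
    """Integral of x**power * exp(-x*x/2) over the real line (midpoint rule)."""
    dx = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        x = -cutoff + (i + 0.5) * dx
        total += x ** power * math.exp(-0.5 * x * x)
    return total * dx


norm = gaussian_moment(0)
first_order_cubic = gaussian_moment(3) / norm    # one cubic vertex: odd integrand
second_order_cubic = gaussian_moment(6) / norm   # cubic vertex squared: even integrand
print(first_order_cubic, second_order_cubic)
```

The single cubic insertion averages to zero by oddness, while the squared insertion does not; in the talk's language that surviving second-order piece is where the double ring first appears.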

Now, let me explain how the theorem was obtained that takes the case for the mass 
and for a ring. Now we have to discuss a ring, which is a formula like this 


$$\int e^{\,i\int [\mathrm{Quad}(B) + \mu^2 BB]\,d\tau}\,\mathcal{D}B.$$

The quadratic form involves $A_0$ so the answer depends on $A_0$; it’s some complicated 
functional of $A_0$. Anyway I won’t say that all the time, I'll just remember that. We have 
to integrate over all B. And the difficulty is (not difficulty, but the point is) that this 
quadratic form in B is singular, because it came from the piece of the action that has an 
invariance and this invariance keeps chasing us along. And there are certain transformations 
of B which leave this Quad (B) part unchanged in first order. That transformation in the 
Yang-Mills theory is 


$$B'_\mu = B_\mu + \partial_\mu \alpha + (\alpha \times A_\mu) = B_\mu + \alpha_{;\mu}, \qquad (16)$$


where the vectors are in isotopic spin space and $\alpha$ is considered as first order. This 
transformation leaves the quadratic form invariant so the Quad (B) thing by itself is singular. 
But it doesn’t make any difference, because of the addition of the $\mu^2 BB$. If $\mu^2 \neq 0$, there 
is no problem, but if $\mu^2 \to 0$, I'd be in trouble. 
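
A two-variable toy makes the point about the singular quadratic form concrete. The form below is hypothetical: it is invariant under a common shift of both variables (standing in for the gauge direction), and the mass term is what removes the zero mode.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def quad_plus_mass(mu2):
    """Matrix of the toy form (B1 - B2)**2 + mu2 * (B1**2 + B2**2).

    The first piece is unchanged by B -> B + (s, s), so alone it is singular.
    """
    return [[1.0 + mu2, -1.0], [-1.0, 1.0 + mu2]]


for mu2 in (1.0, 1e-2, 1e-4, 0.0):
    print(mu2, det2(quad_plus_mass(mu2)))
```

The determinant equals mu2 * (2 + mu2), so a Gaussian integral over this form, which goes like one over the square root of the determinant, is finite for any nonzero mass and blows up as the mass is taken to zero.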

I discovered that if I make this change (16) in the actual Lagrangian and carry everything 
up to second order it is exact, in fact because it’s only second order. If I do it with the 
exact change, the thing isn’t invariant, it is only invariant to first order in $\alpha$. But if I make 
the substitution exactly, then I get a certain addition to the Lagrangian, in other words the 
Lagrangian of B′ (this includes the $\mu^2$, the Lagrangian plus the $\mu^2$ term in B) is the Lagrangian 
plus the $\mu^2$ term in B plus something like this 


$$\mu^2 B_\mu \alpha_{;\mu} + \tfrac{1}{2}\mu^2 \alpha_{;\mu}\alpha_{;\mu}.$$


I have to explain that the semicolon is analogous to the semicolon in gravity. The semicolon 
derivative $X_{;\mu}$ means the ordinary derivative of X minus A cross X and that’s the analogue 
of the Christoffel symbols. Anyway, I find out what happens to L when I make this 
transformation. Now comes the idea, the trick, the nonsense: you start with the following thing; 
you say, suppose instead of writing the original terms down, instead of writing the original 
Lagrangian I were to write the following: 


$$\int e^{\,i\int [L(B) + \frac{1}{2}(B_{\mu;\mu} - \alpha_{;\mu;\mu} + \mu^2\alpha)^2]\,d\tau}\,\mathcal{D}\alpha\,\mathcal{D}B.$$


Now I say that the integral over $\alpha$ is some constant or other. So all I have done is to multiply 
my original integral, the one with L of B, by a constant (by L of B I mean the whole thing, I mean 
this whole thing is going to be L of B). That is, if I can claim that when I integrate $\alpha$ I get 
something which is independent of B, which is not self-evident. If I integrate over all $\alpha$ it does 
not look as if it is independent of B, but after a moment’s consideration you see that it is. Because if I can 
solve a certain equation, which is $\alpha_{;\mu;\mu} - \mu^2\alpha = B_{\mu;\mu}$, I can shift the value of $\alpha$ by that amount, 
and then this term would disappear. In other words if I can solve this, and call this solution $\alpha_0$ 
and change $\alpha$ to $\alpha' + \alpha_0$, then the B would cancel and it would only be $\alpha'$ here. I did it a little 
abstractly which is a little easier to explain, therefore, this term that I’ve added can be 
thought of as an integral of the following nature: Integral of some B, plus an operator acting 
on $\alpha$ (this complicated operator is the second derivative and so on) squared $\mathcal{D}\alpha$. And then 
by that substitution I’ve just mentioned, this becomes equal to 1/2 the operator A 
times $\alpha'$ squared $\mathcal{D}\alpha'$, which is equal to the integral of e to the one half of $\alpha'$ times A, the 
operator A, times the operator A times $\alpha'$, integrated over primed $\alpha$. Now when you 
integrate a quadratic form, which is a quadratic with an operator like this, you get one over 
the square root of the determinant of the operator. So this thing is one over the square root 
of the determinant of the operator AA. The determinant of the operator A times A is the 
square of the determinant of A. So this is one over the determinant of the operator A, or 
better it is one over the square root of the determinant of the operator A, squared; you'll see 
in a minute why I like to write it in this way. In other words, when I’ve written this thing 
down I’ve written the answer that I want. Let’s call X the unknown answer that I want. 
Then this is equal to X divided by this determinant’s square root squared. Now comes the 
trick: I now make the change from B to B′. We notice that B changed to B′ is simply... 
oh!, this is wrong, that’s what’s wrong, it should be just this. Now I’ve got it. The change 
from B′ to B is to add something to B. Therefore to the differential of B it adds nothing, 
it’s just shifting the B to a new value. So I make the transformation from B to B′ everywhere. 
So then I have $d\alpha$ and $dB$, and now I have a new thing up here where I make use of the 
formula for L of B′: 


$$L(B') = L(B) + \mu^2 B_\mu \alpha_{;\mu} + \tfrac{1}{2}\mu^2 \alpha_{;\mu}\alpha_{;\mu}.$$


You see there is a certain cross term generated here and another cross term coming from 
expanding this out and the net result, with a little algebra here, is that it becomes L of B, but 
the quadratic term doesn’t cancel out and is left; there’s one half of $B_{\mu;\mu}$ squared; that’s 
from this term; the cross term here cancels the cross term in there; and then we have only 
the quadratic, I mean the $\alpha$ terms 

$$\int e^{\,i\int [L(B) + \frac{1}{2} B_{\mu;\mu}^2]\,d\tau}\,\mathcal{D}B \int e^{\,i\int \frac{1}{2}\mu^2(\alpha_{;\mu}\alpha_{;\mu} + \mu^2\alpha^2)\,d\tau}\,\mathcal{D}\alpha.$$
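
The two determinant facts used in this argument (a quadratic form integrates to one over the square root of the determinant, and the determinant of an operator applied twice is the square of its determinant) can be checked in a finite-dimensional setting. The sketch assumes a real, positive-definite 2x2 matrix as a hypothetical stand-in for the operator, and a damped rather than oscillatory exponent; the factor of 2π per dimension is the kind of constant that the functional integral's normalization absorbs.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def gaussian_integral(m, cutoff=8.0, steps=400):
    """Integral of exp(-x.M.x / 2) over the plane by a midpoint rule."""
    d = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        x = -cutoff + (i + 0.5) * d
        for j in range(steps):
            y = -cutoff + (j + 0.5) * d
            quad = m[0][0] * x * x + (m[0][1] + m[1][0]) * x * y + m[1][1] * y * y
            total += math.exp(-0.5 * quad)
    return total * d * d


OPERATOR = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical symmetric, positive-definite stand-in
SQUARED = matmul2(OPERATOR, OPERATOR)

print(gaussian_integral(OPERATOR), 2.0 * math.pi / math.sqrt(det2(OPERATOR)))
print(gaussian_integral(SQUARED), 2.0 * math.pi / det2(OPERATOR))
```

The first line shows the operator entering once gives one over the square root of its determinant; the second shows the operator entering twice gives one over the determinant itself.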

And the problem is now to do this integral on $\alpha$; well, another miraculous thing happens. 
I have the operator A, but this thing down here is $\alpha A \alpha$, and therefore its result is just the 
determinant once; or the square of this integral is equal to this determinant, or something like 
that. Therefore, when you get all the factors right, X, the unknown, is equal to 


1 . we 

Sachs: I want to ask a question about long-range hopes. Perhaps for irrational reasons 
people are particularly interested in those parts of the theory where there is a possibility of real 
qualitative differences: what do the coordinates or topology mean in a quantized theory, 
and this kind of junk. Now I wonder if you think that this perturbation theory can eventually 
be jazzed up to cover also these kinds of questions? 

Feynman: The present theory is not a theory as it is incomplete. I do not give a rule 
on how to do all problems. I expect of course that if I spend more time on figuring out how 
to untangle the pretzels I shall be able to make it into such a theory. So let’s suppose I did. 
Now you can ask the question would the completed job, assuming it exists, be of any interest 
to esoteric questions about the quantization of gravity. Of course it would be, because it 
would be the expression of the quantum theory; there is today no expression of the quantum 
theory which is consistent. You say: but it’s perturbation theory. But it isn’t. I worked 
on the thing analyzing it in the series of increasing accuracy, but that’s only, obviously, when 
I am doing problems and checking, or doing things like I just did. But even there I haven’t 
said how many times the vector potential $A_0$ is attacking the diagram, there is no limit to 
what order of external lines are involved in the calculation of $A_0$, for example. And so if 
I get my general theorem for all orders, I’ll have some kind of a formulation. The fact is, 
that in such things as electrodynamics and other theories, it has not been possible to figure 
out the consequences of the quantum field theory in the case of strong interactions, because 
of technical difficulties which are not technical difficulties just of the gravitation theory, 
but exist all over the quantum field theory. I do not expect that the gravitational problems 
will be any easier in that region than they are in any other field theory, so I can say very 
little there. But at least one should certainly formulate the theory that you’re trying to 
calculate first, and then find out what the consequences are, before trying to do it the other 
way round. So I think that you'll be frustrated by the difficulties that do appear whenever 
any theory diverges. On the other hand, if you ask about the physical significance of the 
quantization of geometry, in other words about the philosophy behind it; what happens to the 
metric, and all such questions, those I believe will be answerable, yes. I think you would 
be able to figure out the physics of it afterwards, but I won’t think about that until I have 
it completely formulated, I don’t want to start to work out the answer to something unless 
I know what the equation is I am trying to analyze. But I don’t have any doubt that you 
will be able to do something, because after all you are describing the phenomena that you 
would expect, and if you describe the phenomena then you expect you can then find some 
kind of framework in which to talk to help to understand the phenomena. 


RICHARD P. FEYNMAN 


Closed Loop and Tree Diagrams 


INTRODUCTION 


It is the purpose of this paper to discuss some connections between the mathe- 
matical formulas for the amplitude of various processes in field theory when 
calculated in perturbation theory. The diagrams representing such amplitudes 
are generally of two forms; either the lines are interconnected so that a number 
of closed loops are formed, or there are no such loops. In the latter case we call 
the diagram a tree diagram. Examples of each type are given in Figure 1. For 
the moment we shall not be concerned with exactly what particles form the loops 
and the external lines; there is to be no implication that all parts of the loop 
represent propagation of the same particle. 

In a tree diagram (Figure 1a), all the four-momenta of propagation of the 
internal lines are determined by the four-momenta of the external lines. Such a 
diagram can be directly evaluated (for incoming and outgoing states of given 
momenta) without any integration. 

A diagram containing one closed loop (Figure 1b) represents a mathematical 
term involving an integration over a single four momentum which is not 
determined by the external momentum. It is, for example, the momentum of one 
of the propagators in the loop. If there are other loops, the term represented has 
further four momentum integrals; if the diagram has n independent loops, there 
is a 4n-fold integral over the four components of each of n independent 
momentum variables. 

We shall show that any diagram with closed loops can be expressed in terms 
of sums (actually integrals) of tree diagrams. In each of these tree diagrams 
there are, in addition to the external particles of the original closed loop diagram, 
certain particles in the initial and in the final state of the tree diagram. These 
additional particles are of the physical kind that appeared in the closed loop. 
For example, if an electron line appears, the tree diagrams into which the loop is 
converted may contain an extra external line entering and leaving, representing 
a positron of three-momentum $\mathbf{P}$, and of energy $\sqrt{m_e^2 + \mathbf{P}\cdot\mathbf{P}}$ on the electron 
mass shell $m_e$. The incoming and outgoing positron state is the same. That is, 
these diagrams are diagrams appearing in the forward scattering amplitude for 
the extra particles, scattering in the presence of the external particles. The sum 
on the tree diagrams is the sum for the two spin states of the extra positron and 
the integral over all momenta $\mathbf{P}$ of that positron. 

FIGURE 1. 

(a) Tree diagram. 

(b) Diagram involving a single closed loop. 

(c) Diagram containing three independent closed loops. 

If each individual closed loop diagram can be so expressed as tree diagrams, 
it is clear that the sum of all of the diagrams for a given process in any order can 
also be expressed in tree diagrams. But an interesting question arises as to 
whether the resulting multitude of tree diagrams can be represented as a sum of 
sets of tree diagrams, each set representing the complete set of tree diagrams 
expected for some given physical process. (In many cases it can be done and we 
will discuss these matters in the latter part of this paper.) In this way we obtain 
relations among the diagrams for various processes. As a special example of such 
a relation, we are all familiar with the unitarity relation which, for example, 
relates the imaginary part of a forward scattering to the integral over the square 
of certain particle production amplitudes. Our relations are more general. 

These theorems are very close to those developed by Cutkosky. We are not 
sure whether they may be of any practical use in understanding field theory or 
strong interactions, but they have helped the author to resolve some ambiguities 
in the quantum theory of gravitation and of Yang-Mills vector mesons of zero 
mass. This application appears in the following paper. 


REDUCTION OF A SINGLE LOOP TO TREES 


A typical single closed loop diagram, like that of Figure 1b, will lead to an 
integral of the form 


$$A = \int \frac{d^4k}{(2\pi)^4}\, N(k) \prod_i I_i(k) \qquad (1)$$

where N(k), the numerator, is some polynomial in the components of the four 
vector $k_\mu$, and $I_i(k)$ is a factor for the i-th virtual line of the form 


$$I_i(k) = \frac{i}{(k - p_i)^2 - m_i^2 + i\eta} \qquad (2)$$

in the limit $\eta \to +0$, where $m_i$ is the mass of the i-th particle and $p_i$ is some 
momentum, being the sum of the momenta brought in by the external lines 
starting from some point on the loop and adding up to line i (that is, $p_{i+1} = p_i$ + 
the momentum brought into the loop by the external line at the junction of 
propagator i and i + 1). Any spin 1/2 particles have had their usual propagator 
$(\not{p} - m)^{-1}$ rationalized to $(\not{p} + m)(p^2 - m^2)^{-1}$. The character of the poles of 
$I_i$ is defined in the usual way by the $i\eta$. It ensures that positive frequencies are 
propagated forward in time, and negative frequencies backward. We shall 
suppose the integral converges; there are enough powers of k in the denominator 
D to offset those in N(k) and $d^4k$. If it does not converge, the physical 
difficulties are offset by one or another convergence technique (such as 
differentiating with respect to some of the $m^2$ and integrating back again) which 
relates the integral to others of the same type which do converge. These things 
can also be done with our theorems but we will not discuss them here. 

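If we write $k_\mu = (\omega, \mathbf{K})$, $p_i = (E_i, \mathbf{P}_i)$, our factor (2) is 


$$I_i(k) = \frac{i}{2\epsilon_i}\left[\frac{1}{(\omega - E_i) - \epsilon_i + i\eta} - \frac{1}{(\omega - E_i) + \epsilon_i - i\eta}\right] \qquad (3)$$

where $\epsilon_i$ is written for $+\sqrt{(\mathbf{K} - \mathbf{P}_i)^2 + m_i^2}$ and we have written a new $i\eta$ to 
explain the position of the poles of $\omega$, viz. at $E_i + \epsilon_i$ just below the real axis, and 
at $E_i - \epsilon_i$ just above. Thus, there are poles both above and below the real axis in 
$\omega$, the integral (from $d^4k = d\omega\, d^3\mathbf{K}$) along the real axis. By changing the location 
of the pole in the last term, we can also write 


$$I_i(k) = I_i^R(k) + \frac{\pi}{\epsilon_i}\,\delta(\omega - E_i + \epsilon_i) \qquad (4)$$

where 

$$I_i^R(k) = \frac{i}{2\epsilon_i}\left[\frac{1}{(\omega - E_i) - \epsilon_i + i\eta} - \frac{1}{(\omega - E_i) + \epsilon_i + i\eta}\right] \qquad (5)$$

and we have used 

$$\frac{i}{2\epsilon_i}\left[\frac{1}{(\omega - E_i) + \epsilon_i + i\eta} - \frac{1}{(\omega - E_i) + \epsilon_i - i\eta}\right] = \frac{\pi}{\epsilon_i}\,\delta(\omega - E_i + \epsilon_i). \qquad (6)$$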


If each $I_F$ in the integral A of equation (1) is replaced by the sum (4), we shall
have an integral A', like A, but with all the $I_F$ replaced by $I_R$, plus a residual in which
one or more of the factors replacing $I_F$ has a δ function. This integral A' is
evidently zero, for all the poles are below the real axis of $\omega$, so the contour can be
closed in $\omega$ by a large semicircle above the real axis yielding zero for the integral
(if the original integral on $\omega$ converges, the integral on the semicircle approaches
zero for large enough radius). Thus, the integral A is expressible by a sum of
terms each of which contains at least one δ function. But such terms are related
to tree diagrams.
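
The identity (6), on which this decomposition rests, can be checked numerically at small but finite η. The following is only a sketch under stated assumptions: the Gaussian test function, the value of `EPS`, and the midpoint-rule integrator are illustrative choices, not part of the text. It integrates the left side of (6) against a smooth function and compares the result with what (π/ε) δ(x) would give.

```python
import math

def smeared_pole_difference(x: float, eta: float, eps: float) -> complex:
    """Left side of (6) at finite eta, with x standing for omega - E_i + eps_i."""
    return (1j / (2.0 * eps)) * (1.0 / complex(x, eta) - 1.0 / complex(x, -eta))

def integrate_against(test_fn, eta, eps, half_width=10.0, steps=80_000):
    """Midpoint-rule integral of test_fn(x) times the left side of (6)."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for n in range(steps):
        x = -half_width + (n + 0.5) * h
        total += test_fn(x) * smeared_pole_difference(x, eta, eps)
    return total * h

def gaussian(x: float) -> float:
    return math.exp(-x * x)

EPS = 1.7     # hypothetical value of eps_i (assumption)
ETA = 1.0e-2  # small but finite eta (assumption)

lhs = integrate_against(gaussian, ETA, EPS)
rhs = (math.pi / EPS) * gaussian(0.0)  # what (pi/eps) * delta(x) would give
print(lhs, rhs)
```

The two numbers agree up to a correction of order η, which is the sense in which the limit η → +0 is meant.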

If a term has just one δ function, as in (6), the integral over $\omega$ can of course be
done immediately; contributions appear only for

$$\omega - E_i = -\sqrt{(\mathbf{K} - \mathbf{P}_i)^2 + m_i^2}, \qquad (7)$$

that is, for such $\omega$ that particle i is on its mass shell (as an antiparticle). That is,
this equation (7) can be used to substitute for $\omega$. The integrand, if all the other
factors are $I_R$'s, is just like the amplitude for a tree diagram for the same external
lines but with one extra antiparticle of the type i of momentum K' = K − P_i,
energy $\sqrt{(\mathbf{K}')^2 + m_i^2}$ entering and leaving (for example, see Figure 2b). The
remaining integral over $d^3\mathbf{K}$ (or equivalently over $d^3\mathbf{K}'$) is an integral over all
momenta of the extra antiparticle. It is not precisely the amplitude for a tree
diagram because the propagator in the remaining lines of the loop still present is
$I_R$ instead of $I_F$. But that we shall remedy in just a moment.

A term with δ functions arising from two of the factors, with the rest of the
factors replaced by $I_R$, becomes a disjoint diagram (if there is just one closed
loop) consisting of the product of two separate pieces (see Figure 2c). One kind 
of antiparticle comes in with momentum K’, say, and another goes out with 
momentum Q, which is entirely determined by K’ and the external lines. In the 
other factor the antiparticle Q goes in and K’ comes out. We sum all polariza- 
tions and integrate over all momenta K’. Another way of expressing this, 
disregarding the fact that the diagram is disjoint, is to say we have a diagram 
with two antiparticles K’, Q coming in, and two going out. All the couplings of 
the various particles must be exactly as prescribed by the connections in the 
original closed loop. More δ functions correspond to more extra antiparticles in
and out.

The fact that the propagators in these tree diagrams are $I_R$ instead of $I_F$ may be
an annoyance, but it is easily remedied by substituting in these tree diagrams for
$I_R$ via equation (4) reversed:


$$I_R(k^2) = I_F(k^2) - \frac{\pi}{\sqrt{\mathbf{K}^2 + m^2}}\,\delta\!\left(\omega + \sqrt{\mathbf{K}^2 + m^2}\right) \qquad (8)$$
where
$$I_F(k^2) = i\,(k^2 - m^2 + i\eta)^{-1}.$$


The result is, algebraically, best seen by writing, first, in place of equation (1), 
the relation A’ = 0, valid because all the poles are on one side of the axis: 


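$$A' = \int \frac{d^4k}{(2\pi)^4}\, N(k) \prod_i I_R^{(i)}(k) = 0 \qquad (9)$$

and substituting (8) in this

$$\int \frac{d^4k}{(2\pi)^4}\, N(k) \prod_i \left[ I_F^{(i)}(k) - \frac{\pi}{\epsilon_i}\,\delta\!\left(\omega - E_i + \sqrt{(\mathbf{K} - \mathbf{P}_i)^2 + m_i^2}\right) \right] = 0. \qquad (10)$$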


On expanding the product (10), the term with all $I_F$ is the closed loop (A,
equation 1) desired, whereas all the other terms are the diagrams with various
numbers of extra antiparticles scattering forward (each times −1 for each extra
antiparticle), and this time all the remaining propagators are the standard $I_F$.
Since the sum vanishes, we have expressed the closed loop in terms of tree 
diagrams. 
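
The bookkeeping of this expansion can be illustrated with plain numbers. In the sketch below, which is only an illustration, the lists `feynman` and `delta` are hypothetical stand-ins for the factors $I_F^{(i)}$ and the δ-function pieces at one fixed momentum; the real objects are distributions under an integral. The code expands the product term by term and groups the terms by the number m of opened lines, each carrying the sign (−1)^m.

```python
from itertools import combinations
from math import prod

def expand_opened_loop(feynman, delta):
    """Group the terms of prod_i (f_i - d_i) by the number m of delta factors."""
    n = len(feynman)
    by_count = {}
    for m in range(n + 1):
        subtotal = 0.0
        for opened in combinations(range(n), m):
            # Lines in `opened` carry a delta piece, all others keep I_F.
            term = prod(delta[i] if i in opened else feynman[i] for i in range(n))
            subtotal += (-1) ** m * term
        by_count[m] = subtotal
    return by_count

feynman = [0.8, -1.3, 2.1]  # stand-ins for I_F on three lines (assumption)
delta = [0.5, 0.2, -0.7]    # stand-ins for the delta-function pieces (assumption)

terms = expand_opened_loop(feynman, delta)
retarded_product = prod(f - d for f, d in zip(feynman, delta))
print(terms, retarded_product)
```

Here `terms[0]` plays the role of the closed loop and the entries with m ≥ 1 play the role of the opened diagrams. When the full product vanishes, as it does under the integral in (10), the m = 0 term equals minus the sum of the others.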


THE PROCESS OF OPENING LOOPS 


Let us call this process, using the principle of equation (10) to express the loop 
in terms of diagrams with extra particles (that is, integrals with at least one δ
function), “opening the loop.” Evidently we could have shifted the first pole in 
the expression (3) and obtained a different opening formula, in which particle 
scattering tree diagrams replace antiparticle scattering. It corresponds to con- 
sidering the loop as going around in the opposite direction. 

If there is more than one loop in the original diagram, the loops may be 
opened in succession. Choose any one loop; that is, integration over any one 
virtual momentum k, leaving the others to integrate later. Then this loop can be 
opened. What results is a sum and integral over diagrams with extra
particles, but which still has loops remaining in it. However, there is now one
fewer loop, and in each remaining loop all the propagators are $I_F$ (if equation 10
is used). Therefore, a remaining loop may be treated in the same way, thus
reducing the number of loops still further, until there are none left. 

We can understand things very nicely if we look at what we have been doing 
in coordinate space. A single closed loop (without external propagators) can be 
written in the form (the symbol 1 stands for the space-time positions $x_{1\mu}$ of the
point 1, etc.):

$$L = \int\!\cdots\!\int \mathrm{Tr}\,[\chi_n(n)\, I_F(n, n-1) \cdots \chi_3(3)\, I_F(3, 2)\, \chi_2(2)\, I_F(2, 1)\, \chi_1(1)\, I_F(1, n)]\; d^4x_1\, d^4x_2 \cdots d^4x_n \qquad (11)$$

where $I_F(2, 1)$ as a function of the four vector $x_2 - x_1$ is the Fourier transform
of the propagator $I_F(p)$ for the particle going from 1 to 2 (all these I's are not
identical, because each should correspond to the mass of the appropriate
particle, but we leave out the superscript i for the sake of simplicity). The
various χ's are potentials or operators (derivatives, isospin operators, etc.); each
is, however, local, depending on the one space-time point or a few of its
derivatives.
Thus, $I_F(2, 1)$ is defined as the solution of

$$(\Box^2 - m^2)\, I_F(2, 1) = i\,\delta^4(2, 1) \qquad (12)$$

where the d'Alembertian operates on variable 2 and $\delta^4(2, 1)$ is the four-
dimensional δ function $\delta(t_2 - t_1)\,\delta(x_2 - x_1)\,\delta(y_2 - y_1)\,\delta(z_2 - z_1)$, the solution being
the one which has only positive frequencies for $t_2 > t_1$ and negative frequencies
for $t_2 < t_1$. That is,

$$I_F(2, 1) = \int \frac{d^3\mathbf{P}}{2E_P (2\pi)^3}\, e^{-iE_P(t_2 - t_1)}\, e^{i\mathbf{P}\cdot(\mathbf{x}_2 - \mathbf{x}_1)} \quad \text{for } t_2 > t_1$$

$$\phantom{I_F(2, 1)} = \int \frac{d^3\mathbf{P}}{2E_P (2\pi)^3}\, e^{-iE_P(t_1 - t_2)}\, e^{i\mathbf{P}\cdot(\mathbf{x}_2 - \mathbf{x}_1)} \quad \text{for } t_2 < t_1 \qquad (13)$$

where $E_P = +\sqrt{\mathbf{P}^2 + m^2}$.
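We could also solve (12) with a different boundary condition, say that
$I_R(2, 1)$ is the retarded solution, zero for $t_2 < t_1$, thus

$$(\Box^2 - m^2)\, I_R(2, 1) = i\,\delta^4(2, 1),$$

$$I_R(2, 1) = \int \frac{d^3\mathbf{P}}{2E_P (2\pi)^3} \left( e^{-iE_P(t_2 - t_1)} - e^{+iE_P(t_2 - t_1)} \right) e^{i\mathbf{P}\cdot(\mathbf{x}_2 - \mathbf{x}_1)} \quad \text{for } t_2 > t_1$$

$$\phantom{I_R(2, 1)} = 0 \quad \text{for } t_2 < t_1.$$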


Then, since $I_F$ and $I_R$ satisfy the same inhomogeneous linear equation, the
difference between them, $I_+$ ($= I_F - I_R$), satisfies the homogeneous equation for
a free particle

$$(\Box^2 - m^2)\, I_+(2, 1) = 0 \qquad (14)$$
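and

$$I_+(2, 1) = \int \frac{d^3\mathbf{P}}{2E_P (2\pi)^3}\, e^{+iE_P(t_2 - t_1)}\, e^{i\mathbf{P}\cdot(\mathbf{x}_2 - \mathbf{x}_1)}$$

for both $t_2 > t_1$ and $t_2 < t_1$.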

Now it is evident that the expression in equation (11) vanishes if all the $I_F$ are
replaced by $I_R$. That is because $I_R(2, 1)$ vanishes unless $t_2 > t_1$, so that if all the
$I_F$ were $I_R$ we would get zero unless $t_2 > t_1$, $t_3 > t_2$, ..., $t_n > t_{n-1}$, $t_1 > t_n$, which
is impossible. We cannot go around a closed ring with time always increasing
and get back to our starting point.
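
The time-ordering argument can be made concrete with a small sampling experiment. This is a sketch only: it keeps nothing of the retarded propagator except its support condition (nonzero only when the later time exceeds the earlier one), and the sample size and seed are arbitrary choices. It shows that the product of support factors around a closed ring is always zero, while the same product along an open chain is not.

```python
import random

def step(later: float, earlier: float) -> int:
    """Support of a retarded propagator I_R(later, earlier): 1 only if later > earlier."""
    return 1 if later > earlier else 0

def closed_ring_support(times):
    """Product of step factors around the ring 1 -> 2 -> ... -> n -> 1."""
    n = len(times)
    product = 1
    for j in range(n):
        product *= step(times[(j + 1) % n], times[j])
    return product

def open_chain_support(times):
    """Same product for an open chain 1 -> 2 -> ... -> n (no closing factor)."""
    product = 1
    for j in range(len(times) - 1):
        product *= step(times[j + 1], times[j])
    return product

rng = random.Random(0)
samples = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(10_000)]
ring_hits = sum(closed_ring_support(t) for t in samples)
chain_hits = sum(open_chain_support(t) for t in samples)
print(ring_hits, chain_hits)
```

The open chain is the situation of a tree diagram built from retarded propagators, which is why trees survive and closed loops do not.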

If we substitute now $I_R = I_F - I_+$ in this expression, equation (11) with all
$I_F$ replaced by $I_R$, we get a relation between the term with all $I_F$ (which is the
loop L we want) and a number of terms with $I_+$. Each term with $I_+$ is evidently the
scattering by an extra free antiparticle.

The sign is determined by what direction we choose to go around the loop in
writing $I_R(2, 1)$. That is, draw an arrow around the loop in the direction of
increasing time used in the original expressions with $I_R$. The labelling of a
particle as electron or positron, say, is in accordance with this arrow and the
proper quantum numbers in the diagram. Thus, for electron-electron
bremsstrahlung scattering in Figure 2a, drawing the arrow in the direction 1, 2, 3, 4,
we see that the line 1, 2 corresponds to an electron; when opened, it means an
extra positron coupled at 1, 2, as in Figure 2b. On the other hand, the arrow sees
the particle from 3 to 4 as a positron and opening this section means an extra
electron coupled there (Figure 3a). In the case of a particle identical to its
antiparticle, we still have something to decide for the particular diagram and way of
opening. In the case above, opening 2, 3 we have the tree 2c to add to 2b and 3a,
but not the tree 3b (in this way of opening). That is, we want the photon coupled
at 2 to bring in positive energy $\omega_q$, and at 3 to remove energy $\omega_q$, as in 2c and
not the other way around as in 3b. This is for opening a particular diagram. What
happens when we open all the diagrams for a given process we shall discuss later.

FIGURE 2. A closed loop diagram (a) and some of the tree diagrams (b), (c) into
which it can be broken up.

In order to put our equation in a little neater general mathematical form, 
consider first the case of a closed loop formed by a charged scalar meson (mass
m) field $\phi$ coupled to a neutral meson field $\chi$ via a term $\int \phi^*(1)\chi(1)\phi(1)\, d\tau_1$ in the
Lagrangian. Let it be a loop with n external lines corresponding to neutral
mesons of momentum $q_a$, $q_b$, etc., and we wish the sum as all the lines change their
order around the loop (see Figure 4). By choosing this simple example, we will
clear our minds of complications. 

FIGURE 3. Panels (a), (b). See text.

FIGURE 4. See text.

From what we have said, the loop with $I_R$ inside, which is zero, can be written
as a sum of terms. The first term is the loop L. The next term with one $I_+$ has one
line or another open and a positive meson coming in. This is a scattering
amplitude in the n potentials for a positive meson of momentum P. Let us call
<P'|T|P> the scattering amplitude of such a meson from P to P' in the potential
χ (in this case, to n-th order); <P'|T|P> does not contain the $\delta_{PP'}$ corresponding
to the positive meson going right through without interaction. Then if L was the
sum of the loops with the various $\chi_a$, $\chi_b$ in any order, it is seen that the first term
has them in every order also. We are to integrate over all momenta P of
the incident positive meson, symbolized by $\Sigma_P$. We shall end up by proving
that


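$$0 = L - \sum_P \langle P|T|P\rangle + \tfrac{1}{2} \sum_{P,Q} \langle P|T|Q\rangle\langle Q|T|P\rangle - \tfrac{1}{3} \sum_{P,Q,R} \langle P|T|Q\rangle\langle Q|T|R\rangle\langle R|T|P\rangle + \cdots \qquad (15)$$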


The first two terms are now clear. The third term is that coming from two $I_+$
factors. Between these two factors, the various $\chi_i$ are distributed in various ways
(for example, $\chi_a$, $\chi_c$, $\chi_d$ to the first, and $\chi_b$, say, to the second). That is, in
<P|T|Q><Q|T|P> we imagine the first scattering to be in a field such as $\chi_a$, $\chi_c$, $\chi_d$,
and the second in $\chi_b$. That is, the product of two scatterings in a potential
calculated so that the product is of the requisite order in the potentials. The factor
1/2 is because putting, say, $\chi_a$, $\chi_c$, $\chi_d$ in the first factor and $\chi_b$ in the second factor
is equivalent to putting $\chi_a$, $\chi_c$, $\chi_d$ in the second and $\chi_b$ in the first.

This relation (15) is the general relation for any loop. If there are quantum 
numbers like isotope spin on the virtual particle, of course these are to be
summed over in the $\Sigma_P$. If a loop consists of several different particles in different
propagator sections, such as electron 1 → 2, photon 2 → 3, this is abstractly
equivalent to some single particle which can change its type from electron to
photon, etc., under the influence of the external lines. Therefore, if the $\Sigma_P$
includes a sum also over the types of particle in the ring and the external
potentials χ carry "quantum numbers," which determine how they change the
virtual particle's "quantum numbers," the formula can be used directly.


RULES FOR COMPLETE PROCESSES 


We should like to develop some rules for complete processes. That is, we wish 
to add all diagrams for a given process and, if it contains closed loops, try to 
express it in terms of the sum of tree diagrams for other processes. 

We start, for example, with our charged scalar mesons in an external scalar
potential χ. Suppose now that χ is a fixed external potential. We know how to
express the amplitude for scattering with external lines $\chi_a$, $\chi_b$, etc., in terms of the
scattering amplitude S[χ(1)] in an arbitrary potential by functional derivatives.
Viz., the amplitude with three external scalar mesons in and out, $\chi_a$, $\chi_b$, $\chi_c$, is

$$\int\!\!\int\!\!\int \chi_a(1)\,\chi_b(2)\,\chi_c(3)\, \frac{\delta^3 S}{\delta\chi(1)\,\delta\chi(2)\,\delta\chi(3)}\; d^4x_1\, d^4x_2\, d^4x_3. \qquad (16)$$


Therefore, we begin by calculating in a general external potential. 

Evidently (15) is then valid where L is the loop in the external potential, and 
all the amplitudes <P|T|Q> are scattering in this potential. But L is not the 
diagram for a complete process. On the other hand, $e^L$ is. It is the amplitude that
an initial vacuum (that is, no charged scalar meson) will still be a vacuum in the
final state in spite of the action of the potentials (it is $C_v$ in reference [1]). To
study it, we take the exponential of both sides of (15) and expand it out to obtain:


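$$\begin{aligned}
1 = {} & e^{L} - \sum_P \langle P|T|P\rangle\, e^{L} \\
& + \tfrac{1}{2} \sum_{P,Q} \left[ \langle P|T|P\rangle\langle Q|T|Q\rangle + \langle Q|T|P\rangle\langle P|T|Q\rangle \right] e^{L} \\
& - \tfrac{1}{3!} \sum_{P,Q,R} \left[ \langle P|T|P\rangle\langle Q|T|Q\rangle\langle R|T|R\rangle + 3\langle P|T|Q\rangle\langle Q|T|P\rangle\langle R|T|R\rangle \right. \\
& \qquad\qquad \left. {} + 2\langle P|T|Q\rangle\langle Q|T|R\rangle\langle R|T|P\rangle \right] e^{L} + \text{etc.} \qquad (17)
\end{aligned}$$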


Each term on the right side is the complete amplitude for some definite process:

$e^L$ = amp vac to vac,

<P|T|P>$e^L$ = amplitude that an initial state with one positive meson ends in the
same state (forward scattering of one meson in the potential), assuming it
interacts. (Remember, <P'|T|P> does not contain the direct $\delta_{PP'}$ term.) We sum
over all values of P.

[<P|T|P><Q|T|Q> + <Q|T|P><P|T|Q>]$e^L$ is the amplitude that two positive
mesons in the initial state, P, Q, are both scattered forward correctly, including
exchange for the Bose particles (so the final state is <P|<Q| + <Q|<P|). The
factor 1/2 normalizes the states or, put another way, the 1/2 connects the sum on
both P and Q to the sums over each pair P, Q counted once.

The next term is an analogous forward scattering from the state of three 
positive mesons with momenta P, Q, R to the same state, again including 
exchange. 
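
The combinatorics behind (15) and (17) can be checked in a finite toy model. The sketch below assumes, purely for illustration, that the amplitudes <P|T|Q> form a small 3 × 3 matrix `T` with invented entries, so that the sums over P, Q, R become traces of powers of T. Under that assumption the series (15) gives L = Tr T − Tr T²/2 + Tr T³/3 − ..., and the symmetrized n-meson terms of (17) follow from a standard recursion for symmetric sums.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def power_sums(t, n_max):
    """p[k] = Tr(T^k) for k = 1..n_max; p[0] is a placeholder."""
    p, power = [0.0], t
    for _ in range(n_max):
        p.append(trace(power))
        power = matmul(power, t)
    return p

# Hypothetical small "scattering matrix" <P|T|Q> (assumption, not from the text).
T = [[0.10, 0.05, -0.02],
     [0.03, -0.08, 0.04],
     [0.01, 0.02, 0.06]]
N_MAX = 30
p = power_sums(T, N_MAX)

# Equation (15) solved for L: L = p1 - p2/2 + p3/3 - ...
L = sum((-1) ** (k + 1) * p[k] / k for k in range(1, N_MAX + 1))

# Symmetrized n-particle forward sums: n * h_n = sum_k p_k * h_{n-k}.
h = [1.0]
for n in range(1, N_MAX + 1):
    h.append(sum(p[k] * h[n - k] for k in range(1, n + 1)) / n)

one_plus_T = [[T[i][j] + (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]
right_side_17 = math.exp(L) * sum((-1) ** n * h[n] for n in range(N_MAX + 1))
print(right_side_17, math.exp(L), det(one_plus_T))
```

In this toy model `h[2]` and `h[3]` reproduce the brackets of (17), including the factors 1/2 and 1/3!, and the alternating sum times $e^L$ returns 1 to rounding accuracy. The vacuum amplitude $e^L$ coincides with det(1 + T) here, which is a property of the finite matrix model rather than a statement made in the text.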

Thus we have for charged bosons in an external potential: 


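$$\begin{aligned}
1 = {} & \langle \text{no meson} \,|\, \text{no meson} \rangle - \sum \langle \text{forward scattering of one positive meson} \rangle \\
& + \sum \langle \text{forward simultaneous scattering of two positive mesons} \rangle \\
& - \sum \langle \text{forward scattering of three mesons} \rangle + \text{etc.} \qquad (18)
\end{aligned}$$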


This is an interesting relationship between amplitudes for various real physical 
processes. How general is it? Obviously it applies as well to scalar external 
potentials, like χ, as to vector external potentials, as in the electrodynamics of
charged scalar mesons. And the charged mesons could be of other spin than zero
also, if you like. It will not work directly if the mesons of the loop are neutral, a
case to be discussed below. It will work if the mesons of the loop are not all the
same kind and the propagators are successively of different masses and spins.
But what is essential is that in every state the particle is distinct from the
antiparticle; the equation is meaningless unless the phrase "positive
extra meson scattering forward" has content. For if we have two such mesons in
the third term, say, we do not wish them to be able to annihilate each other. 

There is a corresponding theorem for Fermi particles. Consider electrons in an
external potential (for example, electrodynamics). Equation (15), which opens a
loop, is general and does not depend in any way on statistics. It works for spin
1/2 as well as any other; it even works when parts of the loop are spin 1/2 and other
parts have other spin (as we have seen). It is therefore valid for electrons in an
external potential.

The amplitude to go from a state with no electrons or positrons to the same
state is, however, $e^{-L}$ for Fermi statistics, rather than $e^{+L}$ for Bose. Let us
expand the exponential of the negative of each side of (15). We get


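$$\begin{aligned}
1 = {} & e^{-L} + \sum_P \langle P|T|P\rangle\, e^{-L} + \tfrac{1}{2} \sum_{P,Q} \left( \langle P|T|P\rangle\langle Q|T|Q\rangle - \langle P|T|Q\rangle\langle Q|T|P\rangle \right) e^{-L} \\
& + \tfrac{1}{3!} \sum_{P,Q,R} \left( \langle P|T|P\rangle\langle Q|T|Q\rangle\langle R|T|R\rangle - 3\langle P|T|Q\rangle\langle Q|T|P\rangle\langle R|T|R\rangle \right. \\
& \qquad\qquad \left. {} + 2\langle P|T|Q\rangle\langle Q|T|R\rangle\langle R|T|P\rangle \right) e^{-L} + \cdots \qquad (19)
\end{aligned}$$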


It is evident that the terms represent scatterings, with exchange, into anti- 
symmetric states, just as expected for the Fermi statistics. We have then, in the 
Fermi case, 


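$$\begin{aligned}
1 = {} & \langle \text{no positron} \,|\, \text{no positron} \rangle \\
& + \sum_P \langle \text{forward scattering of one positron} \rangle \\
& + \sum \langle \text{forward scattering of two positrons} \rangle \\
& + \sum \langle \text{forward scattering of three positrons} \rangle, \text{ etc.} \qquad (20)
\end{aligned}$$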


Since with Fermi particles we can always distinguish particle and antiparticle, 
(19) is valid for any Fermi particles, charged or neutral, or having other indices 
such as the octet of nucleons (since the antiparticles, called positrons here, 
would be the distinct octet of antinucleons) in scalar or vector meson external 
fields, etc. 
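
The Fermi expansion (19) can be checked in the same kind of finite toy model. As before, and only as an illustrative assumption, the amplitudes <P|T|Q> are taken to be a 3 × 3 matrix `T` with invented entries. The antisymmetrized n-positron sums then obey the recursion used below, and $e^{-L}$ is evaluated as 1/det(1 + T), which is the closed form of the series (15) in this model whenever that series converges.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def det(m):
    """Determinant by Laplace expansion along the first row (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def power_sums(t, n_max):
    """p[k] = Tr(T^k) for k = 1..n_max; p[0] is a placeholder."""
    p, power = [0.0], t
    for _ in range(n_max):
        p.append(trace(power))
        power = matmul(power, t)
    return p

# Hypothetical <P|T|Q> over N = 3 one-particle states (assumption, not from the text).
T = [[0.9, -0.4, 0.3],
     [0.2, 1.1, -0.5],
     [-0.6, 0.7, 0.4]]
N = 3
p = power_sums(T, N + 1)

# Antisymmetrized n-particle forward sums: n * e_n = sum_k (-1)^(k-1) e_{n-k} p_k.
e = [1.0]
for n in range(1, N + 2):
    e.append(sum((-1) ** (k - 1) * e[n - k] * p[k] for k in range(1, n + 1)) / n)

one_plus_T = [[T[i][j] + (1.0 if i == j else 0.0) for j in range(N)] for i in range(N)]
exp_minus_L = 1.0 / det(one_plus_T)       # e^{-L} in this finite model
right_side_19 = exp_minus_L * sum(e[:N + 1])
print(right_side_19, e)
```

Two features are worth noting. The sum terminates: `e[4]` vanishes because four particles cannot be placed antisymmetrically in three states, the toy-model version of the exclusion principle. And unlike the Bose case, no smallness of T is needed for the finite sum itself.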

Equation (19) is easy to demonstrate directly. We take the case of electrons
and positrons. Consider a state with no electrons in the positive energy
states and, in the old view of Dirac, none in the negative energy sea either. There
being no electrons with which to interact, the external potentials would have no
effect; the amplitude to remain in this same state is unity. Put in more modern
language, consider an initial state which I will call "Dirac vacuum," which
consists of no electrons present, but every possible positron state occupied. Then a
little consideration will show that an external potential can do nothing. It
cannot scatter an electron, for there are none. It cannot scatter a positron to another
state, for they are all full. It cannot annihilate a positron, for no electrons are
present to annihilate, and it cannot create a pair, for all the positron states are
already occupied. Thus, the Dirac vacuum remains undisturbed, the amplitude
for it to do so being 1, the left side of equation (19).

But we can calculate this same amplitude another way, for it is well known 
that we can disregard the exclusion principle in intermediate states, as long as we 
are careful to make either the initial or final state antisymmetric. 

Now, this amplitude, that the Dirac vacuum remains so in a potential, can be 
analyzed in the following way (following the conventions of reference [1]). 
There is some amplitude that none of the positrons interact with the potential 
and all go right through; this is just $e^{-L}$. Or one positron, say of momentum P,
may be scattered by the potential, the rest not. If the others are not, the final
state of this positron must be the same in spin and momentum. The amplitude
for this is <P|T|P>$e^{-L}$, but we must sum over all the positrons which can do this
(sum over two spins and all three-momentum). Again, perhaps only two of the 
positrons interact, the others not. If they are of momentum P, Q, they must 
either have these same momenta or else they interchange states. This leads to the 
third term, etc. 

We could have started with this simple derivation of (20) and then taken
logarithms to obtain (15), and finally argued that (15) is valid without respect to
statistics in view of its nature involving only one loop. But this close connection
no longer surprises us when we realize that the propagator appropriate to the old
Dirac view of electrons in negative energy states is just $I_R$. That is, in each case
we compare an amplitude calculated with $I_R$ to one calculated with $I_F$.

I have not found a derivation for the Bose equation (18) as simple as the one 
just presented (using the Dirac vacuum) in the Fermi case. 

It is evident, of course, that although we have used extra positrons in writing 
equation (20), we could just as well have used extra electrons and obtained 
another formula with electron scattering. It is not easy to find a simple expression 
in which the $\Sigma_P$ can mean sum over electron and over positron states both. Just
adding 1/2 of equation (20) with positrons to 1/2 of equation (20) with electrons will
not do. The term in the scattering of two particles will not include the scattering 
of one electron and one positron. 

Thus, I have not found a simple expression for processes in the case of neutral 
Bose particles (I mean a particle equivalent to its own antiparticle, like photons). 
It is to be remarked that the vacuum to vacuum amplitude is $e^{L/2}$ because in the
expression L in equation (15) I mean the loop calculated considering an order
of potential interactions a, b, c,..., d to be distinct from the order d,..., c, b, a
(as for charged bosons), while for neutral bosons there is no distinction of going
around the loop one way or the other, and the contribution is 1/2 of the expression
used in (15). In any perturbation expansion it is possible, if artificial quantum
numbers are put on the particles, to keep track. An expression for processes with 
neutral particles would be very useful. (We emphasize equation (15) is valid in 
any case—we are discussing the extensions to processes, analogous to equation 
(19) or (17).) Such an expression could (like (15)) be immediately generalized 
to any situation at all, with loops containing particles of different kinds. 
How much more general can we make (19)? First of all, we can differentiate
both sides with respect to the external field to any order to obtain relations 
valid in perturbation theory—but all such results are probably obtained more 
easily and in greater generality directly from (15). 

But (18) and (20) apply also to interacting particles. For example, the
amplitude $T_{e^2}[A]$ for a process with particles mutually interacting via electromagnetic
fields can be obtained from the amplitude $T_0[A]$ for the same process in an
external potential by suitably summing over various functions A. Explicitly,

$$T_{e^2}[A] = \int T_0[A + B]\, e^{iS(B)}\, \mathcal{D}B \qquad (21)$$


where S(B) is the action of the electromagnetic field [2]. Hence, performing this 
operation on both sides of equation (20), or (18), we find the relation valid for 
interacting particles. An expression such as (19) is no longer valid, because the 
two-particle scattering can no longer be written in the form of independent 
products <P|T|P><Q|T|Q> — <P|T|Q><Q|T|P> when there is interaction. 

In the expressions found in this way, closed loops of interacting fermions (or 
charged bosons) can be re-expressed in terms of scatterings without such loops 
(that is, the diagrams associated with charge renormalization, photon-photon 
scattering, etc.). However, all closed loops are not thereby eliminated, as 
diagrams such as Figure 2a, or self-energy diagrams, etc., are not opened by this 
process. 

It is also clear from the argument for (20), starting from the Dirac vacuum, 
that the equation is equally valid if the particles mutually interact. 

We might ask if we can generalize (17) by aiming not for the vac vac amplitude
($e^L$) but for a specific amplitude for antiparticle (or particle) scattering, say for a
particle to go from state x to state y. The true amplitude for this is <y|x>$e^L$. But
if we calculate it by multiplying <y|x> by (17), we get

$$\langle y|x\rangle = \langle y|x\rangle\, e^{L} - \sum_P \langle y|x\rangle\langle P|P\rangle\, e^{L} + \cdots, \text{ etc.}$$


The left side is a simple tree diagram, but the right is not a sum of processes.
The first term is satisfactory, being the amplitude for antiparticle scattering from
x to y. But the second term is not the scattering of an antiparticle at x and at P
to one at y and P, for the exchange (or annihilation, if x is a particle) term is
missing. The expression for this scattering should be

$$\sum_P \left( \langle P|P\rangle\langle y|x\rangle + \langle y|P\rangle\langle P|x\rangle \right) e^{L}.$$


It is possible to sum the correct series of processes, but the sum is not <y|x>; it
is $\langle y|x\rangle_R$. We mean by $\langle y|x\rangle_R$ the scattering amplitude for antiparticles at x to
go to y calculated using the retarded propagator $I_R$. It is therefore a tree diagram,
for $I_R$ permits no closed loops. That is,

$$\begin{aligned}
\langle y|x\rangle_R = {} & \langle y|x\rangle\, e^{L} - \sum_P \left[ \langle P|P\rangle\langle y|x\rangle + \langle P|x\rangle\langle y|P\rangle \right] e^{L} \\
& + \tfrac{1}{2} \sum_{P,Q} \left[ \langle P|P\rangle\langle Q|Q\rangle\langle y|x\rangle + \langle P|Q\rangle\langle Q|P\rangle\langle y|x\rangle \right. \\
& \qquad\qquad \left. {} + 2\langle P|P\rangle\langle Q|x\rangle\langle y|Q\rangle + 2\langle P|x\rangle\langle Q|P\rangle\langle y|Q\rangle \right] e^{L} - \cdots
\end{aligned}$$

(If x and y are particles, written $\bar{x}$, $\bar{y}$, the new exchange terms are appropriately
annihilations, etc.) Evidently the terms on the right are scatterings of one
antiparticle from x to y; the scattering of two antiparticles at x, P to y, P including
exchange; the scattering of three antiparticles at x, P, Q to y, P, Q appropriately
symmetrized, etc.

Evidently the general law for charged bosons making a transition from any
initial state I, including any number of negative and positive particles, to a final
state F under the influence of any potentials, or of mutual interaction of any
kind, is

$$\{F|I\}_R = \{F|I\}_F - \sum_P \{F, P|I, P\}_F + \sum \{F, P, Q|I, P, Q\}_F - \text{etc.} \qquad (22)$$

where $\{F|I\}_R$ is the amplitude calculated using the retarded operator for any
propagation of a particle or antiparticle (hence there are no particle loops in
$\{F|I\}_R$); $\{F|I\}_F$ is the amplitude calculated in the usual way and therefore the
one expected in the physical world (there are closed loops in $\{F|I\}_F$). The
quantity $\{F, P|I, P\}_F$ is the usual physical amplitude for scattering from a state
$|I, P\}$ with an extra antiparticle in a state of momentum P and some spin, and
others as in I, to a corresponding state $\{F, P|$ with an antiparticle in the same
spin and momentum state P and the other particles as in F, except that the part
of the amplitude in which the particle P does not interact with anything is to be
omitted. The states I, P and F, P are to be suitably symmetrized as required by
Bose statistics. The sum $\Sigma_P$ is taken over all spin and momenta P of the extra
antiparticle.

The state $|I, P, Q\}$ is the state I with two extra antiparticles and the sum here
is on all the distinct possibilities; that is, 1/2 the sum over all states P and all
states Q.

For particles obeying Fermi statistics the states are antisymmetrized, of course, 
and the minus signs on the right side of (22) should all be plus. 

The scattering calculated via $I_R$ involves only trees in the case of external
potentials. The scattering of an electron, say in third order, appears as Figure
5a, and is nonvanishing only if $t_3 > t_2 > t_1$ (in fact, only if 2 is not outside the
forward light cone of 1, 3 not outside that of 2, and therefore of 1 also, etc.).
For positron scattering, the diagram appears as in Figure 5b; again, 1, 2, 3 are 
in time order, but the external positron connects at the time of the latest 
perturbation, the outgoing one at the time of the earliest perturbation. 

Since all these terms represent forward scatterings, dispersion theory may be 
used in re-expressing them further. I do not know to what extent these formulas 
may be practical or useful in the study of strong interactions or in the simplifica- 
tion of calculations in electrodynamics. 

If we desire not expressions for trees in terms of real processes, but real
processes in terms of trees, we may turn these equations around and write

$$\{F|I\}_F = \{F|I\}_R + \sum_P \{F, P|I, P\}_R + \sum \{F, P, Q|I, P, Q\}_R + \text{etc.} \qquad (22)'$$

for the Bose case, and alternating signs [$(-1)^n$ for n extra particles] for the Fermi
case. I have found these useful in developing the quantum theory of gravitation
(and the Yang-Mills theory of vector mesons of zero mass). There I had
considerable difficulty in finding a prescription for keeping closed loop diagrams
gauge invariant, but no difficulty whatever with tree diagrams. The relations such
as (15) or (22)' helped me to resolve this difficulty, at least for the case of
diagrams having one closed loop. The fact that I have not found the general
expression analogous to (22)' for neutral mesons (particle, antiparticle identity)
has made this approach harder to handle in application to larger numbers of
closed loops. This work will be published later.

FIGURE 5. Time relations of the retarded propagator in electron and in positron
scattering.


MATHEMATICAL EXPRESSION FOR TREE DIAGRAMS 


In this section we shall express in a more formal mathematical way just what the 
tree diagram for a given process is. We shall not be concerned with showing the 
relation of closed loop diagrams to a sum of tree diagrams. We shall limit 
ourselves to Bose fields for ease of representation; for tree diagrams this is no
limitation.

Let there be a number of fields in interaction and call them collectively A, so
A is a vector (function of position) having components for electrons, scalar
mesons, photons, or what have you. The Lagrangian for the system is written
L(A), and is nonquadratic:

$$L(A) = L_2(A) - I(A) \qquad (23)$$

where $L_2(A)$ is quadratic in A and describes all the propagators of the fields, and
I(A) is of higher than second order in the fields A and represents all the
interactions.

The full quantum field amplitude for a given process including trees and closed 
loops can be written, as usual, as an integral over all fields B: 


$$\text{amp with loops} = \int_{B_{\mathrm{asym}} = A_0} \left( \exp i \int L(B)\, d\tau \right) \mathcal{D}B. \qquad (24)$$


How in (24) do we represent the external particles? They are represented by a
number of free particle waves $A_1$, $A_2$, ... coming in and going out. We want
each of them to act once. Thus, we form, with arbitrary coefficients $\alpha_1$, $\alpha_2$, ...,
the expression

$$A_0 = \sum_i \alpha_i A_i \qquad (25)$$

and perform the integral in (24) subject to the condition that asymptotically B
(at +∞ and −∞) equals $A_0$ (representing asymptotically waves coming in for
t → −∞ and going out at t = +∞). Then the amp in (24) depends on $\alpha_1$, $\alpha_2$, ....
The amplitude for the process with one particle coming in with wave $A_1$, one with
wave $A_2$, etc., is just the first order action of each potential and is the coefficient
of $\alpha_1 \alpha_2 \alpha_3 \cdots$ in this expression (24). If external potentials are acting, they are put
into $A_0$ and not differentiated (p. 337).

To be a little more explicit, let us define R(A) as the first functional derivative
of L(A), and $R_2(A)$ likewise of $L_2(A)$,

$$R(A) = \delta\!\left( \int L(A)\, d\tau \right) \Big/ \delta A = R_2(A) - K(A), \qquad (26)$$
$$R_2(A) = \delta\!\left( \int L_2(A)\, d\tau \right) \Big/ \delta A, \qquad (27)$$
$$K(A) = \delta\!\left( \int I(A)\, d\tau \right) \Big/ \delta A. \qquad (28)$$

$R_2$ is then simply a linear differential operator, the "free particle propagator"
whose reciprocal we shall write as $R_2^{-1}$ and define precisely so that at t = +∞
only positive frequencies are propagated, and at t = −∞ only negative
frequencies. Thus, if $R_2 = \Box^2 - m^2$, or $q^2 - m^2$ in momentum space, $R_2^{-1} =
(q^2 - m^2 + i\eta)^{-1}$ in the limit as $\eta \to +0$. What happens when $R_2$ has no inverse
due to a gauge group, such as in gravitation, will not concern us here. It presents
no serious problem for trees, but it does in (24). For details in this case, see my
other contribution to this volume (p. 377).

The asymptotic free particles each satisfy $R_2(A_i) = 0$ or in total $R_2(A_0) = 0$.

Another useful mode of expression which resolves certain ambiguities of
surface integrals at infinity is to suppose that $A_i$, etc., were not strictly free but
came from certain classical sources $s_i$, etc., at infinity, or else to suppose $A_i$ to
be arbitrarily cut off near infinity and define $R_2(A_i) = s_i$. Thus, we write

$$A_0 = R_2^{-1} s \qquad (29)$$


where $s = \sum_i \alpha_i s_i$ is the effective source at infinity. The amplitude with all loops
can also be stated as

$$\int_{B_{\mathrm{asym}} = 0} \exp i \int (L(B) - sB)\, d\tau \;\, \mathcal{D}B. \qquad (30)$$

Now, s could be considered finite if there were external classical potentials
acting, but if we have only quanta coming in and out we shall have to take only
first order in all the $\alpha_i$ in s; thus s is infinitesimal.

All this is standard field theory.

We desire now to express, for the same problem, defined by the same $A_0$,
equation (25), or better yet, by the same sources s, the amplitude for trees (no
loops). It is

$$T = \text{amp with no loops} = \exp i \int (L(A) - As)\, d\tau \qquad (31)$$
where A satisfies the equation

$$R(A) = s. \qquad (32)$$

This equation (32) has the explicit solution

$$A = R_2^{-1}(K(A)) + A_0. \qquad (33)$$

If we wish to avoid speaking of sources s, we should say the amplitude with no
loops is $\exp i \int (L(A) - A R(A))\, d\tau$ where A makes $\int L(A)\, d\tau$ an extremum subject
to $A_{\mathrm{asym}} = A_0$, and therefore given by (33). We should have to be very careful
with ambiguous integrals like $\int A R_2(A)\, d\tau$, which must be interpreted as
$\int (A K(A) + A_0 R_2(A))\, d\tau$.

The result (31) may not be entirely obvious but we outline how it may be
derived. If one considers how a connected tree diagram is formed with a set of
lines $A_1$, $A_2$, ... going in, and one extra, F, coming out, it is clear that we are
constructing an $A''$ being the first order in $\alpha_1$, $\alpha_2$, ... of $A'$ where $A'$ is a solution
of

$$A' = R_0^{-1}(K(A')) + \sum_i \alpha_i A_i \tag{34}$$


as can be seen by solving (34) by iteration and noticing term by term how the 
correct diagrams are formed by the equation, the final amplitude being 


$$t = \int F K(A')\,d\tau.$$
If we imply that the correct order (first in all $\alpha_i$) is to be taken, we may write this
$$t = \int F R_0(A')\,d\tau.$$
Using the method of sources, so we can integrate by parts, this is
$$t = \int s_F A'\,d\tau$$
where $s_F$ is the source of F or $s_F = R_0(F)$.
To get a more convenient and symmetrical form, we would like an expression
not in terms of $A'$ but in terms of A, which comes from (34) with the extra term
$\phi F$ added to the $\sum_i \alpha_i A_i$, and later take the derivative with respect to $\phi$ at $\phi = 0$.
Since we now go to first order in $\phi$, we can write

$$t = \phi \int s_F A\,d\tau \tag{35}$$

(the change from $A'$ to A brings in only a $\phi^2$ term). That is, upon varying a
source by a perturbation $\delta s$, the change in t must be

$$\delta t = \int (\delta s) A\,d\tau. \tag{36}$$
This is solved¹ by
$$t = -\int (L(A) - sA)\,d\tau \tag{37}$$
because its first variation is
$$\delta t = -\int ((\delta L/\delta A) - s)\,\delta A\,d\tau + \int \delta s\,A\,d\tau$$

where $\delta A$ is the change in A induced via equation (32) by the change $\delta s$ in s,
and the first term vanishes by the equation (32) determining A. When it is
considered that for the S matrix we want disjoint as well as connected diagrams
the complete sum becomes the exponential (31).

The analysis of tree diagrams is very close to classical field theory, for it 
involves only the solution of the classical equations of motion (32). This is done, 
however, via (33), with complex boundary conditions. As closed loops are 
expressible in terms of trees, this suggests another way to get from classical to 
quantum field theory without ambiguity. 


CONTRIBUTIONS FROM SINGLE LOOPS 
For formal analysis, the trees may now be separated from the diagrams with
loops by substituting B = A + D in (30) to get (note, in the system with
sources, $A_{\rm asym} = 0$)

$$\int_{D_{\rm asym}=0} \exp\Big[i\int (L(A + D) - s(A + D))\,d\tau\Big]\,\mathcal{D}D; \tag{38}$$

factoring out T, the amplitude for trees from equation (31), we find

$$= T\int \exp\Big[i\int (L(A + D) - L(A) - R(A)D)\,d\tau\Big]\,\mathcal{D}D$$

where the extra factor f, which represents the contributions from loop diagrams only, is

$$f = \int \exp\Big[i\int L_A(D)\,d\tau\Big]\,\mathcal{D}D \tag{39}$$

where the field D goes vacuum to vacuum; all reference to sources or asymptotic
fields is contained in the effective Lagrangian $L_A(D)$ for motion of D in an
external field A,

$$L_A(D) = L(A + D) - D(\delta L(A)/\delta A) - L(A). \tag{40}$$

The Lagrangian $L_A(D)$ is purely of second order and higher in D. We define

$$R_A(D) = \delta L_A(D)/\delta D = R(A + D) - R(A),$$


1 The connected diagram with a total of n lines in and out can also be written as: the first
order in $\alpha_1$ to $\alpha_n$ of $(n - 1)^{-1}\int L(A)\,d\tau$. For if, in (35), we consider F as the line $A_1$, we get
$t = \alpha_1\int s_1 A\,d\tau$, likewise for $A_2$, etc., so summing, $nt = \int sA\,d\tau$.
an operator which has a linear part, coming from the quadratic part $L_{A2}(D)$ of
$L_A(D)$, which we call $R_{A0}(D)$. Thus, $R_{A0}$ is a linear operator (depending upon the
form of A) having an inverse $R_{A0}^{-1}$ defined with the usual quantum boundary
conditions. That is, we can formally study all problems of opening loops with
external lines, by considering what happens when opening a loop without
external lines, but with the mathematically more complex propagator $R_{A0}^{-1}$.
Expanded only to second order, we get

$$f_2 = \int \exp\Big[i\int L_{A2}(D)\,d\tau\Big]\,\mathcal{D}D,$$

which gives the usual closed loop expression, simply single loops with no
external lines, propagator $R_{A0}^{-1}$; that is, $f_2 = e^{L}$, $L = \mathrm{trace}\,(\ln R_{A0})$.

This propagator, say, $(R_{A0}^{-1})_+$, to be more explicit, can, by redefining its poles,
be represented as in equation (4) as a retarded propagator $(R_{A0}^{-1})_{\rm ret}$ plus a free
particle part. That is, $x = (R_{A0}^{-1})_{\rm ret}\,s$ represents a solution of $R_{A0}x = s$ with no
field asymptotically in the past, while $(R_{A0}^{-1})_+\,s = y$ represents a solution of
$R_{A0}y = s$ with quantum boundary conditions. Their difference x − y, satisfying
$R_{A0}(x - y) = 0$, is a solution of the homogeneous equation for antiparticle
scattering (in the external potential, of course). We thus symbolize equation

$$(R_{A0}^{-1})_+ = (R_{A0}^{-1})_{\rm ret} + \text{free scatt.} \tag{41}$$

by the diagrammatic symbolism

[Diagram not reproduced.] (42)


The retarded propagator is represented by a line with a double arrow in the 
direction of the latter terminal. The single arrows on the external lines indicate 
that they carry positive frequency in (at the lower terminal) or out (at the upper 
terminal). Therefore, for the first order perturbation of one closed loop, we have 
(the dot represents the perturbation) 


[Diagram not reproduced.] (43)
The loop with the arrow representing the retarded propagator vanishes, for final
termination time must exceed the initial time, and cannot be equal as the loop 
requires. Thus, one loop is equivalent to one extra particle scattering. 

Opening such a loop gives simply a straight line for a simple particle propagated
from past to future under the influence of the potential A. That is, the
extra particle is in interaction with all the physically real external particles. In 
this case, the diagram when one loop is opened corresponds to a definite order of 
a complete physical problem, namely the forward scattering of one extra particle 
under the influence of all the others. When two connected loops are opened, the 
result is not always exactly the same as a definite physical problem as we shall 
see in the next section. 


OPENING TWO CONNECTED LOOPS 


As we have seen, we now need only to study loops with no external lines, but
with propagator $R_{A0}^{-1}$. If we expand the operator $L_A(D)$ in (39) to still higher
order,

$$L_A(D) = L_{A2}(D) + L_{A3}(D) + L_{A4}(D) + \cdots \tag{44}$$

we shall obtain in the next order the diagrams of Figure 6. We also obtain from
$L_{A4}(D)$ a diagram in the form of a figure eight but when we deal with those in
Figure 6 it is immediately obvious how to include the figure eight. Each junction
in Figure 6 represents a coupling of three lines via $L_{A3}(D)$ and each line represents
an action of a propagator $R_{A0}^{-1}$.

FIGURE 6. The contributions of second order in $L_{A3}(D)$. [Diagrams not reproduced.]

Using (42) we can transform the first diagram of Figure 6 as indicated in
Figure 7. The closed ring of propagators in the second line of Figure 7 vanishes
for it contains a closed ring of retarded commutators. In Figure 7 extra particles
of different asymptotic momenta are indicated by different kinds of lines: dashed
or straight. Opening the second diagram of Figure 6 is direct via (43), so we have
Figure 8. The last diagram in Figure 8 is an entirely disjoint diagram, consisting
of two factors, one being two extra particles scattering into a third, and the other
its inverse.

FIGURE 7. Opening a double ring. [Diagrams not reproduced.]

FIGURE 8. Reduction of the Figure 6 diagrams to trees. [Diagrams not reproduced.]

Incidentally, we can check unitarity by noting that the imaginary part of the
propagator $(R_{A0}^{-1})_+$ is ½ with retarded poles, ½ advanced, or

[Diagram equation for the imaginary part of the propagator not reproduced.]

We can then easily verify from the last line of Figure 8 that the imaginary part of
the diagrams of Figure 8 is just

[Diagram not reproduced.]

as required by unitarity.


The combination of the diagrams A, B, C, in Figure 8, namely 3A +4B+4C 
is not what an ordinary process would give. If we simply draw all the diagrams to 
represent the scattering of two particles on each other in this order, we obtain 
diagrams A, B, C with equal weight, A + B + C. It is true that if we imagined 
the extra particles to carry certain quantum numbers or charges, then by making
rules about what quantum numbers the intermediate propagator can carry the
diagrams can be selected (for example, if the extra particles each carried a
different charge and the intermediate must be neutral, only C survives). It is also
true that in many cases such rules exist which let the combination in Figure 8
remain as a true process. However, for completely neutral objects like gravitons 
the combination of Figure 8, found on opening the diagrams of Figure 6, is not a 
true process. Therefore, an attempt to formulate quantum gravitational rules by 
first saying how trees work and then using them to calculate closed loops is not 
completely straightforward. That is because a property, like gauge invariance, 
which is not valid for each diagram alone, but is valid for each set of diagrams 
representing a complete process, may not be valid if the tree diagrams are 
assembled into closed loops carelessly. The application of the methods in this 
paper to the problem of the quantum theory of gravitation appears in the 
following paper. 
RICHARD P. FEYNMAN


Problems in Quantizing the 
Gravitational Field, and the 
Massless Yang-Mills Field 


INTRODUCTION 


Some years ago I started extensive work on the quantum theory of gravitation, 
studying difficulties in formulating it, as well as detailed analysis of its divergence 
and renormalization characteristics. However, I became involved in other inter- 
ests and the work was never completed. Many people, including Professor 
Wheeler, have asked me what I found. I shall not discuss the renormalization 
work here, but I will take this opportunity to publish the material I have, incom- 
plete though it may be. Therefore, in this paper we shall discuss some problems 
which arise in attempting to formulate a quantum theory of gravitation. I have 
heard that these problems have been solved by others, but I will report how 
the problem looked to me some years ago, correcting, however, some errors in 
the equations for the gravitational case which I discovered while preparing this 
manuscript. 

The questions about making a “quantum theory of geometry” or other con- 
ceptual questions are all evaded by considering the gravitational field as just 
a spin-2 field nonlinearly coupled to matter and itself (one way, for example, is
by expanding $g_{\mu\nu} = \delta_{\mu\nu} + h_{\mu\nu}$ and considering $h_{\mu\nu}$ as the field variable) and
attempting to quantize this by following the prescriptions of quantum field 
theory, as one expects to do with any other field. The central difficulty springs 
from the fact that the Lagrangian is invariant under a gauge group, and therefore 
the propagator is singular and generally undefined. In the classical theory and 
in quantum electrodynamics this problem may be evaded by starting with a 
modified Lagrangian which is no longer completely gauge invariant (for example, 


1 For related references, see note added in proof at the end of the article. 
in electrodynamics add a term like $(\nabla_\mu A_\mu)^2$ to the Lagrangian). This leads to a
nonsingular propagator (for example, to $\delta_{\mu\nu}/q^2$ in electrodynamics) but because
of conservation laws that the sources satisfy (a consequence of the original gauge 
group invariance) it can be shown that the extra term added to the Lagrangian 
has no final physical effect. In the quantum gravitational theory, adding to the 
Lagrangian a term of this kind permits a definite prescription for calculating all 
diagrams. It is found that all tree diagrams (diagrams which involve no closed 
loops) are, like the classical theory, completely satisfactory. However, diagrams 
with closed loops are affected. In particular, if a diagram with one closed loop 
is compared to its related tree diagrams by unitarity (or by the loop opening 
processes of reference I²) a discrepancy is found. This discrepancy can easily be
removed by a further modifying rule—that for each closed ring of gravitons 
(calculated with the modified nonsingular Lagrangian) a special term must be 
subtracted. This special term is describable as the contribution of a correspond- 
ing closed loop around which, in place of the graviton, goes a special auxiliary 
vector particle (propagated in a definite manner and coupled to gravitation but 
not directly to matter). 

When a diagram contains two contiguous rings (like the letter θ) I have not
been able to find the corresponding rule. Part of the difficulty here is to find an 
unambiguous definition of what the double ring should equal in a correct theory. 
The main purpose of this paper is to discuss various attempts to resolve these 
difficulties and to describe some incidental relations that have been found in 
attempting to do so. It is not even clear that the problems are serious, or are not 
easily solved, as I have fallen into calculational confusion. 

The algebraic complexity of the gravitational field equations is so great that 
it is not easy to do exploratory mathematical investigations and checks. Gell- 
Mann suggested to me that the Yang-Mills theory of vector particles with zero 
mass also is a nonlinear theory with a gauge group and might show the same 
difficulties, and yet be easier to handle algebraically. This proved to be the case, 
and thereafter, all the work was done first with the Yang-Mills theory and then 
the corresponding expressions for gravitation were worked out. The corres- 
pondence is exceedingly close. Each difficulty and its resolution in one theory 
has its corresponding difficulty and resolution in the other. It becomes obvious 
that to find a completely satisfactory quantization of the zero mass Yang-Mills 
field, is to find a completely satisfactory quantization of the general theory of 
relativity. 

For this reason, all the work in this paper will deal primarily with the Yang- 
Mills theory. The correspondence of the equations in the gravitation theory will 
be developed in an appendix. 


YANG-MILLS FIELD THEORY 


The source of gravitation is energy and momentum, quantities which are locally 
conserved. The gravitational field carries energy and momentum and hence must 


2 See the previous paper “Closed Loops and Tree Diagrams” by R. P. Feynman which
hereafter we call I. 
be coupled to itself. The source of the Yang-Mills field is isotopic spin current,
a quantity locally conserved. The field carries isotopic spin and hence must be 
coupled to itself, to make a nonlinear field theory. 

We shall formulate the Yang-Mills theory in the special case that it is
interacting with Dirac matter of spin ½ which carries half integral isospin (that is,
is a representation of $SU_2$). It is readily generalizable to matter of any spin, or
of several kinds, but the problems we wish to deal with are not at all affected
by that. In fact, we find that our problems arise entirely with the field acting
upon itself alone, the addition of other matter as sources only making a small
additional algebraic complication, and no new physical problems. Again,
extension into $SU_3$ can be made immediately, merely by a reinterpretation of symbols,
so to get to our real problems as quickly as possible we suppose matter
represented by an $SU_2$ spinor, Dirac spinor $\psi$, in interaction with an isospin vector
and four-space vector field $A_\mu$ with Lagrangian

$$L_{\rm matter} = \bar\psi\gamma_\mu(i\nabla_\mu - \tau\cdot A_\mu)\psi + m\bar\psi\psi \tag{1}$$

where $\tau$ are the Pauli matrices for isospin. A rotation in the $SU_2$ space induces
on $\psi$ the transformation $\psi \to e^{i\tau\cdot\alpha}\psi = (1 + i\tau\cdot\alpha)\psi$ for infinitesimal rotation and,
on $\tau\cdot A_\mu$, the transformation $e^{i\tau\cdot\alpha}\,\tau\cdot A_\mu\,e^{-i\tau\cdot\alpha} = \tau\cdot(A_\mu - \alpha\times A_\mu)$ for infinitesimal
$\alpha$, where we define the cross product × between two isovectors A, B, with
components $A_i$, $B_j$,

$$(A\times B)_n = \lambda_{nij}A_iB_j,$$

where the coefficients of the Lie group $\lambda_{ijk}$ are defined by
$$\tau_i\tau_j - \tau_j\tau_i = i\lambda_{ijk}\tau_k.$$

(For $SU_2$ in the three-dimensional isovector space, this is just the ordinary cross
product of vector analysis in three dimensions.) Then $[\tau\cdot A, \tau\cdot B] = i\tau\cdot[A\times B]$.
If we now ask that our expressions be invariant for local $SU_2$ transformations,
the quantity $\alpha$ becomes a function of four-space position, and does not commute
with $i\nabla_\mu$, but this changes to $i\nabla_\mu - \tau\cdot(\nabla_\mu\alpha)$. The expression (1) will be invariant
if $A_\mu$ changes by
$$A_\mu \to A_\mu - \nabla_\mu\alpha - \alpha\times A_\mu. \tag{2}$$
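As an illustrative numerical check of the commutator identity and of the infinitesimal rotation of $\tau\cdot A_\mu$ quoted above, the following sketch uses plain 2×2 complex matrices. It assumes the isospin matrices are taken as half the Pauli matrices, so that $\lambda_{ijk}$ is the ordinary antisymmetric symbol and the cross product is the usual one; that normalization, the helper names, and the sample vectors are choices made here for illustration, not the text's.

```python
# Illustrative check of [t.A, t.B] = i t.(A x B) and of the first-order rotation.
# Assumption (ours): t_i = sigma_i / 2, so lambda_ijk is the Levi-Civita symbol.

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(X, Y):
    return add(matmul(X, Y), scale(-1, matmul(Y, X)))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def t_dot(v):
    """Return the matrix t.v = sum_i v_i sigma_i / 2."""
    M = [[0, 0], [0, 0]]
    for i in range(3):
        M = add(M, scale(0.5 * v[i], SIGMA[i]))
    return M

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

A = [1.0, -2.0, 0.5]
B = [0.25, 4.0, -1.5]

# Commutator identity.
lhs = commutator(t_dot(A), t_dot(B))
rhs = scale(1j, t_dot(cross(A, B)))
commutator_error = max_abs_diff(lhs, rhs)

# (1 + i t.alpha) t.A (1 - i t.alpha) versus t.(A - alpha x A):
# the two should differ only by terms of second order in alpha.
alpha = [1e-4, -2e-4, 3e-4]
one = [[1, 0], [0, 1]]
U = add(one, scale(1j, t_dot(alpha)))
U_inv = add(one, scale(-1j, t_dot(alpha)))
rotated = matmul(matmul(U, t_dot(A)), U_inv)
a_x = cross(alpha, A)
expected = t_dot([A[i] - a_x[i] for i in range(3)])
rotation_error = max_abs_diff(rotated, expected)
```

Here `commutator_error` tests the identity itself, while `rotation_error` is expected to be small only to second order in the chosen `alpha`.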


Henceforth, our notation will be $C_{,\mu}$ for $\nabla_\mu C$ and
$$C_{;\mu} = \nabla_\mu C - A_\mu\times C \tag{3}$$

is a “covariant differentiation” of any isovector C (which may carry other space
indices, of course). This has the property that $C_{;\mu}$ transforms as an isovector if
C does. We call (2) a gauge transformation.

Yang and Mills found an expression for a Lagrangian for the field $A_\mu$ that
would be invariant under (2). The part that contains $A_\mu$ only is

$$L_{\rm YM}(A) = \tfrac14 E_{\mu\nu}\cdot E_{\mu\nu} \tag{4}$$
where
$$E_{\mu\nu} = A_{\mu,\nu} - A_{\nu,\mu} + A_\mu\times A_\nu. \tag{5}$$
From (3) we find this antisymmetric $E_{\mu\nu}$ might also be defined in terms of
covariant differentiation via

$$C_{;\mu;\nu} - C_{;\nu;\mu} = -E_{\mu\nu}\times C. \tag{6}$$
The variation of $\int L_{\rm YM}(A)\,d\tau$ with respect to $A_\mu$ we call $R_\mu(A)$. It is
$$R_\mu(A) = -E_{\mu\nu;\nu} \tag{7}$$
so that from (6) we can deduce directly that
$$R_{\mu;\mu}(A) = 0. \tag{8}$$
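The step from (6) and (7) to (8) can be spelled out as follows. This is a sketch, reading (6) as $C_{;\mu;\nu} - C_{;\nu;\mu} = -E_{\mu\nu}\times C$ and (7) as $R_\mu = -E_{\mu\nu;\nu}$, and applying (6) to $C = E_{\mu\nu}$, which is an isovector carrying extra space indices.

```latex
\begin{align*}
R_{\mu;\mu} &= -E_{\mu\nu;\nu;\mu}
   = -\tfrac12\left(E_{\mu\nu;\nu;\mu} - E_{\mu\nu;\mu;\nu}\right)
   && \text{(relabel $\mu\leftrightarrow\nu$ and use $E_{\nu\mu} = -E_{\mu\nu}$)} \\
 &= \tfrac12\, E_{\nu\mu}\times E_{\mu\nu}
   && \text{(equation (6) with $C = E_{\mu\nu}$)} \\
 &= -\tfrac12\, E_{\mu\nu}\times E_{\mu\nu} = 0
   && \text{(each term is a cross product of a vector with itself).}
\end{align*}
```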


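This $R_\mu$ is a nonlinear operator, its linear part $R_{0\mu}(A)$ coming from the quadratic
part of the YM Lagrangian

$$L_{0{\rm YM}}(A) = \tfrac12\big[(A_{\mu,\nu}\cdot A_{\mu,\nu}) - (A_{\mu,\mu})^2\big] \tag{9}$$
is
$$R_{0\mu}(A) = A_{\mu,\nu\nu} - (A_{\nu,\nu})_{,\mu} \tag{10}$$
or in momentum space
$$R_{0\mu}(A) = (q^2\delta_{\mu\nu} - q_\mu q_\nu)A_\nu. \tag{11}$$

The operator $q^2\delta_{\mu\nu} - q_\mu q_\nu$ is singular and does not have an inverse to serve as
a natural propagator.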
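The singularity can be seen concretely: the operator annihilates any vector proportional to $q_\mu$, so its determinant vanishes. The sketch below checks this with exact rational arithmetic for one sample momentum. It assumes, as the notation $\delta_{\mu\nu}$ suggests, that repeated indices are simply summed with all plus signs; the function names and the sample momentum are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def massless_operator(q):
    """Matrix q^2 delta_{mu nu} - q_mu q_nu, with q^2 = sum_mu q_mu q_mu (assumed)."""
    q2 = sum(c * c for c in q)
    return [[(q2 if mu == nu else 0) - q[mu] * q[nu] for nu in range(4)]
            for mu in range(4)]

def apply(M, v):
    return [sum(M[mu][nu] * v[nu] for nu in range(4)) for mu in range(4)]

def determinant(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n = len(M)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return det

q = [Fraction(1), Fraction(2), Fraction(-3), Fraction(5)]
op = massless_operator(q)
image_of_q = apply(op, q)   # the pure-gradient direction is annihilated
det = determinant(op)       # hence no inverse exists
```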

We can write the full Lagrangian in interaction as a piece representing free
matter $\bar\psi i\gamma_\mu\nabla_\mu\psi + m\bar\psi\psi$ and a part involving the fields

$$L(A) = L_{\rm YM}(A) - J_\mu\cdot A_\mu \tag{12}$$

where $J_\mu$, the current or sources for the field, comes from matter and is $\bar\psi\tau\gamma_\mu\psi$.
It is then a consequence of the equations of motion of the matter (from varying
$\bar\psi$ in (1))
$$\gamma_\mu(i\nabla_\mu - \tau\cdot A_\mu)\psi = m\psi$$
(since they are $SU_2$ invariant) that this current is conserved:
$$J_{\mu;\mu} = 0. \tag{13}$$
The field equations of motion upon varying A in (12) are
$$R_\mu(A) = J_\mu. \tag{14}$$

The operator $R_\mu$ is singular and the equation cannot hold for a general $J_\mu$. In
view of the identity (8), this equation (14) can only hold if $J_\mu$ is conserved (13).
To study finite transformations, write $\sigma = e^{i\tau\cdot\alpha}$ and $a_\mu = \tau\cdot A_\mu$. Then the
transformation of $\psi$ is to $\psi' = \sigma\psi$ and of an ordinary isovector C ($= \tau\cdot C$) to
C′ is
$$C' = \sigma C\sigma^{-1} \tag{15}$$
but the field $a_\mu$ transforms in a special way, with a gradient term,

$$a_\mu' = \sigma(a_\mu + t_\mu)\sigma^{-1} \tag{16}$$
where
$$t_\mu = i\sigma^{-1}(\nabla_\mu\sigma) \tag{17}$$
with
$$e_{\mu\nu} = a_{\mu,\nu} - a_{\nu,\mu} - i(a_\mu a_\nu - a_\nu a_\mu), \tag{18}$$

$$L = \tfrac12\,\mathrm{Tr}\,(e_{\mu\nu}e_{\mu\nu})$$
is an invariant.
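These expressions can be transformed to ordinary notation via the equation
for commutators of $\tau$. For example, equation (15) would become

$$C' = C - \alpha\times C + \tfrac12\,\alpha\times(\alpha\times C) - \tfrac16\,\alpha\times(\alpha\times(\alpha\times C)) + \cdots, \tag{15'}$$

or, in an obvious notation, $C' = e^{-\alpha\times}C$. The quantity $t_\mu$ is $\tau\cdot T_\mu$, where

$$T_\mu = -\alpha_{,\mu} - \tfrac12\,\alpha\times\alpha_{,\mu} - \tfrac16\,\alpha\times(\alpha\times\alpha_{,\mu}) + \cdots \tag{17'}$$
or
$$T_\mu = (\alpha\times)^{-1}(1 - e^{\alpha\times})\alpha_{,\mu}.$$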
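For the ordinary three-dimensional cross product, the series for $e^{-\alpha\times}C$ is just a finite rotation of C about the direction of $\alpha$. The following sketch sums the series term by term and compares it with the closed-form rotation (Rodrigues) formula for an angle $-|\alpha|$. The sample vectors, the number of terms, and the function names are illustrative assumptions, and the check applies only to the $SU_2$ case where × is the usual cross product.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def exp_minus_alpha_cross(alpha, C, terms=40):
    """Sum C - a x C + (1/2) a x (a x C) - ... term by term."""
    result = list(C)
    term = list(C)
    for k in range(1, terms):
        # term_k = (-alpha x term_{k-1}) / k, i.e. (-alpha x)^k C / k!
        term = [-x / k for x in cross(alpha, term)]
        result = [r + t for r, t in zip(result, term)]
    return result

def rotate_about(alpha, C):
    """Closed form for exp(-alpha x) C: rotation by angle -|alpha| about alpha."""
    theta = math.sqrt(sum(a * a for a in alpha))
    n = [a / theta for a in alpha]
    n_dot_c = sum(ni * ci for ni, ci in zip(n, C))
    n_x_c = cross(n, C)
    return [C[i] * math.cos(theta)
            - n_x_c[i] * math.sin(theta)
            + n[i] * n_dot_c * (1.0 - math.cos(theta))
            for i in range(3)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

alpha = [0.3, -0.7, 1.1]
C = [2.0, 0.5, -1.0]
series_result = exp_minus_alpha_cross(alpha, C)
closed_form = rotate_about(alpha, C)
series_error = max(abs(x - y) for x, y in zip(series_result, closed_form))
```

Because the transformation is a rotation, the length of C is preserved, which is one way to see that isovector dot products such as $E_{\mu\nu}\cdot E_{\mu\nu}$ are invariant.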


We shall have to consider two other related theories with slightly modified
Lagrangians which avoid the singularities of the propagator coming from (4),
(9). The first we shall call YMM, Yang-Mills with mass, and adds a term
$m^2A_\mu\cdot A_\mu$ to the Lagrangian:

$$L_{\rm YMM}(A) = L_{\rm YM}(A) - \tfrac12 m^2A_\mu\cdot A_\mu. \tag{19}$$
The equations (14) now become
$$R_\mu(A) - m^2A_\mu = J_\mu \tag{20}$$
which has solutions for any $J_\mu$, for
$$-m^2A_{\mu;\mu} = J_{\mu;\mu} \tag{21}$$
but if the sources are conserved we have (note $A_{\mu;\mu} = A_{\mu,\mu}$)
$$A_{\mu,\mu} = 0. \tag{22}$$

As $m^2 \to 0$ the classical solution of (20) approaches solutions of the zero mass
equation (14), solved with the subsidiary condition, equation (22). The second
order operator $R_{0\mu}$ is now, in momentum space (see equation (11)), $q^2\delta_{\mu\nu} -
q_\mu q_\nu - m^2\delta_{\mu\nu}$, an operator which has the inverse “propagator,”

$$\text{Prop} = -(\delta_{\mu\nu} - q_\mu q_\nu m^{-2})/(q^2 - m^2). \tag{23}$$
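That the mass term removes the singularity can be checked directly: the matrix $(\delta_{\mu\nu} - q_\mu q_\nu/m^2)/(q^2 - m^2)$ multiplies the massive operator to the unit matrix. The sketch below verifies this exactly for one sample momentum and mass. It checks only the matrix structure; the overall minus sign attached to the propagator in (23) is a sign convention that this check does not attempt to reproduce. As before, indices are summed with all plus signs, and the names and sample values are illustrative assumptions.

```python
from fractions import Fraction

def matmul4(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def delta(mu, nu):
    return 1 if mu == nu else 0

q = [Fraction(1), Fraction(2), Fraction(-3), Fraction(5)]
m2 = Fraction(7, 3)                  # sample value of m^2 (arbitrary, nonzero)
q2 = sum(c * c for c in q)           # q^2 with all plus signs (assumed convention)

# Operator (q^2 - m^2) delta_{mu nu} - q_mu q_nu.
massive_operator = [[(q2 - m2) * delta(mu, nu) - q[mu] * q[nu]
                     for nu in range(4)] for mu in range(4)]

# Candidate inverse (delta_{mu nu} - q_mu q_nu / m^2) / (q^2 - m^2).
candidate_inverse = [[(delta(mu, nu) - q[mu] * q[nu] / m2) / (q2 - m2)
                      for nu in range(4)] for mu in range(4)]

product = matmul4(massive_operator, candidate_inverse)
identity = [[Fraction(delta(mu, nu)) for nu in range(4)] for mu in range(4)]
is_inverse = product == identity
```

The $q_\mu q_\nu/m^2$ piece of `candidate_inverse` grows without bound as `m2` is made small, which is the feature that makes the zero-mass limit look doubtful at first sight.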


Thus, it leads to a definite quantum field theory without difficulties (we do not
discuss divergence problems) if we proceed, in the usual way, by making diagrams
with junctions determined by the nonquadratic part of the full Lagrangian
and lines with the propagator (23) with $m^2$ replaced by $m^2 - i\epsilon$ for $\epsilon \to +0$.
One possible way we shall try to define the theory with zero mass is as a limit
of this theory with mass, as $m^2 \to 0$. The term $q_\mu q_\nu/m^2$ in the propagator makes
it, at first sight, doubtful that such a limit exists, but for trees and single closed
loops it does exist.

The subsidiary condition (22) suggests, for $m^2 = 0$, another modification
which we shall call YMD (Yang-Mills plus divergence terms). It is obtained
from the Yang-Mills Lagrangian with the divergence terms $(A_{\mu,\mu})^2$ left out, or

$$L_{\rm YMD}(A) = L_{\rm YM}(A) + \tfrac12(A_{\mu,\mu})^2 \tag{24}$$
so that
$$L_{0{\rm YMD}}(A) = \tfrac12(A_{\mu,\nu})\cdot(A_{\mu,\nu}). \tag{25}$$
The equations of motion now become
$$R_\mu(A) + A_{\nu,\nu\mu} = J_\mu \tag{26}$$
so that, for conserved sources (13), we have

$$A_{\nu,\nu\mu;\mu} = 0. \tag{27}$$
This, of course, has a solution
$$A_{\mu,\mu} = 0 \tag{22}$$

and therefore (26) is equivalent to the unmodified equation (14) solved with the
subsidiary condition (22).
In quantum theory, YMD leads to a nonsingular propagator

$$-\delta_{\mu\nu}/(q^2 - i\epsilon). \tag{28}$$

The couplings are given by the nonquadratic parts of $L_{\rm YMD}$ (or $L_{\rm YM}$, for they
are the same). This is a definite quantum theory, and agrees with the limit of
YMM as $m^2 \to 0$ for trees, but not for diagrams with closed loops. Further
modifications are necessary. We discuss these later.

Thus, in classical theory we can define things either by $L_{\rm YM}$, or by $L_{\rm YMD}$, or
by $L_{\rm YMM}$ as $m^2 \to 0$. All three give the same physical result. In quantum theory,
$L_{\rm YM}$ has difficulty in definition because of the singular propagator, but $L_{\rm YMD}$
and $L_{\rm YMM}$ as $m^2 \to 0$ agree for trees because the theory of trees is so close to
classical theory. For diagrams with one closed loop, $L_{\rm YMM}$ as $m^2 \to 0$ is
satisfactory and the loop when opened by the methods of I gives the corresponding
trees. However, as we shall see, for a closed loop $L_{\rm YMD}$ is not satisfactory; upon
opening the loop the correct tree is not obtained, and unitarity suffers. It must
be modified by subtracting a contribution equal to a corresponding closed loop
in which a space-time scalar (spin 0) isovector particle propagates coupled to
$A_\mu$ (but not to other matter).

The difficulties will be discussed in detail below but some hint may be given
here. It appears that we must arrange, for quantum mechanics, that the equation
(27) must not be left as a consequence of a classical equation of motion, but
must dynamically come from a Lagrangian also. If P is a spin 0 isovector field,
the Lagrangian

$$L_P = -\tfrac12 P_{,\mu}\cdot P_{;\mu} \tag{29}$$
leads to the equation of motion

$$\tfrac12(P_{,\mu;\mu} + P_{;\mu,\mu}) = P_{,\mu;\mu} - \tfrac12 A_{\mu,\mu}\times P = 0. \tag{30}$$
If we set
$$P = A_{\nu,\nu} \tag{31}$$

equation (30) reduces to (27), so if $P = A_{\nu,\nu}$ is true initially, it will be always,
and it appears that the addition of $\tfrac12(A_{\mu,\mu})^2$ to $L_{\rm YM}$ must be compensated by the
subtraction of $L_P$.
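The reduction of (30) to (27) is a one-line computation. The sketch below assumes that the equation of motion (30) reads $P_{,\mu;\mu} - \tfrac12 A_{\mu,\mu}\times P = 0$, which is our reading of the printed equation, and uses only the fact that the cross product of an isovector with itself vanishes.

```latex
\begin{align*}
\Big[P_{,\mu;\mu} - \tfrac12\, A_{\mu,\mu}\times P\Big]_{P = A_{\nu,\nu}}
  &= (A_{\nu,\nu})_{,\mu;\mu} - \tfrac12\, A_{\mu,\mu}\times A_{\nu,\nu} \\
  &= A_{\nu,\nu\mu;\mu}
  \qquad\text{since } A_{\mu,\mu}\times A_{\nu,\nu} = 0,
\end{align*}
```

so the equation of motion for P becomes $A_{\nu,\nu\mu;\mu} = 0$, which is (27).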


QUANTUM THEORY 


There are many ways suggested to pass from classical to quantum theory. The
one we shall use, to study the difficulties, is the path integral method which says
the quantum mechanical amplitude for an event arises from the path integral

$$\text{amp} = \int \exp\Big[i\int L_T(A, \psi)\,d\tau\Big]\,\mathcal{D}A\,\mathcal{D}\psi \tag{32}$$

over all fields subject to special boundary conditions, where $L_T$ is the total
Lagrangian and the integral is over all fields, symbolized by $\mathcal{D}A\,\mathcal{D}\psi$. If the
matter field is a Dirac field suitable modifications are made either by using
operator calculus, or by changing certain signs in final diagram expressions or,
if you prefer, using scalar matter for an example or no matter field at all. In
this paper we are concentrating on the real difficulty, which lies in the properties
of the Yang-Mills, or gravity Lagrangians, and a detailed study has shown that
addition of matter makes no new problems, provided its Lagrangian is invariant
under the correct transformation (2). Therefore, let us leave the matter field out,
for later integration perhaps, and simply study (32) with (12) substituted for
$L_T(A, \psi)$:

$$\text{amp} = \int \exp\Big[i\int (L_{\rm YM}(A) - J_\mu\cdot A_\mu)\,d\tau\Big]\,\mathcal{D}A. \tag{33}$$


Expression (33) is meaningless, however. This is because there is a direction 
in which A can be changed which does not alter the integrand, namely the 
change (16) [in first order, (2)]. Thus, when moving A in this way, the integral 
is infinite. It is analogous to a multiple integral over many variables over an 
infinite range but the integrand does not depend on one of the variables. What 
the author did at first was try to ignore this difficulty and to proceed to compute 
a number of processes in detail (analogues of Compton effect, bremsstrahlung, 
radiationless scattering, vacuum polarization, and second-order scattering with 
scalar matter interacting with gravitational fields or Yang-Mills field quanta 
instead of photons). The aim was to see if any difficulties arose, and to study the
convergence problem, which will not concern us here. The difficulties which 
did arise having to do with the gauge invariance were then studied carefully by 
many tedious examples and finally by more formal and powerful methods. It is 
these latter results which will be described here; although they are more difficult 
to understand directly, they do permit a larger overview of the problem. The 
various theorems have all been checked by examples (in fact, in nearly all cases 
the examples suggested the theorems and guided the “‘more powerful methods” 
on what to prove). 

The expression (32) was immediately converted to rules for diagrams. To be
explicit, a YM field quantum is represented by a polarization four-vector
isovector $e_\mu$. Its coupling with matter in the Dirac case is $\tau\cdot e_\mu\gamma_\mu$, while matter
propagates with its usual propagator; for example, $(\not{p} - m)^{-1}$. The term of
third order in A in $L_{\rm YM}$, namely $(A_\mu\times A_\nu)\cdot A_{\mu,\nu}$, leads to an interaction where
three field quanta come together. For such a junction with quanta of momenta
$q^a$, $q^b$, $q^c$ (so $q^a + q^b + q^c = 0$) of polarizations a, b, c, the amplitude
contribution is

$$(q^a_\mu - q^c_\mu)\,b_\mu\cdot(a_\nu\times c_\nu) + (q^b_\mu - q^a_\mu)\,c_\mu\cdot(b_\nu\times a_\nu) + (q^c_\mu - q^b_\mu)\,a_\mu\cdot(c_\nu\times b_\nu).$$


The fourth-order term $\tfrac14(A_\mu\times A_\nu)\cdot(A_\mu\times A_\nu)$ results in junctions where four
field quanta come together.

A free quantum entering or leaving has a polarization $a_\mu$ satisfying (from
equation 10)

$$(q^a)^2a_\mu - (q^a_\nu a_\nu)q^a_\mu = 0. \tag{34}$$
If it is physical $(q^a)^2 = 0$ and, further, it is transverse,
$$q^a_\mu a_\mu = 0. \tag{35}$$

On the other hand, (34) has solutions even if $(q^a)^2 \neq 0$, namely a quantum
polarized in the direction of its momentum, a pure divergence

$$a_\mu = q^a_\mu\alpha. \tag{36}$$

Such an incoming field must have no physical effect. The result of substituting
$q^a_\mu$ for $a_\mu$ in the sums of all diagrams for a process always turns out to give zero.
This is an expression of the gauge invariance condition. It also means that for a
free photon, the polarization $a_\mu$ is not unique; a gauge change $a_\mu \to a_\mu + q^a_\mu\alpha$
has no effect. Although equation (2) is nonlinear, in the quantum perturbation
theory it becomes the condition that an extra perturbation of the form (36)
acting to first order in all the diagrams in which it can act produces zero.

For the Yang-Mills propagator of the field quanta in the case $m^2 = 0$ (YM),
we replace $e_\mu$, $e_\nu$ at the ends of a virtual quantum by equation (28), $-\delta_{\mu\nu}/q^2$.
This is not strictly the reciprocal of the singular $R_0$, for $R_0$ has no reciprocal,
but in analogy with electrodynamics it was hoped that this propagator would
be satisfactory since the sources were conserved, $q_\mu J_\mu = 0$, and an extra photon
coupled via a gradient, (36), had no effect. For example, it was expected that
the $-q_\mu q_\nu/m^2$ term of the propagator with mass, equation (23), would vanish
for this reason, and the limit $m^2 \to 0$ could be taken to lead YMM to this YMD.

All this was confirmed with tree diagrams, and with trees no difficulties arose. 

A difficulty arose with a closed loop. Its origin can be understood in the
following way when considering the relation of the loop to the unitarity
condition or to trees if the loop is opened. Let the q of one propagator of the loop
be in the z direction and write the $-\delta_{\mu\nu}/q^2$ as $-e_\mu e_\nu/q^2$ or as


a5 (Cats eee eee Cee) (37) 


Thus, at the pole of $q^2$ we have as necessary, and expected, free quanta entering
and leaving with transverse polarizations x, y. But we also have an extra
quantum that we do not want, coupled as $e_z - e_t$ entering, and $e_z + e_t$ leaving.

At first sight we might expect this extra quantum entering as it does via $e_z - e_t$
(which is in the direction of q, for $q_z = q_t$) to give zero. Does not gauge invariance,
which we have maintained carefully, insure that the action of such a potential
(as equation 36) gives zero, when diagrams are summed? But pure divergences
of the type equation (36) give zero only for real physical circumstances when all
the other quanta acting either have their sources explicit in the diagrams (so
$a_\mu = q_\mu\alpha$ can act directly on these sources, too), or are genuine external free
quanta satisfying equation (34), so that they are purely transverse, equation (35),
or else they are pure divergences themselves, as in equation (36). The trouble is
that the quantum leaving upon opening (37) is polarized in the direction $e_z + e_t$
or (x, y, z, t) = (0, 0, −1, 1), which is neither transverse to q = (0, 0, Q, Q) nor
in the q direction, and has no apparent source. Thus, the last term in (37) does
not give zero upon opening the loop but contributes a residual. This residual
is easily calculated and is a rather simple expression, so it was easy to find a
rule for subtracting a diagram to eliminate it. The rule is formulated in
connection with equation (29). In perturbation theory we subtract a term corresponding
to a space-scalar isovector meson going around the loop. It propagates via $1/q^2$
and is coupled via $(A_\mu\times P)\cdot P_{,\mu}$. That is, the scalar meson is neither created nor
destroyed but two meet at a junction with a vector quantum. With in and
out momenta $q^1$ and $q^2$, isospin quality $p^1$, $p^2$, they couple to an isovector
$a_\mu$ ($q^a = q^1 + q^2$) via

$$a_\mu\cdot(p^1\times p^2)(q^1_\mu + q^2_\mu). \tag{38}$$


They appear only in closed loops, never as free mesons. Where a loop is 
opened, their contribution cancels the unwanted last terms of equation (37). 

For tree diagrams, the scalar mesons do not appear and there is no difficulty
with the propagator of equation (37). This may be seen, for on opening a line
for a propagator in a tree in (37) the $e_z - e_t$ couples to a piece entirely disjoint
from the $e_z + e_t$ piece and, therefore, being a gradient coupling it gives zero on
this piece.

The questions can also be studied using YMM and the propagator (23),
expecting to take the limit as $m^2 \to 0$. The reader must be warned, however, in
doing that the free wave external quanta must first not be taken to satisfy (34),
but rather that equation appropriate to the $m^2 \neq 0$ case (namely, equation 34
with 0 replaced by $m^2a_\mu$ on the right-hand side). If he uses (34) unchanged, the
limit for trees or one loop will exist, but it will not agree with the aforementioned
$m^2 = 0$ theory. With the proper modified equation (34), the limit again exists
but is different, and agrees with the aforementioned theory.

We cannot here go further into these explanatory details via perturbation 
theory, which serves so admirably as a laboratory to learn about these matters, 
but will now go directly to a more general and abstract mathematical way of 
dealing with them. First we discuss the theory of trees, and can follow I directly, 
except to correct small details, for $R_0$ has no inverse. Next we discuss one loop,
and prove a relation between YMD and YMM plus a scalar meson which 
verifies our statement about one such loop. Finally we say what we know about 
double loops. 


TREES 


In this section we discuss tree diagrams in the YM theory and show first that
the obvious prescription for writing them down using the propagator $-\delta_{\mu\nu}/q^2$
gives results which are completely satisfactory and gauge invariant. Next we
show that the theory with mass, for which no ambiguities arise, yields as $m^2 \to 0$
the same result as the previous $-\delta_{\mu\nu}/q^2$ propagator. In the next section we will
discuss problems connected with the trees expected from opening closed loop
diagrams. We shall use the notations and ideas of I in our discussions.

Let us formulate the theory of tree diagrams if we use the propagator
$-\delta_{\mu\nu}/q^2$, which we symbolize as $R_{0D}^{-1}$, to analyze the Lagrangian (12), (1). We
separate $R_\mu(A)$ into a second order (in $A$) part $R_{0\mu}(A)$ of equation (11) and a
part of higher order in $A$,

$$R_\mu(A) = R_{0\mu}(A) - K_\mu(A) \tag{39}$$
and
$$R_{0\mu}(A) = R_{0D\mu}(A) - (A_{\nu,\nu})_{,\mu}. \tag{40}$$


This $R_{0D\mu}(A)$ is just $A_{\mu,\nu\nu}$, of course. Then the construction of trees with $R_{0D}^{-1}$
as the propagator means solving by iteration the equation


$$A_\mu = R_{0D}^{-1}\{K_\mu(A) + J_\mu\} + A^0_\mu \tag{41}$$
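where
$$J_\mu = \bar\psi\,\gamma_\mu \tau\,\psi \tag{42}$$
and an accompanying equation for matter
$$\psi = (i\gamma_\mu\nabla_\mu - m)^{-1}\{\gamma_\mu(\tau\cdot A_\mu)\psi\} + \psi^0 \tag{43}$$


where $A^0$ and $\psi^0$ are appropriate (see I) asymptotic functions representing the
incoming and outgoing external lines, satisfying


$$(i\gamma_\mu\nabla_\mu - m)\psi^0 = 0$$
and
$$R_{0\mu}(A^0) = R_{0D\mu}(A^0) - (A^0_{\nu,\nu})_{,\mu} = 0. \tag{44}$$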




The resultant tree $t$ can then be expressed in terms of $A$, $J$. It is easiest to express
the effect on it of one extra disturbing external line; say a field $F_\mu$ acts on it in
first order as


$$\delta t = \int F_\mu\cdot(K_\mu(A) + J_\mu)\,d\tau. \tag{45}$$


Equation (43) means that $\psi$ satisfies the equation of motion for its field (just
below equation 12) and hence its current, equation (42), satisfies (13). Multiplying
equation (41) by $R_{0D}$ and using (39), (40), and (44) shows that $A$ satisfies


$$R_\mu(A) = J_\mu + A_{\nu,\nu\mu} - A^0_{\nu,\nu\mu}. \tag{46}$$


This appears, at first sight, not to be the expected equation (14), but to carry a
residual $D_{,\mu}$, where we write


$$D = A_{\mu,\mu} - A^0_{\mu,\mu}. \tag{47}$$
However, taking the covariant derivative of each side of (46), we find
$$D_{,\mu;\mu} = 0 = D_{,\mu\mu} - A_\mu \times D_{,\mu}, \tag{48}$$


which we are solving (as one readily finds by taking the divergence of equation
(41)) by the iteration


$$D = R_{0D}^{-1}\{A_\mu \times D_{,\mu}\}. \tag{49}$$


Now asymptotically $A_\mu = A^0_\mu$, so that asymptotically $D = 0$, so that (48) says
$D = 0$ everywhere (it has no source), so equation (46) implies that $A$ does satisfy
the expected equation (14). It does so with the special condition $A_{\mu,\mu} = A^0_{\mu,\mu}$.
If, for example, $A^0$ were always chosen to satisfy $A^0_{\mu,\mu} = 0$, then $A_\mu$ would be
that solution of (14) satisfying the supplementary condition (22). (Of course, no
restriction need be made on $A^0_{\mu,\mu}$, so none exists on $A_{\mu,\mu}$.)

This is satisfactory but to be complete we must now show that the tree value
does not depend on $A^0$ if it is changed by a gauge transformation. We need only
prove this for arbitrary first order changes $A^0_\mu \to A^0_\mu + \nabla_\mu\alpha$. That means only
that we must show that a pure gradient external line (36) has zero coupling.
This is most easily done by letting this line be the last line $F_\mu$ in (45) to get


$$\delta t = \int \nabla_\mu\alpha\cdot(K_\mu(A) + J_\mu)\,d\tau. \tag{50}$$


We integrate by parts and note that equations (8) and (13) imply (since
$R_{0\mu,\mu} = 0$)
$$K_{\mu,\mu}(A) + J_{\mu,\mu} = A_\mu \times (-R_\mu(A) + J_\mu),$$


so that in view of (46) and (47) we get
$$\delta t = \int \alpha\cdot(A_\mu \times D_{,\mu})\,d\tau = 0 \tag{51}$$


since $D = 0$. Therefore, the theory is invariant under gauge transformation, and
pure divergence external lines have no influence.




The reader may be curious to know, when $A^0_\mu$ is changed to $A^0_\mu + \nabla_\mu\alpha$, what
happens to $A_\mu$. We do not require this for our further studies, but it is interesting.
It is simply changed by a gauge transformation ($\psi$ is also changed in the same
transformation) and becomes $A_\mu + \nabla_\mu\beta - A_\mu \times \beta = A_\mu + \beta_{;\mu}$. The condition
that $D = 0$ or $A_{\mu,\mu} = A^0_{\mu,\mu}$ becomes $\beta_{,\mu\mu} - \alpha_{,\mu\mu} - (A_\mu \times \beta)_{,\mu} = 0$, which is
solved by iteration as


$$\beta = R_{0D}^{-1}\{(A_\mu \times \beta)_{,\mu}\} + \alpha \tag{52}$$


since asymptotically $\beta = \alpha$. This equation has a kernel adjoint to (60).

Alternatively, we could define the trees as those calculated for YMM in the
limit $m^2 \to 0$. For YMM we add $-m^2 A_\mu\cdot A_\mu$ to the Lagrangian so the propagator
becomes unique. Thus the expression for trees from (1) is unambiguous:
we solve (43) and, instead of (41),


$$A_\mu = (R_{0D} - m^2)^{-1}(\delta_{\mu\nu} - m^{-2}\nabla_\mu\nabla_\nu)\{K_\nu(A) + J_\nu\} + A^0_\mu \tag{41'}$$
where $A^0$ now satisfies
$$R_{0\mu}(A^0) - m^2 A^0_\mu = 0, \tag{44'}$$
which implies that
$$A^0_{\mu,\mu} = 0. \tag{53}$$


Now equation (41') implies that $A$ satisfies


$$R_\mu(A) = J_\mu + m^2 A_\mu, \tag{46'}$$
which implies $A_{\mu,\mu} = 0$.
In view now of the relation preceding equation (51), whose left side vanishes
from (46'), since $A_\mu \times A_\mu = 0$, the gradient terms in (41') vanish and we could
have used instead


$$A_\mu = (R_{0D} - m^2)^{-1}\{K_\mu(A) + J_\mu\} + A^0_\mu. \tag{41''}$$


This now has an evident limit as $m^2 \to 0$ (in equation (41') the limit was not
evident because of the $m^{-2}$). This limit is (41) with the supplementary condition
(53) and therefore (14). We have already seen this supplementary requirement
does not destroy the gauge invariance.
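
The role of the $m^{-2}$ term can be checked numerically in momentum space. The following is only a minimal sketch, assuming a Euclidean contraction of indices for simplicity (the text uses the Minkowski convention) and using the hypothetical helper names `massive_operator` and `massive_propagator`. It checks that $(q^2 - m^2)^{-1}(\delta_{\mu\nu} - q_\mu q_\nu/m^2)$ inverts $(q^2 - m^2)\delta_{\mu\nu} - q_\mu q_\nu$, and that at $m^2 = 0$ the operator annihilates $q_\mu$ and so has no inverse.

```python
import numpy as np

def massive_operator(q, m2):
    """Matrix (q^2 - m^2) delta_{mu nu} - q_mu q_nu (Euclidean contraction assumed)."""
    q = np.asarray(q, dtype=float)
    q2 = q @ q
    return (q2 - m2) * np.eye(len(q)) - np.outer(q, q)

def massive_propagator(q, m2):
    """Candidate inverse (q^2 - m^2)^{-1} (delta_{mu nu} - q_mu q_nu / m^2)."""
    q = np.asarray(q, dtype=float)
    q2 = q @ q
    return (np.eye(len(q)) - np.outer(q, q) / m2) / (q2 - m2)

# Illustrative momentum and mass; any values with q^2 != m^2 and m^2 != 0 would do.
q = np.array([0.3, -1.2, 0.7, 2.0])
m2 = 0.5

# Largest deviation of operator * propagator from the unit matrix.
residual = np.max(np.abs(massive_operator(q, m2) @ massive_propagator(q, m2) - np.eye(4)))

# For m^2 = 0 the operator sends q itself to zero, so it is singular.
massless_null = massive_operator(q, 0.0) @ q
```

The $q_\mu q_\nu/m^2$ piece of the inverse is exactly what grows without bound as $m^2 \to 0$, which is why the limit is not evident in (41') and becomes evident only once the gradient terms have been shown to drop out.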


SINGLE CLOSED LOOP 


We have seen that the single closed loop calculated (for $m^2 = 0$) with propagator
$R_{0D}^{-1}$ does not agree with expectations if the loop is opened. This is because the
propagator generates some new unwanted terms according to (37). There
(e, — e,(e, + e) can be represented as a scattering in which a particle of 
momentum $q_\mu$ (now $q^2 = 0$) comes in polarized in the direction $q_\mu$ and goes out
with the same momentum but polarized in the direction $N_\mu$, where $N_\mu$ is a four
vector, $(0, 0, -1, 1)$ in our case, of zero length such that $q\cdot N \neq 0$. We shall
assume that the correct theory is what would come if you would just close the




tree diagram with transverse quanta. If we call this loop formed on closing
trees with transverse quanta “closed loop YM,” we have


$$\text{Closed Loop YM} = \text{Closed Loop YMD} - X \tag{54}$$


where $X$ is the result of closing a tree with two extra quanta of momentum $q_\mu$,
polarization $q_\mu$ and $N_\mu\alpha/(q\cdot N)$. Closing the tree would be equivalent to calculating
this extra tree and in place of $\alpha\cdots\alpha$ putting the propagator $1/q^2$ (and
taking the trace of the SU$_2$ indices). During all this, the other external lines are
represented by $A^0$, etc., in the usual way. We calculate $X$ now. We shall leave
out the inessential complication of the matter field $\psi$, and take $J_\mu = 0$.

Therefore, we are able to calculate the tree $X$ with an external gradient $F_\mu = \nabla_\mu\alpha$
and another external photon, say $b_\mu$, where


$$b_{\mu,\mu} = \alpha. \tag{55}$$


To do this, we can use the formula, equation (45), where the $A_\mu$ is to be calculated
from (41) where $A^0$ contains properly all the physically external lines and also
(in first order, of course) the field $b_\mu$. For that reason, equation (44) no longer
holds but we have instead (defining $S_\mu$)


$$R_{0\mu}(A^0) = R_{0\mu}(b) = S_\mu.$$


Since $b_\mu$ is a free wave, $q^2 = 0$, but it is not transverse so $R_{0\mu}(b) = q^2 b_\mu - q_\mu q_\nu b_\nu = +b_{\nu,\nu\mu}$.
So that we have from (55)


$$S_\mu = \alpha_{,\mu}. \tag{56}$$
The equation (41) now leads to a modification of equation (46), namely,
$$R_\mu(A) = J_\mu + A_{\nu,\nu\mu} - A^0_{\nu,\nu\mu} + S_\mu, \tag{57}$$
but we recover our original equations if we now put
$$D = A_{\mu,\mu} - A^0_{\mu,\mu} + \alpha \tag{58}$$


so that $D$ asymptotically is $\alpha$ this time. Now our tree (50) becomes (following
the steps to equation (51))


$$X = \int \alpha\cdot(A_\mu \times D_{,\mu})\,d\tau. \tag{59}$$
We see from taking the covariant derivative of (57) that $D$ satisfies (48) but that
it is being solved by the iteration

$$D = R_{0D}^{-1}\{A_\mu \times D_{,\mu}\} + \alpha. \tag{60}$$


It is evident from inspection of (59) and (60) that $X$ is simply the tree that would
result from a closed loop diagram of a field $P$ coming from the Lagrangian (29)
or $P_{,\mu}\cdot P_{,\mu} + P\cdot(A_\mu \times P_{,\mu})$, thus leading to the propagation equation (60) and
the closed loop (59). This is most obvious if we choose the gauge $A^0_{\mu,\mu} = 0$.

It is also readily seen that YMM leads to the same thing, as $m^2 \to 0$. The
easiest way is first to calculate the closed loop via the propagator of YMMD,
namely $(q^2 - m^2)^{-1}\delta_{\mu\nu}$, and show that it exceeds that calculated by


$$(q^2 - m^2)^{-1}(\delta_{\mu\nu} - q_\mu q_\nu/m^2)$$


by a term which is just what you would get if a scalar particle went around a
loop with the Lagrangian


$$L = -\tfrac12(P_{;\mu}\cdot P_{,\mu} - m^2 P\cdot P); \tag{61}$$


therefore, with propagator $(q^2 - m^2)^{-1}$ and coupling $P_{,\mu}\cdot(A_\mu \times P)$. In all these
new propagators, the limit as $m^2 \to 0$ may be taken. You can do all this arguing
from tree closures just as before or by more formal methods. Because these 
latter methods may be interesting in further investigations, we describe them 
in the next section. 


FORMAL THEORY OF SIMPLE LOOPS 


We will prove in a more formal way that in diagrams containing up to one loop,
if the propagator $\delta_{\mu\nu}/q^2$ is used, we must subtract the action of a particle acting
under the action from the Lagrangian (29). The easiest way to proceed is to
prove the corresponding theorem for YMM, the Yang-Mills theory with mass,
and then to take the limit as $m^2$ approaches zero (which limit exists for one loop
diagrams). We shall write CLYMM to represent the expression for one closed
loop, for YMM, that is with propagator (23). We have shown in the previous
paper that the contribution for all closed loops is given by (see I, equations 39
and 40)


$$\int \exp\Big\{ i\int [L(A + D) - D\,\delta L(A)/\delta A - L(A)]\,d\tau \Big\}\,\mathcal{D}D \tag{62}$$


where $L(A)$ is the appropriate Lagrangian and we integrate over all $D$ such that
asymptotically $D_{\text{asym}} \to 0$. To get the term corresponding to only simple closed
loops, the expression in the exponent must be expanded to second order in $D$
and higher terms dropped. This is evidently just the terms of second order in
$D$ in the expansion of $L(A + D)$ in powers of $D$. Call it $L'(D)$, a quadratic function
of $D$ containing $A$ implicitly. For example, for the Lagrangian (19), a direct
substitution of $A_\mu + D_\mu$ for $A_\mu$ and selection of the second order terms shows
that in this case


$$L'_{YMM}(D) = \tfrac14 F_{\mu\nu}\cdot F_{\mu\nu} + \tfrac12 E_{\mu\nu}\cdot(D_\mu \times D_\nu) - \tfrac12 m^2 D_\mu\cdot D_\mu. \tag{63}$$
Here $E_{\mu\nu}$ is given by (5), a function just of $A$, while
$$F_{\mu\nu} = D_{\mu;\nu} - D_{\nu;\mu}, \tag{64}$$


with the definition of the covariant derivative being that appropriate to just the
external field $A$ given by (3). We here explicitly suppose that there is no matter
present in addition to the field in order to avoid inessential mathematical complications.
The reader can reinstate the terms representing matter and its
coupling to the field and verify that they have no effect on the sense of the
relations we shall prove. Therefore,


$$\text{CLYMM} = \int \exp\Big\{ i\int L'_{YMM}(D)\,d\tau \Big\}\,\mathcal{D}D. \tag{65}$$


In the same way for the Yang-Mills theory with mass with divergence terms left
out, so the propagator is $\delta_{\mu\nu}(q^2 - m^2)^{-1}$, the closed loop expression is


$$\text{CLYMMD} = \int \exp\Big\{ i\int [L'_{YMM}(D) + \tfrac12(D_{\mu,\mu})^2]\,d\tau \Big\}\,\mathcal{D}D. \tag{66}$$


We shall show that leaving out the divergence terms does make a difference: (66)
is not equal to (65). However, if we subtract from each loop of (66) the action
of a scalar particle, they do become equal. That is, we shall prove


$$\text{CLYMM} = \text{CLYMMD}/\text{CLSM} \tag{67}$$


where CLSM is the closed loop from a scalar particle of mass $m^2$;
$$\text{CLSM} = \int \exp\Big\{ -\frac{i}{2}\int (P_{;\mu}\cdot P_{,\mu} - m^2 P\cdot P)\,d\tau \Big\}\,\mathcal{D}P. \tag{68}$$


Since the right-hand side of (67) evidently has a limit as $m^2 \to 0$ (because all
propagators are nonsingular), we have proven that the left side does have a
limit even though the singular kernel makes that less than obvious. We may
therefore take the right side substituting $m^2 = 0$ (that is, CLYMD/CLS) as the
single closed loop action for the YM theory with zero mass. This is what we
wished to prove.

We do not really prove (67) but only that both sides are equal within a constant
(independent of $A$) normalizing factor (infinite as $m^2 \to 0$), but such a
constant factor in path integrals can be included in their definitions and has no
effect on the physics that results from them. We now proceed to a proof of (67).
First, in (65), replace $D_\mu$ by $D'_\mu$ and then substitute


$$D'_\mu = D_\mu + \alpha_{;\mu} \tag{69}$$


for arbitrary $\alpha$. Evidently, $\mathcal{D}D' = \mathcal{D}D$ and $F'_{\mu\nu} = F_{\mu\nu} - E_{\mu\nu} \times \alpha$ in virtue of
(6). Hence,


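$$\begin{aligned}
L'_{YMM}(D') = L'_{YMM}(D) &- \tfrac12(E_{\mu\nu} \times \alpha)\cdot F_{\mu\nu} + E_{\mu\nu}\cdot(D_\mu \times \alpha_{;\nu}) \\
&+ \tfrac14(E_{\mu\nu} \times \alpha)\cdot(E_{\mu\nu} \times \alpha) + \tfrac12 E_{\mu\nu}\cdot(\alpha_{;\mu} \times \alpha_{;\nu}) \\
&- m^2\alpha_{;\mu}\cdot D_\mu - \frac{m^2}{2}\alpha_{;\mu}\cdot\alpha_{;\mu}.
\end{aligned} \tag{70}$$
When calculating $\int L'_{YMM}(D')\,d\tau$, however, the third and fifth terms containing
$\alpha_{;\nu}$ can be integrated by parts, using (7) to produce


$$\int L'_{YMM}(D')\,d\tau = \int \{L'_{YMM}(D) - (D_\mu + \tfrac12\alpha_{;\mu})\cdot(m^2\alpha_{;\mu} + R_\mu \times \alpha)\}\,d\tau.$$
However, the external field $A$ is such that $R_\mu(A) = m^2 A_\mu$ (equation (20), with
$J_\mu = 0$, for we are omitting matter terms) and this substitution tells us


$$\text{CLYMM} = \int \exp\Big\{ i\int [L'_{YMM}(D) - m^2 D_\mu\cdot\alpha_{,\mu} - \tfrac12 m^2\alpha_{;\mu}\cdot\alpha_{,\mu}]\,d\tau \Big\}\,\mathcal{D}D \tag{71}$$


valid for any $\alpha$.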
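We use this transformation in the following way. First, multiply the original
expression (65) with $D$ called $D'$ by the path integral (a functional of $A$)


$$X = \int \exp\Big\{ \frac{i}{2}\int (m^2\alpha' + \alpha'_{;\mu,\mu})^2\,d\tau \Big\}\,\mathcal{D}\alpha' \tag{72}$$
to get
$$X\cdot\text{CLYMM} = \int \exp\Big\{ i\int [L'_{YMM}(D') + \tfrac12(m^2\alpha' + \alpha'_{;\mu,\mu})^2]\,d\tau \Big\}\,\mathcal{D}D'\,\mathcal{D}\alpha'. \tag{73}$$
Now define the linear operator $G$:


$$G\alpha' = \alpha'_{;\mu,\mu} + m^2\alpha' \tag{74}$$
to substitute $\alpha'$ by
$$\alpha' = \alpha - G^{-1}D'_{\mu,\mu}, \tag{75}$$


so $G\alpha' = G\alpha - D'_{\mu,\mu}$ and $\mathcal{D}\alpha' = \mathcal{D}\alpha$, so that we get
$$X\cdot\text{CLYMM} = \int \exp\Big\{ i\int [L'_{YMM}(D') + \tfrac12(D'_{\mu,\mu} - m^2\alpha - \alpha_{;\mu,\mu})^2]\,d\tau \Big\}\,\mathcal{D}D'\,\mathcal{D}\alpha.$$


Now substitute $D'$ via (69) and use the algebraic steps leading from (65) to (71)
to get


$$\begin{aligned}
X\cdot\text{CLYMM} &= \int \exp\Big\{ i\int [L'_{YMM}(D) - m^2 D_\mu\cdot\alpha_{,\mu} - \tfrac12 m^2\alpha_{;\mu}\cdot\alpha_{,\mu} + \tfrac12(D_{\mu,\mu} - m^2\alpha)^2]\,d\tau \Big\}\,\mathcal{D}D\,\mathcal{D}\alpha \\
&= \int \exp\Big\{ i\int [L'_{YMM}(D) + \tfrac12(D_{\mu,\mu})^2 - \tfrac12 m^2\alpha_{;\mu}\cdot\alpha_{,\mu} + \tfrac12 m^4\alpha^2]\,d\tau \Big\}\,\mathcal{D}D\,\mathcal{D}\alpha.
\end{aligned} \tag{76}$$
The integral on $D$, $\alpha$ has again separated, that on $D$ being just CLYMMD and
that on $\alpha$ (putting $\alpha = P/m$, and changing normalization constants, so $\mathcal{D}\alpha \to \mathcal{D}P$)
is just CLSM of (68), so that in (76) we have shown that


$$X\cdot\text{CLYMM} = \text{CLYMMD}\cdot\text{CLSM}. \tag{77}$$
To finish the proof of (67), we must merely show that $X$ defined in (72) is


$$X = (\text{CLSM})^2. \tag{78}$$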




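This can be seen by noting, with the definition (74),


$$X = \int \exp\Big\{ \frac{i}{2}\int (\alpha GG\alpha)\,d\tau \Big\}\,\mathcal{D}\alpha \tag{79}$$


while
$$\text{CLSM} = \int \exp\Big\{ \frac{i}{2}\int (PGP)\,d\tau \Big\}\,\mathcal{D}P. \tag{80}$$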


Gaussian integrals are proportional to the reciprocal square root of the deter- 
minant of the coefficient matrix in the quadratic exponent. Therefore, (80) is 
proportional to the —1/2 power of the functional determinant of G, while for 
(79) it is G?. But the determinant of the square of a matrix is the square of the 
determinant of the matrix so that, except for an unimportant constant factor, 
as far as dependence on A is concerned, equation (78) holds, and equation (67) 
is established. 
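
The determinant step can be illustrated with ordinary matrices. This is only a sketch under stated assumptions: the functional operator $G$ is replaced by a hypothetical finite symmetric positive-definite matrix, and the oscillatory factor $i$ is dropped (a real Gaussian is used), which changes only $A$-independent constants. The names `gaussian_weight`, `clsm_factor` and `x_factor` are illustrative, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Finite-dimensional stand-in for the operator G (assumption: symmetric, positive definite).
B = rng.normal(size=(n, n))
G = B @ B.T + n * np.eye(n)

def gaussian_weight(M):
    """det(M)^(-1/2): the M-dependent factor of the Gaussian integral of exp(-x.M.x/2)."""
    return np.linalg.det(M) ** -0.5

clsm_factor = gaussian_weight(G)       # analogue of (80), quadratic form P G P
x_factor = gaussian_weight(G @ G)      # analogue of (79), quadratic form alpha G G alpha

# det(G G) = det(G)^2, so x_factor should equal clsm_factor squared, as in (78).
ratio = x_factor / clsm_factor**2
```

In this toy setting `ratio` comes out equal to one up to rounding, which is the matrix counterpart of $X = (\text{CLSM})^2$.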

We may now see what is going on if we attempt to calculate CLYM directly
for $m^2 = 0$. When we do, the integrals on $D_\mu$ usually have a convergent integral
but now there is one “direction” into which we may move $D_\mu$, namely, $D_\mu \to D_\mu + \alpha_{;\mu}$,
which does not affect the integrand (see equation (71) for $m^2 = 0$)
so the integral in this “direction” diverges. We attempt to put a convergence
factor on it of the form $m^2 D_\mu\cdot D_\mu$ and everything turns out to be all right as
$m^2 \to 0$. But if instead we try $\tfrac12(D_{\mu,\mu})^2$ as a convergence factor (CLYMD), we are
controlling the integral in a way that depends on $A$, and we shall have to divide
by CLS to take out this dependence.

As a final note, we give another way of dealing with things that appears to be
simpler and to make the gauge invariance of the final result more obvious. On
the other hand, however, it seems not to give the same result. Suppose, first, we
form a different version of YMD, say YMD', by modifying $L'_{YMM}$ of (63) by
adding the more obviously covariant $\tfrac12 D_{\mu;\mu}\cdot D_{\nu;\nu}$, instead of $\tfrac12 D_{\mu,\mu}\cdot D_{\nu,\nu}$ as in (66).
Next, suppose that in calculating CLYMM from (65), the external field $A_\mu$ is
calculated for the case $m^2 = 0$, and is not altered when we consider adding the
mass term $m^2 D_\mu\cdot D_\mu$ and later varying $m^2$ toward its limit zero. Call this CLYMM'.
This means that we use $R_\mu(A) = 0$ (rather than $R_\mu(A) = m^2 A_\mu$) in the steps
after (70). Then in (72) and (74) replace $\alpha_{;\mu,\mu}$ by $\alpha_{;\mu;\mu}$, and in (75) use $D_{\mu;\mu}$ for
$D_{\mu,\mu}$. What one finally finds is that


$$\text{CLYMM}' = \text{CLYMMD}'/\text{CLSM}' \tag{67'}$$
where
$$\text{CLYMMD}' = \int \exp\Big\{ i\int [L'_{YMM}(D) + \tfrac12(D_{\mu;\mu})^2]\,d\tau \Big\}\,\mathcal{D}D \tag{66'}$$
and
$$\text{CLSM}' = \int \exp\Big\{ -\frac{i}{2}\int [P_{;\mu}\cdot P_{;\mu} - m^2 P\cdot P]\,d\tau \Big\}\,\mathcal{D}P. \tag{68'}$$


The quantity CLYMM' is calculated from an expression that looks exactly like
(65) except $A_\mu$ is different. The $A_\mu$ is calculated with $m^2 = 0$; it satisfies $R_\mu(A) = 0$.
Now the result on the right-hand side of (67') has a limit for $m^2 \to 0$, namely
just put $m^2 = 0$ in the Lagrangians. One is tempted to define this expression for
$m^2 = 0$ as the value for one closed loop for the YM theory; it is more evidently
gauge invariant than our previous expression, the right side of (67) with $m^2 = 0$.
However, it is not evident that this limit of (67') is the same as the limit (67).
Direct calculations of examples indicate that they are not the same. Expression 
(67) seems more satisfactory because it comes from the trees argument, and it is 
the limit of a complete theory (YMM) using the same rules for diagrams for all 
lines whether they be external or are on a loop. In the gravitational theory, in 
which the method of including a mass in the original equations is not completely 
obvious (although it is in the second order Lagrangian L’), it is much easier to 
find and define the analogue of (67’) (see the Appendix). 


PROCESSES WITH MORE THAN ONE CLOSED LOOP 


Processes with more than one closed loop can probably be defined and calcu- 
lated in terms of lower order processes by using unitarity to obtain the imaginary 
part and a dispersion relation to obtain the real part. Since the lower order 
processes (even up to one loop) have been successfully defined, probably a 
complete theory now exists, at least in principle. However, I have tried to find 
an explicit formula for these higher processes in the massless case in terms of 
rules for diagrams so that an explicit expression for a diagram of any complexity 
can be written down. I have not yet succeeded in doing this, but should like to 
report in this section on a few observations and relations noticed in attempting 
to do this. 

The problem is to find a definition of closed loop diagrams for $m^2 = 0$ (say,
for definiteness, one with two closed loops), so that the result is consistent with
unitarity and also with gauge invariance. The latter condition takes the form of
the condition that an additional perturbing potential that is a pure gradient,
$\delta A_\mu = \nabla_\mu\alpha$, will have no effect in first order.

We cannot define the answer directly as a path integral, for that integral is
undefined because of the singular kernel ($m^2 = 0$). We expect, by analogy to the
one loop case, that we could give the answer in terms of the YMD theory, which 
is not singular, with some modification (like subtracting the contribution from 
a scalar particle) but I do not know what the modification must be in the general 
case. 

Alternatively, we could try to define the answer (for two loops, say) by putting 
together lower diagrams with one loop, according to the principles of [I]. How- 
ever, here the parts which must be assembled (see [I], last section) are not com- 
plete sets of diagrams for a complete process in lower order. For a set of 
diagrams which is not a complete process, the result of the gradient potential is 
not zero but is somewhat more complicated, so it becomes difficult to insure 
gauge invariance. I have not yet carried it through carefully to see where this 
leads. 

The attempt that I did try at first, and the only one for which the study is
virtually complete, was to try to define the general result as the limit of YMM
as $m^2 \to 0$. First I explicitly calculated the result for two connected loops (as in
the Greek letter $\theta$) in an external potential for the Yang-Mills field with $m^2 \neq 0$
using the propagator (23) or rather its analogue for propagation in an external
field $A_\mu$. I then compared the result to what one calculates with propagator
analogous to $\delta_{\mu\nu}(q^2 - m^2)^{-1}$ (that is, of YMMD in an external potential) and
calculated the difference. The idea was to find an expression for the difference
so I would know just how YMMD must be modified to get YMM. If this
modification had a limit as $m^2 \to 0$, I would find the natural modification of
YMD needed to define and calculate YM.

For two coupled loops there are three propagators, so the $q_\mu q_\nu/m^2$ of equation
(23) leads at first sight to terms of order $m^{-6}$, and having no limit, as $m^2 \to 0$.
But we learned in the one loop case that the $m^{-2}$ term can be simplified, and in
fact does have a limit. I have succeeded in simplifying the $m^{-6}$ terms and have
gotten rid of terms of order $m^{-6}$ and $m^{-4}$ but am still left with a term of order
$m^{-2}$ which I am unable to simplify further. I conclude that for two coupled
loops there is no limit as $m^2 \to 0$, and therefore, in general, the Yang-Mills
theory as usually formulated and interpreted in diagrams in the usual way is a
theory that does not have a limit as $m^2 \to 0$ and that such a limit cannot serve
as a definition of YM. (By Yang-Mills with mass, we mean to add simply
$-\tfrac12 m^2 A_\mu\cdot A_\mu$ to the Lagrangian. This is arbitrary; perhaps other terms depending
on $m$ might be added to make a better definition, and one that does yield a
limiting theory as $m^2 \to 0$.)

I must therefore return to other methods of defining YM for $m^2 = 0$; I have
not yet done this.

During the course of investigating these matters, I did find a way to express
the $m^{-2}$ terms which I was able to generalize to all orders of closed loops in the
complete theory. What I did first was to express the $m^{-2}$ terms in terms of the
YMMD propagator so that limits could be easily studied, but I could not guess
at the general form of this expression for loops of arbitrary complexity. However,
when I expressed these $m^{-2}$ terms in terms of the YMM propagator I was
able to see the generalization. This relation, however, was no longer in a form
that permits one to see that the limit surely does not exist, but it is nevertheless
interesting, and we discuss it in the next section.


RELATION OF YMMD TO YMM 


Ideally we should like to express the complete YMM in terms of YMMD
with some modification, and study the modification as $m^2 \to 0$. The expression
may be difficult because in first order it says that to find YMM one
should subtract from YMMD the result of a scalar particle loop. This subtraction
process may be hard to define and find in higher order. I thought, therefore,
that since this also says that in first order YMMD is equivalent to the action
of a pure vector particle YMM plus a scalar particle, such a relation is physical
and might be generalized. Therefore, although it is not what I really desired, I
was led to see whether I could express YMMD as the action of a vector and a
scalar particle (both acting nonlinearly, and in interaction). It is possible and
the result is given here.

We shall no longer separate out tree diagrams and closed loop diagrams, but
consider the theory as a whole in all orders as a path integral.

To do this, we first express the amplitude for the Yang-Mills theory with
$m^2 \neq 0$ and with the divergence term added (omitting other matter for simplicity
of exposition) as a path integral, as in equation (33),


$$\text{amp (YMMD)} = \int \exp\Big\{ i\int [L_{YM}(A) + \tfrac12(A_{\mu,\mu})^2 - \tfrac12 m^2 A_\mu\cdot A_\mu]\,d\tau \Big\}\,\mathcal{D}A. \tag{81}$$


We may multiply by the inessential constant
$$\text{const} = \int \exp\Big\{ -\frac{i}{2}\int m^2 P^2\,d\tau \Big\}\,\mathcal{D}P \tag{82}$$


where $P$ is a space-scalar isospin-vector field, to get


$$\text{amp (YMMD)} = \int \exp\Big\{ i\int [L_{YM}(A) + \tfrac12(A_{\mu,\mu})^2 - \tfrac12 m^2 A_\mu\cdot A_\mu - \tfrac12 m^2 P^2]\,d\tau \Big\}\,\mathcal{D}A\,\mathcal{D}P. \tag{83}$$


We shall now make some transformations of the integrand in the exponent.
First we substitute $P \to P - m^{-1}A_{\mu,\mu}$ (this does not change $\mathcal{D}P$) and perform
an integration by parts to convert this integrand to


$$L_{YM}(A) - \tfrac12 m^2 A_\mu\cdot A_\mu - mA_\mu\cdot P_{,\mu} - \tfrac12 m^2 P^2. \tag{84}$$
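
The step from (83) to (84) is a completion of the square. As a sketch, written in the comma notation for ordinary derivatives and assuming the coefficients $+\tfrac12$ and $-\tfrac12 m^2$ of (83), the shift $P \to P - m^{-1}A_{\mu,\mu}$ works out as follows:

```latex
\begin{align*}
-\tfrac12 m^2 \bigl(P - m^{-1}A_{\mu,\mu}\bigr)^2
  &= -\tfrac12 m^2 P^2 + m\,P\cdot A_{\mu,\mu} - \tfrac12 (A_{\mu,\mu})^2, \\
\tfrac12 (A_{\mu,\mu})^2 - \tfrac12 m^2 \bigl(P - m^{-1}A_{\mu,\mu}\bigr)^2
  &= -\tfrac12 m^2 P^2 + m\,P\cdot A_{\mu,\mu}, \\
\int m\,P\cdot A_{\mu,\mu}\,d\tau
  &= -\int m\,A_\mu\cdot P_{,\mu}\,d\tau .
\end{align*}
```

The divergence term of (83) is cancelled by the square, and the integration by parts in the last line (boundary terms dropped) gives the cross term $-mA_\mu\cdot P_{,\mu}$ of (84).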


Now we shall make a finite gauge transformation of $A_\mu$ to $A'_\mu$, using the notation
of equation (16) with $a_\mu = \tau\cdot A_\mu$ with


$$\sigma = \exp(-im^{-1}\tau\cdot P). \tag{85}$$


This leaves the expression for $L_{YM}(A)$ unchanged, or $L_{YM}(A) = L_{YM}(A')$. The
volume element in the path integral space remains unchanged in form, because
the addition of $T_\mu$ does not change the differential $\mathcal{D}A$ and the rotation of $\sigma$
among the isospin components at each space time point makes no change for
we are integrating with respect to all of the components. The expressions $A_\mu\cdot A_\mu$
and $A_\mu\cdot P_{,\mu}$ are changed, however, and we are left in the integrand with


$$\begin{aligned}
L_{YM}(A') &- \tfrac12 m^2 A'_\mu\cdot A'_\mu + m^2 A'_\mu\cdot\sigma(T_\mu - m^{-1}P_{,\mu})\sigma^{-1} \\
&- \tfrac12 m^2 T_\mu\cdot T_\mu + mT_\mu\cdot P_{,\mu} - \tfrac12 m^2 P^2.
\end{aligned} \tag{86}$$


Next we expand $T_\mu$ explicitly as in (17') but with $\alpha = -m^{-1}P$. The quantity
$\sigma P_{,\mu}\sigma^{-1}$ is similarly expanded just as in (15'). More strictly, we should have
written the term $A'_\mu\cdot\sigma(T_\mu - m^{-1}P_{,\mu})\sigma^{-1}$ as $\mathrm{Tr}(a'_\mu\sigma(t_\mu - m^{-1}\tau\cdot P_{,\mu})\sigma^{-1})$ but we
hope the meaning was clear. We obtain, after a number of integrations by parts,
the result that the integrand in the exponent of (83) may be replaced by (dropping
the prime on $A'_\mu$)


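$$L_{YM}(A) - \tfrac12 m^2 A_\mu\cdot A_\mu - \frac{m^2}{2}P\cdot P + P_{;\mu}\cdot\Big\{\tfrac12 P_{,\mu} + \frac{1}{3m}P \times P_{,\mu} + \frac{1}{8m^2}P \times (P \times P_{,\mu}) + \cdots\Big\}. \tag{87}$$

Here $P_{;\mu} = P_{,\mu} - A_\mu \times P$ and the series continues, with the $n$-th term being
$$P_{;\mu}\cdot\{((n + 2)\,n!\,m^n)^{-1}(P \times)^n P_{,\mu}\}. \tag{88}$$


The first two terms are the Lagrangian $L_{YMM}(A)$ of the usual Yang-Mills theory
with mass. The remaining terms are the nonlinear Lagrangian of a spin zero
isospin vector particle of mass $m^2$ coupled to the vector field $A_\mu$ and to itself.
The Lagrangian of this field is


$$L_{\text{SCALAR }m}(P) = -\tfrac12 m^2 P\cdot P + P_{;\mu}\cdot\Big\{\tfrac12 P_{,\mu} + \frac{1}{3m}P \times P_{,\mu} + \frac{1}{8m^2}P \times (P \times P_{,\mu}) + \cdots\Big\}. \tag{89}$$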


This is what we set out to prove in this section. 
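
The numerical factors in the series can be read off from the general term (88). As a small sketch (the function name `series_coefficient` is illustrative, not from the text), the factor $1/((n+2)\,n!)$ multiplying $m^{-n}$ can be tabulated with exact fractions:

```python
from fractions import Fraction
from math import factorial

def series_coefficient(n):
    """Numerical factor 1/((n + 2) n!) multiplying m^(-n) in the n-th term of (88)."""
    return Fraction(1, (n + 2) * factorial(n))

# Factors for the terms with zero, one, two and three cross products with P.
coefficients = [series_coefficient(n) for n in range(4)]
```

The first three entries are 1/2, 1/3 and 1/8, matching the terms written out in (87) and (89); the next term of the series would carry $1/(30m^3)$.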

The expression to third order (and partly to fourth) has been confirmed by 
actual calculations on double loops. In fact, it was from such calculations that 
the form of the third and higher terms were guessed, and the proof was only 
found after the form was known. 

When matter is present, no change in the scalar particle Lagrangian results; 
it couples purely to itself and A,. 

To second order in $P$, the Lagrangian is precisely equation (61). Thus, for a
single loop, where this order is sufficient, we have that YMMD is YMM plus
a scalar particle (61), or YMM is YMMD minus this particle, a result which we
proved before in another way. To this order, we can take the limit $m^2 \to 0$.

The theorem to all orders is not useful for the case $m^2 \to 0$, for the expression
(87) is meaningless in this limit. Since YMMD does have a limit as $m^2 \to 0$, this
suggests (but perhaps does not directly prove) that YMM has no such limit.

To answer such questions, it would be much more satisfactory to have an
expression for YMM in terms of YMMD and modifications instead of the
other way around, but this I have not derived.

At least one can prove from (89) that terms in perturbation theory having
just two closed loops in the form of a $\theta$ do not have a limit in the Yang-Mills
theory with mass as $m^2 \to 0$. Direct calculation of these terms has confirmed
this result.


APPENDIX 


The quantization of the gravitational field presents us with problems nearly
completely analogous to those of the Yang-Mills theory of the text. We point
out some of these analogies in outline in this appendix in which the gravitational
field equations are developed following the text development for the Yang-Mills
field. The equations will be numbered as (na) where (n) is the analogous text
equation. We shall formulate the gravitational field in the special case that it is
in interaction with a scalar matter field (our problems are independent of the
kind of matter used$^3$).


$$L_{\text{matter}} = \mathcal{L}(g, \phi) = \sqrt{g}\,\tfrac12(g^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - m^2\phi^2) \tag{1a}$$


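where $g_{\mu\nu}$ is the metric tensor, $g$ its determinant, and $g^{\mu\nu}$ its reciprocal. The field
Lagrangian is that of Einstein:


$$L_G = \mathcal{R}(g) \tag{4a}$$


where $\mathcal{R}$ is the scalar curvature tensor density or $\mathcal{R} = \mathcal{R}_{\mu\nu}g^{\mu\nu}$ with
$\mathcal{R}_{\mu\nu} = \sqrt{g}\,(R_{\mu\nu} - \tfrac12 g_{\mu\nu}R^\sigma{}_\sigma)$, where $R_{\mu\nu} = R^\sigma{}_{\mu\nu\sigma}$ and $R^\sigma{}_{\mu\nu\tau}$ is the Riemann curvature tensor.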
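This Lagrangian is invariant under a general coordinate transformation $x^\mu \to f^\mu(x^\nu)$,
in which transformation the field quantities $g_{\mu\nu}$ are transformed to


$$g'_{\alpha\beta} = g_{\mu\nu}(f)\,f^\mu{}_{,\alpha}f^\nu{}_{,\beta}. \tag{16a}$$


We must substitute $f^\mu(x)$ for $x^\mu$ in the functional form of $g_{\mu\nu}$. Commas denote
ordinary differentiation. For infinitesimal transformations, $f^\mu = x^\mu + \eta^\mu(x)$,
this becomes

$$g_{\mu\nu} \to g_{\mu\nu} + \eta^\sigma{}_{,\mu}g_{\sigma\nu} + \eta^\sigma{}_{,\nu}g_{\mu\sigma} + \eta^\sigma g_{\mu\nu,\sigma} \tag{2a}$$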


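to first order in $\eta$. We call (2a) a gauge transformation. We define the covariant
derivative of a vector (which thereby becomes a tensor) as


$$A^\mu{}_{;\nu} = A^\mu{}_{,\nu} + \left\{{\mu \atop \nu\sigma}\right\}A^\sigma \tag{3a}$$


in the usual way (including its usual generalization to differentiating higher
tensors). For example, if $S^{\mu\nu}$ is a tensor density,
$$S^{\mu\nu}{}_{;\nu} = S^{\mu\nu}{}_{,\nu} + \left\{{\mu \atop \sigma\tau}\right\}S^{\sigma\tau}.$$
Here,


$$\left\{{\mu \atop \sigma\tau}\right\} = \tfrac12 g^{\mu\nu}[g_{\sigma\nu,\tau} + g_{\nu\tau,\sigma} - g_{\sigma\tau,\nu}].$$


The Riemann tensor can then be defined by
$$A^\mu{}_{;\nu\sigma} - A^\mu{}_{;\sigma\nu} = R^\mu{}_{\tau\nu\sigma}A^\tau. \tag{6a}$$


The variation of $\int L_G(g)\,d\tau$ with respect to $g_{\mu\nu}$ is $-\mathcal{R}^{\mu\nu}(g)$. The fact that
$\int L_G\,d\tau$ is invariant under the transformation (2a) implies that the tensor density
$\mathcal{R}^{\mu\nu}$ satisfies


$$\mathcal{R}^{\mu\nu}{}_{;\nu} = \mathcal{R}^{\mu\nu}{}_{,\nu} + \left\{{\mu \atop \sigma\tau}\right\}\mathcal{R}^{\sigma\tau} = 0. \tag{8a}$$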
T 


$^3$ This coupling is not unique. For a full discussion of this question and for discussions of
coupling to other matter fields (for example, spin ½), see Elisha Huggins, Ph.D. thesis,
California Institute of Technology.

By integrating by parts, the Lagrangian may also be written as


fuine-[veL deal 


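This $\mathcal{R}^{\mu\nu}$ is a nonlinear operator. If $g_{\mu\nu}$ is expanded as $\delta_{\mu\nu} + h_{\mu\nu}$, the part $\mathcal{R}_0^{\mu\nu}$
of $\mathcal{R}^{\mu\nu}$ linear in $h_{\mu\nu}$ comes from the quadratic part of the Lagrangian


$$L_{2G}(h) = \tfrac12(h_{\mu\nu,\sigma}\bar h_{\mu\nu,\sigma}) - \bar h_{\mu\sigma,\sigma}\bar h_{\mu\nu,\nu} \tag{9a}$$


where we define the bar operation on a tensor (now, in the Minkowski sense,
not a tensor in the fully covariant sense) by


$$\bar C_{\mu\nu} = \tfrac12(C_{\mu\nu} + C_{\nu\mu} - \delta_{\mu\nu}C_{\sigma\sigma})$$


(repeated, the bar operation on a symmetric tensor restores the original tensor).
Thus


$$\mathcal{R}_{0\mu\nu}(h) = \bar h_{\mu\nu,\sigma\sigma} - \bar h_{\mu\sigma,\sigma\nu} - \bar h_{\nu\sigma,\sigma\mu} + \delta_{\mu\nu}\bar h_{\sigma\tau,\sigma\tau}$$
or in momentum space
$$-\mathcal{R}_{0\mu\nu}(h) = (q^2\delta_{\mu\sigma}\delta_{\nu\tau} - q_\nu q_\tau\delta_{\mu\sigma} - q_\mu q_\sigma\delta_{\nu\tau} + \delta_{\mu\nu}q_\sigma q_\tau)\bar h_{\sigma\tau}. \tag{11a}$$


This operation is singular and is without inverse because identically


$$\mathcal{R}_{0\mu\nu,\nu}(h) = 0$$
or
$$q_\nu\mathcal{R}_{0\mu\nu} = 0.$$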

The first variation with respect to $g_{\mu\nu}$ of $L_{\text{matter}}$ from (1a) we shall call $T^{\mu\nu}$.
The full equations for the gravitational field with matter present become then


$$\mathcal{R}^{\mu\nu}(g) = T^{\mu\nu}(g, \phi). \tag{14a}$$


The equations for the motion of the field $\phi$ obtained from variation of $L_{\text{matter}}$
with respect to $\phi$ (namely, $g^{\mu\nu}\phi_{;\mu\nu} + m^2\phi = 0$) imply that


$$T^{\mu\nu}{}_{;\nu} = 0, \tag{13a}$$


a condition without which (14a) could not make sense in view of the identity
(8a).

To make a quantum theory, we would expect at first to write $\exp i\int L\,d\tau$ and
integrate over all $\mathcal{D}g\,\mathcal{D}\phi$ as a path integral in the usual way (the integral over
$g_{\mu\nu}$ at a point is $g^{-5/2}$ times the product of the ten $dg_{\mu\nu}$). But this leads to the
difficulties described for the YM case in the text. One can easily convert it into
rules for diagrams in various orders. One way is to substitute $g_{\mu\nu} = \delta_{\mu\nu} + h_{\mu\nu}$
and expand in powers of $h_{\mu\nu}$. The second order in $h$ leads to the singular propagator
(11a), so if the other terms are called sources we have


$$\mathcal{R}_{0\mu\nu}(h) = S_{\mu\nu}.$$

Other terms are represented as interactions between matter and gravity in the
usual way. For example, matter is coupled in first order to $h_{\mu\nu}$ via a term
$\bar h_{\mu\nu}\phi_{,\mu}\phi_{,\nu} - \tfrac12 m^2\bar h_{\sigma\sigma}\phi^2$, which leads to a rule that matter, in going from momentum
$p^1$ to $p^2$, is coupled to one graviton of tensor $h_{\mu\nu}$ by an amplitude


$$p^1_\mu p^2_\nu - \tfrac12\delta_{\mu\nu}(p^1\cdot p^2 - m^2).$$


There are new coupling terms of every order; that is, of junctions of every
number of lines. We have made many calculations in this way to study the nature
of the divergences of this theory. Here, however, we do not discuss these, but
only problems in formulating the theory which arise from the singular nature
of the operator $\mathcal{R}_{0\mu\nu}$ of (11a).

A free graviton entering or leaving has a symmetrical tensor polarization e,, 
and a momentum gq, satisfying #2,,(4) = 0 or (see equation (11a)) 


qe yy ~~ WIC _ WwW + 890201 = 0, (34a) 
If it is physical, $q^2 = 0$ and (34a) says it is also transverse,
Tew = 0. (35a) 


On the other hand, (34a) has solutions even if $q^2 \neq 0$, namely a quantum partly
polarized in the direction $q_\mu$, the gradient of a vector $\xi_\mu$:


$e_{\mu\nu} = q_\mu\xi_\nu + q_\nu\xi_\mu$, (36a)


a form we shall call a pure gradient, and one which corresponds to a pure gauge 
transformation (compare equation 2a). Such a potential has no physical effect 
and always gives zero effect on running all diagrams containing it. 

To deal now with the second order propagator, #,,(h), we can discuss a 
theory in which we modify the second order terms in the original Lagrangian. 
The modification we shall consider is to add h,, ,4,¢,¢ to the Lagrangian so that 
it becomes simply 


Logp = Sis clio: (25a) 


In all these equations by this method, we no longer have complete general 
relativistic covariance. We adopt the convention that repeated indices on the 
same horizontal line are to be summed with the Minkowski metric $\delta_{\mu\nu}$. Indices
repeated one upper, one lower are summed in the conventional way. 

The field produced from a source is now simply 


Ayy = (1/9)S,»- (28a) 
Of course, if the source is divergenceless, (28a) implies 
$\bar{h}_{\mu\nu,\nu} = 0$, (22a)


so in that case our system is equivalent to the unmodified equation (14a). Further 
discussion exactly parallels the YM case. 

It appears, to get the single closed loop diagrams to work out satisfactorily 
if we add h,,,h,.,. to the gravity Lagrangian so that we may use (28a) as a 
propagator, we must add the effects of a vector particle, field vector $\eta^\mu$, going
around a loop and having the Lagrangian density 


20039 = ny + hemvnty — 9°" Tys,6 — higs x) (29a) 


where we have used the condition (22a). It is interesting that only terms of first
order in $h_{\mu\nu}$ appear here.
The tree diagrams we dealt with in the following way. The Lagrangian is 


L= Xg) + F(g, 4) 


and the equation of motion resulting from it is 


8/62, = uw(Z) a T Ag, $) = 0, 
SL/8h = Wb, 2) = —[Vee"'b]) + m'oVe = 0. 


(We shall henceforth write second derivatives like $(\phi_{,\mu})_{,\nu}$ as $\phi_{,\mu\nu}$, etc.) We think
of $g_{\mu\nu}$ as $\delta_{\mu\nu} + h_{\mu\nu}$ and split off the second order operator from the rest, calling


Rg) = Beg) — Kg), (39a) 
RF (zg) = 2g) = By0.0v — Zvo,0u + Sy vBer,0% (40a) 


(We have written 24’(h) as Z4"(g), for they are the same.) Note that 2’, = 0. 
We write Wid, g) = —L*d + md + 5(¢, g). Thus, Zep is just 1/0]? times the 
bar operation. We then construct trees by the iterative solution of 


Buy = Rap{7 (8) + K(g)} + Biv» (41a) 
$ = (CL? — m’)-Xs(g, 8)} + $°s (43a) 

where $g^0_{\mu\nu}$, $\phi^0$ represent appropriate asymptotic waves,
AB(B°) = R20(8°) + Braver + Bia,ou — SurBor,0r = 9. (44a) 


If one of the asymptotic gravitons E,, is left out of the asymptotic waves gf, 
the final tree amplitude is 


oes } E, (7g, 6) + K*(g, $)) dr. (45a) 


Multiplying (41a) by Zp shows that g,, satisfies 
Rg) = Tg) — 2C,,, + A380) (46a) 
where the last term is zero by (44a), and we have put 
Cy = Bya,0 — B¥a,0- (47a) 


Taking the covariant derivative of both sides of (46a) shows us that 


2(Cy,u):» =0= Cy yy + 2f . En (48a) 
oT 
We are solving this via (49a) as we show as follows: Taking the bar and then the
divergence of both sides of (41a), replacing K“” via (39a), we can show 
Cy, = OO %-BY + TY + BBY}. 
The last term vanishes, and the others may be re-expressed (since their covariant 


derivative vanishes) as -{," ke ot _ &%) which by (46a) leads us to 
T 


fone -o-e{ end (49a) 


so $C_\nu$ is zero everywhere as it is zero asymptotically.

To show that the tree result does not depend on g?, if it is changed by a first 
order gauge transformation, g?, >g°, + é,, + §,,, We show that a pure 
gradient for the last line £,, = €,, + &,,y has no effect in (45a). Integration by 
parts gives 


ie -2 | f(T” + K*), dr, (50a) 
which can be re-expressed as 
— 4] af # joe eZ (Sia) 
C T 
since C, vanishes. 
When gf, is changed by &, , + §,,u, Zyy is changed by a gauge transformation 
(2a) OF Byy > Buy + Xuv + Xu — 2f 1 hye with 7” = g#*y,. The condition 
C, = 0, or ,»,y = Zev,y means that the y is related to the € via yyw = Ey + 


2({ }xe)o which is solved by iteration of 


ve = OH A({ hae) op + be (52a) 


an equation adjoint to (60a). 

For single closed loops, the yp theory does not agree with expectations when 
the loop is opened just as for the analogous YM case. To deal with single closed 
loops by closing a tree, we shall have to close a tree with an external gradient 
(£,,. + &,v) and another external graviton, say b,,, where 


bss i 4é,. (55a) 


Now, if g?, contains all the physical external lines and also b,,, we have (note 
q? = 0, so #$5(b) = 0) 
Rg") = Rb) = — buy (56a) 


to put into (46a). We next define a new C, by adding a term 4€, to (47a). Now 
our tree is (Sla) but C, solves by iteration the equation 


co-lead (60a) 
If we choose g,,,, = 0, then {, | = 0 and the bar can be left off of C,.,.


It appears that this tree would be the tree coming from the closed loop dia- 
gram of a field 7* coming from an action 


[ {soon + 18 v0,2 — eral} dr (see 29a) 


where we explicitly assume that $\bar{g}_{\mu\nu,\nu} = 0$ and where the $\sigma$, $\tau$ are summed covariantly
but the index $\nu$ is summed by Minkowski’s metric. The first variation
of the action with respect to $\eta^{\nu}$ is


T 
—8ur (ri. + 2 jn) 
Cc v 


as required to generate equation (60a) with C, = 7”. The Lagrangian of this 
action can also be written 


L, = nyt yw (29a) 


We now give the equations analogous to those in the section “Formal Theory
of Simple Loops.” We imagine the tree fields as external fields $g_{\mu\nu}$, and work out
the contribution of one extra closed loop. That is (see 62), we replace $g_{\mu\nu}$ by
$g_{\mu\nu} + H_{\mu\nu}$ in the Lagrangian and expand to second order in $H_{\mu\nu}$ (we omit the
matter field here). The second order Lagrangian which results is


Lg = | Ve dr(— AY AZ, + te" AY Hise 7 AHR, 
+ AY HR yy). (63a) 


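Here, $g_{\mu\nu}$ is a fixed external field, $R_{\mu\nu\sigma\tau}$ is the curvature tensor of the field $g_{\mu\nu}$
and we define for any tensor $X_{\mu\nu}$,


$\bar{X}_{\mu\nu} = \tfrac{1}{2}\left(X_{\mu\nu} + X_{\nu\mu} - g_{\mu\nu} X^{\sigma}{}_{\sigma}\right)$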


and all raising and lowering of indices, and Christoffel symbols in the definition 
of the covariant derivative, is with the metric tensor $g_{\mu\nu}$. This is the Lagrangian
of a tensor field in curved space. For example, it would be the proper Lagrangian 
for discussing weak gravity wave propagation in an otherwise strong given field; 
that is to say, it would describe weak gravitational effects in a curved space. 

Now we should like to use the trick of adding a mass term to the original 
gravity Lagrangian and we need the analogue $L_{GM}$ of (19). There is no obvious
and unique way to add the term in the gravitational case, but we have found the 
answer (corresponding to our particular way of modifying the gravity propa- 
gator) by the supposition that the term must be of such a form that the field 
condition 


$\bar{g}_{\mu\nu,\nu} = 0$ (22a)
should be a consequence of the equations of motion. This will occur if we take 


$L_{GM} = L_G - M^2 g_{\sigma\sigma}$ (19a)
(so $-M^2\int g_{\sigma\sigma}\,d\tau$ is added to the action), for then the equations of motion
become 


$\mathcal{R}^{\mu\nu}(g) + M^2\delta_{\mu\nu} = T^{\mu\nu}(g)$. (20a)


So that taking the covariant divergence of both sides, we find 


Buy. +{ } 24. = 0, 
oT 


which implies (22a) since ea = g"*(Z,,.,). This mass term is first order in 


$g_{\mu\nu}$ and does not affect the second order Lagrangian so that $L_{GM} = L_G$. This is
not analogous to the YM case, but for uniformity of notation we continue to
call it $L_{GM}$, although it equals $L_G$.

The theory $L_{GMD}$ will add a divergence term to $L_{GM}$ so that the propagator is
nonsingular,


Loup = Lg + | dr(g** Hayy g0,0): 


The closed loop contributions are taken as path integrals over these Lagrangians
over the fields $H_{\mu\nu}$ with zero asymptotic value (vacuum to vacuum). The loop
with the mass term will differ from that with the cancelled divergence terms by a
factor equal to the action of a loop of a vector particle. We shall prove


CLGM = CLGMD/CLVM (67a) 


where CLVM is the closed loop of a vector particle of mass $M^2$ of Lagrangian
Ly = | del2n%iew — Mand) (68a) 


The path integral on $\mathcal{D}\eta$ means $\sqrt{g}$ times the product of the four differentials
$d\eta^{\mu}$ at each space-time point.

The right side of (67a) has a limit as $M^2 \to 0$. We take that limit, leaving out
the mass terms in the Lagrangian, to define the closed loop theory for gravitation.
Namely, find loops via the Lagrangian (63a), omitting the first term (the
divergence term). Then correct it by subtracting the action of a massless vector 
particle with Lagrangian density 77 yjo;y- 

(If matter were present, the extra term in the Lagrangian in second order in
$H_{\mu\nu}$ is


+{ (¢,06,[4Hi — ig" H’'H,,] + 4m26?HH,,)V 8 dr 


where ¢ is the external matter field. To deal with the matter loops we need to 
substitute 6 = ¢ + X and expand to second order in X. The Lagrangian second 
order in X is 4] V2 (g#”X,X — m?X?) dr and the term linear in X¥ and H is 
| (—H°$,,X.y — (m?/2)HebX] Ve dz.) 

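Under an infinitesimal coordinate transformation, (2a), $g_{\mu\nu} + H'_{\mu\nu}$ becomes
$g_{\mu\nu} + H_{\mu\nu} + \eta^{\sigma}{}_{,\mu} g_{\sigma\nu} + \eta^{\sigma}{}_{,\nu} g_{\sigma\mu} + \eta^{\sigma} g_{\mu\nu,\sigma}$ to first order in $H$ and $\eta$. This can be
taken as a change of $H$ and of $X$,


$H'_{\mu\nu} = H_{\mu\nu} + \eta_{\mu;\nu} + \eta_{\nu;\mu}$ (69a)
and
$X' = X + \eta^{\sigma}\phi_{,\sigma}$.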


Now if we make this substitution in the Lagrangian, we can show after a 
number of steps analogous to (70a) that the entire action is changed by 


—2My*,(Hox a Tho: u): (71a) 


Since this is a rather complicated calculation and the result is not directly 
obvious by analogy to YM, we outline a way it can be carried out abstractly 
(this method would, of course, work just as well for the YM case). Inclusion of 
matter terms only complicates the algebra and does not alter the result, so we 
omit matter. The Lagrangian density for the field will be written $L_G - M^2 g_{\sigma\sigma}$ to
include the mass term. Putting $g_{\mu\nu} \to g_{\mu\nu} + H_{\mu\nu}$ changes this by


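$H_{\mu\nu}\,\delta L_G/\delta g_{\mu\nu} - M^2 H_{\sigma\sigma}$
in first order and by
$L_{GM} = \tfrac{1}{2} H_{\mu\nu} H_{\sigma\tau}\,\delta^2 L_G/\delta g_{\mu\nu}\,\delta g_{\sigma\tau}$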


in second order. The first order vanishes if we assume that the external field $g_{\mu\nu}$
satisfies the gravitational equations of motion with mass, (20a) with $T^{\mu\nu} = 0$,
$\delta L_G/\delta g_{\mu\nu} = -\mathcal{R}^{\mu\nu}(g) = +M^2\delta_{\mu\nu}$. The second order is the Lagrangian we want.
It is a property of $L_G$ that gauge transformations do not alter it for any $g$, so


(6Le¢/dg,»);¥ = (6L¢/82,»),v + : we dL [88 p0 = 0 


identically for any $g_{\mu\nu}$. If into this equation we substitute $g_{\mu\nu} \to g_{\mu\nu} + H_{\mu\nu}$ and
carry to first order in $H_{\mu\nu}$, we find


((S7L/ gy, 88o)Har),y = ~42 (He + Hye.p — Hoe,v 


A 
_ 2f jet) 5882p (90) 
o p 
for any $H_{\mu\nu}$. In our case, here $\delta L/\delta g_{\mu\nu} = M^2\delta_{\mu\nu}$ so (90) becomes
((83L/ 887 8802)Hor):y = — Mg" Hyo,o- (91) 


Next the change in $L_{GM}$ produced by replacing $H_{\mu\nu}$ by $H_{\mu\nu} + \eta_{\mu;\nu} + \eta_{\nu;\mu}$
(which is what we wish to calculate) is


He Quy 87L/b8yy 8802 + Wu vNor1 OP L/8Euy S80r 


This, by parts, is —2n,(8°L/8g,, 585:(Ho1 + 76:2)).y Which by the analysis of the 
last paragraph is 27*M?(H,, + jo:,),¢ Which by parts proves our contention 
(71a). 
With this observation (71a), we can proceed to prove (67a) in a way exactly
analogous to the YM case. First, multiply the path integral for CLGM (with
$H'$ for $H$) by the path integral (dependent on $g_{\mu\nu}$)


X= fexp{i | Un, + 26R5) al? de bn (722) 
where by $(\xi_\mu)^2$ we mean $g^{\mu\nu}\xi_\mu\xi_\nu$. Define the linear operation $G$,
Gn, a 2ie:u),0 be M*n,, (74a) 
and substitute 7, by 
Nn = Ny — G3 (75a) 


so that the path integral now has for the exponent the integral of 
Lom(H') + [M4y, + 2Gja:u).0 = en 


We now replace $H'_{\sigma\mu}$ by $H_{\sigma\mu} + \eta_{\sigma;\mu} + \eta_{\mu;\sigma}$ and use our rule (71a) to find the
change in $L_{GM}(H')$ to reduce this to


Lou(A) 7 2M ?9'6( Hou + Toru) + [M?n, aad Fou,01g[M 20, = Ayo] 
which is (convert one term by parts) 
Lem) + gly alias ae M*(M*n"n, _ Qn otie:u): (see 76a) 


The fields $H_{\mu\nu}$ and $\eta_\mu$ are now separated. The first two terms combine together
to form the Lagrangian without the divergence, the one proper to CLGMD.
The last terms, redefining a new $\eta^{\mu}$ as $M$ times the old, give just the path integral
of the Lagrangian $(\eta^{\mu} G\eta_{\mu})\sqrt{g}\,d\tau$, or $(\mathrm{Det}\,G)^{-1/2}$, whereas the quantity $X$ of
(72a) was $\int(\eta^{\mu} GG\eta_{\mu})\sqrt{g}\,d\tau$ or $(\mathrm{Det}\,G^2)^{-1/2} = (\mathrm{Det}\,G)^{-1}$. Therefore, we must
subtract from the action of one closed loop with the divergence term left out
(CLGMD) the action of a vector particle with Lagrangian density


Lym = — 27% oiforu + M?y*n,. (61a) 


Since the limit of each term may be taken as $M^2 \to 0$ we can omit these $M^2$
terms, simply putting $M^2 = 0$, and thus have a rule to make calculations for one
closed loop with massless gravity.

This formulation of gravitation is not unique. One could have started with 
some other nonlinear function of $g_{\mu\nu}$ (like $g^{\mu\nu}$ or $\sqrt{g}\,g^{\mu\nu}$) and expanded that as
$\delta_{\mu\nu} + k_{\mu\nu}$ to begin the theory. We would presumably be led to other forms of
conditions on $g_{\mu\nu}$, such as $(\sqrt{g}\,g^{\mu\nu})_{,\nu} = 0$ instead of $\bar{g}_{\mu\nu,\nu} = 0$, other divergence
terms to define $L_{GD}$ or $L_{GMD}$ and other vector Lagrangians to subtract. Presumably
they would all give equivalent results. The method used here is very awkward 
with its mixture of Minkowski and covariant indices. I have not investigated 
other possibilities to see if a more obviously covariant formulation could be 
made. 
There is, of course, one possible covariant definition, the analogue of (66'),
(68'), but I do not believe the result is entirely satisfactory (see discussion in text).
What we do here is this. Instead of adding a mass term to the Lagrangian $L_G$
from the beginning, we suppose the external field from trees is analyzed in any
covariant way and that (in the absence of matter) $\mathcal{R}^{\mu\nu}(g) = 0$. Then the loop
field $H_{\mu\nu}$ is considered as an extra tensor meson in a curved space, but this time
a tensor field with mass, to form the Lagrangian


2 — ~ 
Loewe =e = Y | Vg drH”H,, 


by adding the obvious and covariant (in g) mass term. The technique explained 
above can be directly applied in an obvious way. For example, the change in
$L_{GM}(H')$ on replacing $H'$ by $H$ comes only from the mass terms, since (90) is
0 as $\delta L/\delta g_{\mu\nu} = 0$, and is (all indices covariant, raised by $g^{\mu\nu}$, etc.)


—2M*7""(H,, oF Nu:wVB- 


The rest of the analysis proceeds by adding 


| Vg g!"(Gn'),(Gn'), dr 


where now 


(Gy), = M*n, + Qiosu:8 
The result is that 


GMD! ea Lom +f | V2 dr( AY Ag: 18%)- 


(thus cancelling the first term of 63a) and the vector particle to be subtracted 
has the Lagrangian 


Pym = — | Vg dr[29! ou: 87 _ M?*n,). (see 68'a) 


This is evidently covariant, has a limit as $M^2 \to 0$, but I believe the limit may
not satisfactorily agree with the tree opening theorem. 

The analogy to YM is very close except the mass term (in 19a) is quite 
peculiar, and it does not affect the loop Lagrangian. This causes one to question 
whether it might not be easier in gravitation than in the YM theory to deal with 
closed loops of arbitrary complexity. I did not find the solution to the problem 
of how to deal with such loops in either theory. Formula (19a) was not known 
to me then as I found it only when preparing this manuscript for publication, 
for some of my older formulas contained errors that I had not noticed. 


Note added in proof: L. D. Faddeev and Bryce DeWitt have kindly prepared for
me the following list of references to related work. 


I. Yang-Mills and Gravitation 
1. B. S. DeWitt, Phys. Rev., 162, 1195, 1239 (1967). 
2. B. S. DeWitt, Phys. Rev. Letters, 12, 742 (1964). 
3. V. N. Popov and L. D. Faddeev, “Perturbation Theory for Gauge Invariant
Fields,” preprint I.T.Ph., USSR, Kiev (1967).
4. S. Mandelstam, Phys. Rev., 175, 1580, 1604 (1968).
5. S. Mandelstam, Ann. Phys. (New York), 19, 25 (1962).


Yang-Mills only
1. L. D. Faddeev and V. N. Popov, Phys. Letters, 25B, 29 (1967).
2. L. D. Faddeev, Theo. and Math. Phys., 1, 1 (1969).


Canonical approach (Hamiltonian)
1. P. A. M. Dirac, Can. J. Math., 2, 129 (1950).
2. P. A. M. Dirac, Proc. Roy. Soc. (London), A246, 326, 333 (1958).
3. P. A. M. Dirac, Phys. Rev., 114, 924 (1959).
4. R. Arnowitt, S. Deser and C. W. Misner, “The Dynamics of General
Relativity,” Gravitation, an Introduction to Current Research, L. Witten,
ed. (New York: Wiley, 1962).
5. B. S. DeWitt, Phys. Rev., 160, 1113 (1967).
6. J. Schwinger, Phys. Rev., 152, 1219 (1966); 158, 1391 (1967); 173, 1264
(1968).

1. I. B. Khriplovich, Yadernaya Fizika, 10, 409 (1968); English translation:
Sov. J. Nucl. Phys., 10, 235 (1970).
VII. Computer Theory


Feynman’s interest in numerical computation dated back to his wartime Los Alamos days, 
when Bethe put him in charge of a group doing calculations to model the plutonium implosion 
bomb. Feynman developed a system, using persons at mechanical calculators,
analogous to what would much later, with digital computers, be called “parallel computing.”
During the last ten years of his life he became fascinated by the theory and application of
computers and he gave a joint course in computation at Caltech, together with his colleagues
John Hopfield and Carver Mead, also using guest lecturers.¹ Feynman also published three
papers on computers. (Papers [110] and [115] are the same.)

The most important of the three papers is probably [106], in which the idea of a quantum 
computer is suggested and in which the limitations on computers imposed by the laws of 
physics are discussed. In some of his ideas, Feynman had been paralleled (and sometimes 
anticipated) independently by others, especially Paul Benioff and Rolf Landauer. Paper [110] 
discusses reversible computers (first considered by C.H. Bennett) and limitations that could
arise from the second law of thermodynamics (the increase of entropy with time). However, 
the paper concludes with the sentence “At any rate, it seems that the laws of physics present 
no barrier to reducing the size of computers until bits are the size of atoms, and quantum 
behavior holds dominant sway.” 

Paper [113] is a lecture delivered in Japan in 1985 as a memorial to Sin-Itiro Tomonaga,
who shared the 1965 Nobel Prize in Physics with Feynman and Julian Schwinger. It is
a “popular” presentation discussing parallel computation and the possibilities of reducing
energy consumption and size.²


Selected Papers 

[106] Simulating physics with computers. Int. J. Theor. Phys. 21 (1982): 467-488.

[110] Quantum mechanical computers. Opt. News 11 (1985): 11-46.

[113] The computing machines in the future. Nishina Memorial Lecture (1985), Nishina
Foundation and Gakushuin.

[115] Quantum mechanical computers. Foundations of Physics 16 (1986): 507-531.


¹See item [124] for the Feynman Lectures on Computation. See also Feynman and Computation, edited by
Anthony J.G. Hey (Reading, Massachusetts: 1999).

²On the last topic, see also paper [44], “There’s plenty of room at the bottom.” “[This] paper is often credited
with starting the field of nanotechnology.” Feynman and Computation, p. xii (note 1).
1. INTRODUCTION


On the program it says this is a keynote speech—and I don’t know 
what a keynote speech is. I do not intend in any way to suggest what should
be in this meeting as a keynote of the subjects or anything like that. I have 
my own things to say and to talk about and there’s no implication that 
anybody needs to talk about the same thing or anything like it. So what I 
want to talk about is what Mike Dertouzos suggested that nobody would 
talk about. I want to talk about the problem of simulating physics with 
computers and I mean that in a specific way which I am going to explain. 
The reason for doing this is something that I learned about from Ed 
Fredkin, and my entire interest in the subject has been inspired by him. It 
has to do with learning something about the possibilities of computers, and 
also something about possibilities in physics. If we suppose that we know all 
the physical laws perfectly, of course we don’t have to pay any attention to 
computers. It’s interesting anyway to entertain oneself with the idea that 
we’ve got something to learn about physical laws; and if I take a relaxed
view here (after all I’m here and not at home) I’ll admit that we don’t
understand everything. 

The first question is, What kind of computer are we going to use to
simulate physics? Computer theory has been developed to a point where it
realizes that it doesn’t make any difference; when you get to a universal
computer, it doesn’t matter how it’s manufactured, how it’s actually made.
Therefore my question is, Can physics be simulated by a universal computer?
I would like to have the elements of this computer locally interconnected,
and therefore sort of think about cellular automata as an example
(but I don’t want to force it). But I do want something involved with the
locality of interaction. I would not like to think of a very enormous
computer with arbitrary interconnections throughout the entire thing.

Now, what kind of physics are we going to imitate? First, I am going to 
describe the possibility of simulating physics in the classical approximation, 
a thing which is usually described by local differential equations. But the 
physical world is quantum mechanical, and therefore the proper problem is 
the simulation of quantum physics—which is what I really want to talk 
about, but I’ll come to that later. So what kind of simulation do I mean? 
There is, of course, a kind of approximate simulation in which you design 
numerical algorithms for differential equations, and then use the computer 
to compute these algorithms and get an approximate view of what physics 
ought to do. That’s an interesting subject, but is not what I want to talk 
about. I want to talk about the possibility that there is to be an exact 
simulation, that the computer will do exactly the same as nature. If this is to 
be proved and the type of computer is as I’ve already explained, then it’s 
going to be necessary that everything that happens in a finite volume of 
space and time would have to be exactly analyzable with a finite number of 
logical operations. The present theory of physics is not that way, apparently. 
It allows space to go down into infinitesimal distances, wavelengths to get 
infinitely great, terms to be summed in infinite order, and so forth; and 
therefore, if this proposition is right, physical law is wrong. 

So good, we already have a suggestion of how we might modify 
physical law, and that is the kind of reason why I like to study this sort of 
problem. To take an example, we might change the idea that space is 
continuous to the idea that space perhaps is a simple lattice and everything 
is discrete (so that we can put it into a finite number of digits) and that time 
jumps discontinuously. Now let’s see what kind of a physical world it would 
be or what kind of problem of computation we would have. For example, 
the first difficulty that would come out is that the speed of light would 
depend slightly on the direction, and there might be other anisotropies in 
the physics that we could detect experimentally. They might be very small 
anisotropies. Physical knowledge is of course always incomplete, and you
can always say we’ll try to design something which beats experiment at the
present time, but which predicts anisotropies on some scale to be found later.
That’s fine. That would be good physics if you could predict something
consistent with all the known facts and suggest some new fact that we didn’t
explain, but I have no specific examples. So I’m not objecting to the fact
that it’s anisotropic in principle, it’s a question of how anisotropic. If you tell
me it’s so-and-so anisotropic, I’ll tell you about the experiment with the
lithium atom which shows that the anisotropy is less than that much, and
that this here theory of yours is impossible.
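
The direction dependence mentioned above can be illustrated with a toy model. The sketch below is not taken from the lecture: it assumes a scalar wave on a square lattice of spacing `a` with nearest-neighbor coupling and continuous time, for which the textbook dispersion relation is ω² = (4c²/a²)[sin²(kₓa/2) + sin²(k_y a/2)]. The function name `phase_speed` and its parameters are hypothetical, chosen only for illustration.

```python
import math

def phase_speed(k, theta, a=1.0, c=1.0):
    """Phase speed omega/k for a wave of wavenumber k travelling at angle
    theta to the lattice axis (assumed square lattice, spacing a)."""
    kx = k * math.cos(theta)
    ky = k * math.sin(theta)
    # Assumed nearest-neighbor dispersion relation on a square lattice.
    omega = (2.0 * c / a) * math.sqrt(
        math.sin(kx * a / 2.0) ** 2 + math.sin(ky * a / 2.0) ** 2
    )
    return omega / k

# Short wavelengths (k comparable to 1/a) see the lattice directions.
along_axis = phase_speed(1.0, 0.0)
along_diagonal = phase_speed(1.0, math.pi / 4)

# Long wavelengths (k much smaller than 1/a) are nearly isotropic.
long_axis = phase_speed(0.01, 0.0)
long_diagonal = phase_speed(0.01, math.pi / 4)
```

In this model the anisotropy shrinks as the wavelength grows relative to the lattice spacing, which is the sense in which it becomes a question of "how anisotropic".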
Another thing that had been suggested early was that natural laws are
reversible, but that computer rules are not. But this turned out to be false; 
the computer rules can be reversible, and it has been a very, very useful 
thing to notice and to discover that. (Editors’ note: see papers by Bennett, 
Fredkin, and Toffoli, these Proceedings). This is a place where the relation- 
ship of physics and computation has turned itself the other way and told us 
something about the possibilities of computation. So this is an interesting 
subject because it tells us something about computer rules, and might tell us 
something about physics. 

The rule of simulation that I would like to have is that the number of 
computer elements required to simulate a large physical system is only to be 
proportional to the space-time volume of the physical system. I don’t want 
to have an explosion. That is, if you say I want to explain this much physics,
I can do it exactly and I need a certain-sized computer. If doubling the
volume of space and time means I’ll need an exponentially larger computer,
I consider that against the rules (I make up the rules, I’m allowed to do
that). Let’s start with a few interesting questions.


2. SIMULATING TIME 


First I’d like to talk about simulating time. We’re going to assume it’s 
discrete. You know that we don’t have infinite accuracy in physical measurements
so time might be discrete on a scale of less than $10^{-27}$ sec. (You’d
have to have it at least like this to avoid clashes with experiment, but
make it $10^{-41}$ sec. if you like, and then you’ve got us!)

One way in which we simulate time—in cellular automata, for example 
—is to say that ‘‘the computer goes from state to state.” But really, that’s 
using intuition that involves the idea of time—you’re going from state to 
state. And therefore the time (by the way, like the space in the case of 
cellular automata) is not simulated at all, it’s imitated in the computer. 

An interesting question comes up: “Is there a way of simulating it, 
rather than imitating it?” Well, there’s a way of looking at the world that is 
called the space-time view, imagining that the points of space and time are 
all laid out, so to speak, ahead of time. And then we could say that a 
“computer” rule (now computer would be in quotes, because it’s not the 
standard kind of computer which operates in time) is: We have a state $s_i$ at
each point $i$ in space-time. (See Figure 1.) The state $s_i$ at the space-time
point $i$ is a given function $F_i(s_j, s_k, \ldots)$ of the state at the points $j$, $k$ in some
neighborhood of $i$:

$s_i = F_i(s_j, s_k, \ldots)$

Fig. 1.


You’ll notice immediately that if this particular function is such that the
value of the function at $i$ only involves the few points behind in time, earlier
than this time $t$, all I’ve done is to redescribe the cellular automaton,
because it means that you calculate a given point from points at earlier
times, and I can compute the next one and so on, and I can go through this
in that particular order. But let’s just think of a more general kind of
computer, because we might have a more general function. So let’s think
about whether we could have a wider case of generality of interconnections
of points in space-time. If F depends on all the points both in the future and
the past, what then? That could be the way physics works. I’ll mention how
our theories go at the moment. It has turned out in many physical theories 
that the mathematical equations are quite a bit simplified by imagining such 
a thing—by imagining positrons as electrons going backwards in time, and 
other things that connect objects forward and backward. The important 
question would be, if this computer were laid out, is there in fact an 
organized algorithm by which a solution could be laid out, that is, com- 
puted? Suppose you know this function F, and it is a function of the 
variables in the future as well. How would you lay out numbers so that they 
automatically satisfy the above equation? It may not be possible. In the case 
of the cellular automaton it is, because from a given row you get the next 
row and then the next row, and there’s an organized way of doing it. It’s an 
interesting question whether there are circumstances where you get func- 
tions for which you can’t think, at least right away, of an organized way of 
laying it out. Maybe sort of shake it down from some approximation, or 
something, but it’s an interesting different type of computation. 
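
The special case described above, where the function involves only points earlier in time, can be sketched as follows. This is an illustration under stated assumptions, not the lecture's construction: a one-dimensional ring of binary cells, a hypothetical `rule` (XOR of the two neighbors in the previous row), and hypothetical helper names. It shows both views: filling the space-time table row by row, and checking the finished table as a set of constraints satisfied at every point.

```python
def rule(left, center, right):
    """Assumed local rule: the new state is the XOR of the two neighbors."""
    return left ^ right

def fill_spacetime(first_row, steps):
    """Organized algorithm: compute each row from the row before it."""
    rows = [list(first_row)]
    n = len(first_row)
    for _ in range(steps):
        prev = rows[-1]
        rows.append(
            [rule(prev[(x - 1) % n], prev[x], prev[(x + 1) % n]) for x in range(n)]
        )
    return rows

def satisfies_rule(rows):
    """Space-time view: check the constraint at every point of the table."""
    n = len(rows[0])
    return all(
        rows[t][x] == rule(rows[t - 1][(x - 1) % n], rows[t - 1][x], rows[t - 1][(x + 1) % n])
        for t in range(1, len(rows))
        for x in range(n)
    )

table = fill_spacetime([0, 0, 0, 1, 0, 0, 0, 0], steps=3)
consistent = satisfies_rule(table)

# Tampering with one cell breaks the constraint somewhere in the table.
tampered = [row[:] for row in table]
tampered[2][0] ^= 1
still_consistent = satisfies_rule(tampered)
```

When the rule also involved later rows, `satisfies_rule` could still be written down, but `fill_spacetime` would have no obvious row-by-row order to follow, which is the open question raised in the text.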

Question: “Doesn’t this reduce to the ordinary boundary value, as 
opposed to initial-value type of calculation?” 

Answer: “Yes, but remember this is the computer itself that I’m 
describing.” 

It appears actually that classical physics is causal. You can, in terms of 
the information in the past, if you include both momentum and position, or 
the position at two different times in the past (either way, you need two
pieces of information at each point) calculate the future in principle. So
classical physics is local, causal, and reversible, and therefore apparently
quite adaptable (except for the discreteness and so on, which I already
mentioned) to computer simulation. We have no difficulty, in principle,
apparently, with that.
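
A minimal sketch of the "position at two different times" idea, under assumptions that are not in the lecture: a single coordinate stored as an integer, and a second-order update in which the next position is a function of the current one minus the previous one. The function `force_term` is a hypothetical integer stand-in for an oscillator-like law. Because the previous position enters only by subtraction, the update can be undone exactly, so the rule is both causal and reversible.

```python
def force_term(x):
    """Assumed integer law of motion (roughly an oscillator): 2x - x/8."""
    return 2 * x - x // 8

def run_forward(prev, cur, steps):
    """Advance using only the positions at the two most recent times."""
    for _ in range(steps):
        prev, cur = cur, force_term(cur) - prev
    return prev, cur

def run_backward(prev, cur, steps):
    """Undo run_forward exactly: recover the earlier pair of positions."""
    for _ in range(steps):
        prev, cur = force_term(prev) - cur, prev
    return prev, cur

start = (1000, 1000)
one_step = run_forward(*start, 1)
later = run_forward(*start, 500)
recovered = run_backward(*later, 500)
```

Integer arithmetic is used so that the reversal is exact rather than approximate; with floating-point positions rounding would make it only approximately reversible.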


3. SIMULATING PROBABILITY 


Turning to quantum mechanics, we know immediately that here we get 
only the ability, apparently, to predict probabilities. Might I say im- 
mediately, so that you know where I really intend to go, that we always have 
had (secret, secret, close the doors!) we always have had a great deal of 
difficulty in understanding the world view that quantum mechanics repre- 
sents. At least I do, because I’m an old enough man that I haven’t got to the 
point that this stuff is obvious to me. Okay, I still get nervous with it. And 
therefore, some of the younger students ... you know how it always is, 
every new idea, it takes a generation or two until it becomes obvious that 
there’s no real problem. It has not yet become obvious to me that there’s no 
real problem. I cannot define the real problem, therefore I suspect there’s no 
real problem, but I’m not sure there’s no real problem. So that’s why I like
to investigate things. Can I learn anything from asking this question about 
computers—about this may or may not be mystery as to what the world 
view of quantum mechanics is? So I know that quantum mechanics seem to 
involve probability—and I therefore want to talk about simulating proba- 
bility. 

Well, one way that we could have a computer that simulates a prob- 
abilistic theory, something that has a probability in it, would be to calculate 
the probability and then interpret this number to represent nature. For 
example, let’s suppose that a particle has a probability $P(x, t)$ to be at $x$ at a
time $t$. A typical example of such a probability might satisfy a differential
equation, as, for example, if the particle is diffusing: 


Now we could discretize ¢ and x and perhaps even the probability itself and 
solve this differential equation like we solve any old field equation, and 
make an algorithm for it, making it exact by discretization. First there’d be 
a problem about discretizing probability. If you are only going to take k 
digits it would mean that when the probability is less than $2^{-k}$ of something
happening, you say it doesn’t happen at all. In practice we do that. If the 
probability of something is $10^{-700}$, we say it isn’t going to happen, and
we're not caught out very often. So we could allow ourselves to do that. But 
the real difficulty is this: If we had many particles, we have R particles, for 
example, in a system, then we would have to describe the probability of a 
circumstance by giving the probability to find these particles at points 
$x_1, x_2, \ldots, x_R$ at the time t. That would be a description of the probability of
the system. And therefore, you’d need a k-digit number for every configura- 
tion of the system, for every arrangement of the R values of x. And 
therefore if there are N points in space, we’d need $N^R$ configurations.
Actually, from our point of view that at each point in space there is 
information like electric fields and so on, R will be of the same order as N if 
the number of information bits is the same as the number of points in space, 
and therefore you’d have to have something like $N^N$ configurations to be
described to get the probability out, and that’s too big for our computer to
hold if the size of the computer is of order N.
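
A minimal sketch of the single-particle discretization described above, assuming the standard one-dimensional diffusion equation $\partial P/\partial t = D\,\partial^2 P/\partial x^2$ on a ring of sites. The combination `rate` (standing for $D\,\Delta t/\Delta x^2$), the grid size, and the function names are illustrative assumptions, not part of the lecture. The last step rounds every probability down to k binary digits, so anything below $2^{-k}$ is treated as not happening at all.

```python
def diffusion_step(p, rate=0.25):
    """One explicit time step of a discretized diffusion equation on a ring.

    p    -- list of site probabilities (sums to 1)
    rate -- stands for D*dt/dx**2 (assumed value; keep it <= 0.5)
    """
    n = len(p)
    return [
        p[i] + rate * (p[(i - 1) % n] - 2.0 * p[i] + p[(i + 1) % n])
        for i in range(n)
    ]


def truncate(p, k):
    """Keep only k binary digits: values below 2**-k become exactly zero."""
    scale = 2 ** k
    return [int(x * scale) / scale for x in p]


# Start with the particle certainly at site 0 of a 16-site ring.
prob = [1.0] + [0.0] * 15
for _ in range(5):
    prob = diffusion_step(prob)

total = sum(prob)            # the update conserves total probability
coarse = truncate(prob, 8)   # 8-digit version of the same distribution
```

After five steps the probability at site 5 is 0.25**5, which is below 2**-8, so the 8-digit description records it as zero.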

We emphasize, if a description of an isolated part of nature with N
variables requires a general function of N variables and if a computer
simulates this by actually computing or storing this function then doubling
the size of nature ($N \to 2N$) would require an exponentially explosive
growth in the size of the simulating computer. It is therefore impossible, 
according to the rules stated, to simulate by calculating the probability. 
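
To make the growth concrete, here is a small sketch that counts the table entries needed to store one number per configuration. The function names are illustrative only.

```python
def configurations(n_points, r_particles):
    """Number of arrangements of r particles over n points: n**r."""
    return n_points ** r_particles


def entries_when_r_equals_n(n):
    """The case where R is of the same order as N: N**N entries."""
    return configurations(n, n)


# Each doubling of N multiplies the table size by an ever larger factor.
for n in (2, 4, 8, 16, 32):
    print(n, entries_when_r_equals_n(n))
```

Going from N = 4 to N = 8 already multiplies the number of entries by 65536, while a computer of size of order N has only doubled.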

Is there any other way? What kind of simulation can we have? We can’t 
expect to compute the probability of configurations for a probabilistic 
theory. But the other way to simulate a probabilistic nature, which I'll call
$\mathcal{N}$ for the moment, might still be to simulate the probabilistic nature by a
computer C which itself is probabilistic, in which you always randomize the
last two digits of every number, or you do something terrible to it. So it
becomes what I’ll call a probabilistic computer, in which the output is not a
unique function of the input. And then you try to work it out so that it
simulates nature in this sense: that C goes from some state (initial state if
you like) to some final state with the same probability that $\mathcal{N}$ goes from
the corresponding initial state to the corresponding final state. Of course 
when you set up the machine and let nature do it, the imitator will not do 
the same thing, it only does it with the same probability. Is that no good? 
No it’s O.K. How do you know what the probability is? You see, nature’s 
unpredictable; how do you expect to predict it with a computer? You can’t;
it’s unpredictable if it’s probabilistic. But what you really do in a
probabilistic system is repeat the experiment in nature a large number of 
times. If you repeat the same experiment in the computer a large number of 
times (and that doesn’t take any more time than it does to do the same thing 
in nature of course), it will give the frequency of a given final state 
proportional to the number of times, with approximately the same rate (plus 
or minus the square root of n and all that) as it happens in nature. In other
words, we could imagine and be perfectly happy, I think, with a probabilis- 
tic simulator of a probabilistic nature, in which the machine doesn’t exactly 
do what nature does, but if you repeated a particular type of experiment a 
sufficient number of times to determine nature’s probability, then you did 
the corresponding experiment on the computer, you’d get the corresponding 
probability with the corresponding accuracy (with the same kind of accu- 
racy of statistics). 
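
The sense of "simulation" used here can be sketched with two independent random sources, one standing in for nature and one for the imitating computer. The probability 0.3, the seeds, and the names are assumptions made for illustration. The two runs do not agree trial by trial, but their counts agree to within the usual statistical scatter.

```python
import random


def run_trials(p_final, n_trials, rng):
    """Count how often a two-outcome experiment ends in the chosen final state."""
    return sum(1 for _ in range(n_trials) if rng.random() < p_final)


P_FINAL = 0.3        # assumed probability of reaching the final state
N_TRIALS = 10_000

nature = random.Random(1)      # stands in for the real experiment
imitator = random.Random(2)    # the probabilistic computer, seeded differently

hits_nature = run_trials(P_FINAL, N_TRIALS, nature)
hits_computer = run_trials(P_FINAL, N_TRIALS, imitator)

# Expected scatter in either count is about sqrt(n * p * (1 - p)).
sigma = (N_TRIALS * P_FINAL * (1 - P_FINAL)) ** 0.5
```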

So let us now think about the characteristics of a local probabilistic 
computer, because I’ll see if I can imitate nature with that (by “nature” I’m
now going to mean quantum mechanics). One of the characteristics is that
you can determine how it behaves in a local region by simply disregarding
what it’s doing in all other regions. For example, suppose there are variables
in the system that describe the whole world ($x_A$, $x_B$): the variables $x_A$
you’re interested in, they’re “around here”; $x_B$ are the whole rest of the
world. If you want to know the probability that something around here is
happening, you would have to get that by integrating the total probability of
all kinds of possibilities over $x_B$. If we had computed this probability, we
would still have to do the integration 


$$P_A(x_A) = \int P(x_A, x_B)\,dx_B$$


which is a hard job! But if we have imitated the probability, it’s very simple 
to do it: you don’t have to do anything to do the integration, you simply 
disregard what the values of $x_B$ are, you just look at the region $x_A$. And
therefore it does have the characteristic of nature: if it’s local, you can find 
out what’s happening in a region not by integrating or doing an extra 
operation, but merely by disregarding what happens elsewhere, which is no 
operation, nothing at all. 
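
The contrast between integrating a stored probability and simply not looking can be sketched on a toy joint distribution. The table `JOINT` and its numbers are arbitrary assumptions chosen for illustration.

```python
import random
from collections import Counter

# Assumed toy joint distribution over (x_A, x_B); the numbers are arbitrary.
JOINT = {
    ("a0", "b0"): 0.10, ("a0", "b1"): 0.25, ("a0", "b2"): 0.05,
    ("a1", "b0"): 0.30, ("a1", "b1"): 0.10, ("a1", "b2"): 0.20,
}


def marginal_by_summation(joint):
    """The 'hard job': explicitly sum P(x_A, x_B) over every x_B."""
    out = Counter()
    for (xa, _xb), p in joint.items():
        out[xa] += p
    return dict(out)


def marginal_by_ignoring(joint, n_samples, rng):
    """The imitator's way: sample whole worlds and never look at x_B."""
    states = list(joint)
    weights = [joint[s] for s in states]
    seen = Counter(xa for xa, _xb in rng.choices(states, weights, k=n_samples))
    return {xa: count / n_samples for xa, count in seen.items()}


exact = marginal_by_summation(JOINT)
estimate = marginal_by_ignoring(JOINT, 50_000, random.Random(0))
```

In the sampled version no extra operation is performed for the marginal: the values of x_B are generated but never read.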

The other aspect that I want to emphasize is that the equations will 
have a form, no doubt, something like the following. Let each point 
$i = 1, 2, \ldots, N$ in space be in a state $s_i$ chosen from a small state set (the size
of this set should be reasonable, say, up to $2^5$). And let the probability to
find some configuration $\{s_i\}$ (a set of values of the state $s_i$ at each point $i$)
be some number $P(\{s_i\})$. It satisfies an equation such that at each jump in
time 


$$P_{t+1}(\{s\}) = \sum_{\{s'\}} \Big[ \prod_i m(s_i \mid s'_j, s'_k, \ldots) \Big] P_t(\{s'\})$$


where $m(s_i \mid s'_j, s'_k, \ldots)$ is the probability that we move to state $s_i$ at point $i$
when the neighbors have values $s'_j, s'_k, \ldots$, where j, k, etc. are points in the
neighborhood of i. As j moves far from i, m becomes ever less sensitive to
$s'_j$. At each change the state at a particular point i will move from what it
was to a state s with a probability m that depends only upon the states of
the neighborhood (which may be so defined as to include the point i itself).
This gives the probability of making a transition. It’s the same as in a
cellular automaton; only, instead of its being definite, it’s a probability. Tell
me the environment, and I’ll tell you the probability after a next moment of
time that this point is at state s. And that’s the way it’s going to work, okay?
So you get a mathematical equation of this kind of form.
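
A probabilistic cellular automaton of this kind can be sketched in a few lines. The particular rule `transition_probability`, the two-state sites, and the ring of 20 points are hypothetical choices made only to have something runnable; the lecture does not specify any rule.

```python
import random


def transition_probability(left, centre, right):
    """Hypothetical local rule m: chance that a site is in state 1 next step.

    It depends only on the site itself and its two nearest neighbours.
    """
    return 0.1 + 0.8 * (left + centre + right) / 3.0


def step(states, rng):
    """Update every site at once, each from the old neighbourhood values."""
    n = len(states)
    return [
        1 if rng.random() < transition_probability(
            states[(i - 1) % n], states[i], states[(i + 1) % n]
        ) else 0
        for i in range(n)
    ]


rng = random.Random(42)
lattice = [0] * 20
history = [lattice]
for _ in range(10):
    lattice = step(lattice, rng)
    history.append(lattice)
```

The machine stores only one configuration of N sites at a time, never the full table of probabilities over all configurations.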

Now I explicitly go to the question of how we can simulate with a 
computer—a universal automaton or something—the quantum-mechanical 
effects. (The usual formulation is that quantum mechanics has some sort of 
a differential equation for a function $\psi$.) If you have a single particle, $\psi$ is a
function of x and t, and this differential equation could be simulated just
like my probabilistic equation was before. That would be all right and one
has seen people make little computers which simulate the Schroedinger
equation for a single particle. But the full description of quantum mechanics
for a large system with R particles is given by a function $\psi(x_1, x_2, \ldots, x_R, t)$
which we call the amplitude to find the particles at $x_1, \ldots, x_R$, and therefore,
because it has too many variables, it cannot be simulated with a normal 
computer with a number of elements proportional to R or proportional to 
N. We had the same troubles with the probability in classical physics. And 
therefore, the problem is, how can we simulate the quantum mechanics? 
There are two ways that we can go about it. We can give up on our rule 
about what the computer was, we can say: Let the computer itself be built 
of quantum mechanical elements which obey quantum mechanical laws. Or 
we can turn the other way and say: Let the computer still be the same kind 
that we thought of before—a logical, universal automaton; can we imitate 
this situation? And I’m going to separate my talk here, for it branches into 
two parts. 


4. QUANTUM COMPUTERS— UNIVERSAL QUANTUM 
SIMULATORS 


The first branch, one you might call a side-remark, is, Can you do it 
with a new kind of computer—a quantum computer? (I'll come back to the 
other branch in a moment.) Now it turns out, as far as I can tell, that you 
can simulate this with a quantum system, with quantum computer elements. 
It’s not a Turing machine, but a machine of a different kind. If we disregard 
the continuity of space and make it discrete, and so on, as an approximation 
(the same way as we allowed ourselves in the classical case), it does seem to 
be true that all the various field theories have the same kind of behavior,
and can be simulated in every way, apparently, with little latticeworks of 
spins and other things. It’s been noted time and time again that the 
phenomena of field theory (if the world is made in a discrete lattice) are well 
imitated by many phenomena in solid state theory (which is simply the 
analysis of a latticework of crystal atoms, and in the case of the kind of 
solid state I mean each atom is just a point which has numbers associated 
with it, with quantum-mechanical rules). For example, the spin waves in a 
spin lattice imitating Bose-particles in the field theory. I therefore believe 
it’s true that with a suitable class of quantum machines you could imitate 
any quantum system, including the physical world. But I don’t know 
whether the general theory of this intersimulation of quantum systems has 
ever been worked out, and so I present that as another interesting problem: 
to work out the classes of different kinds of quantum mechanical systems 
which are really intersimulatable— which are equivalent—as has been done 
in the case of classical computers. It has been found that there is a kind of 
universal computer that can do anything, and it doesn’t make much 
difference specifically how it’s designed. The same way we should try to find 
out what kinds of quantum mechanical systems are mutually intersimulata- 
ble, and try to find a specific class, or a character of that class which will 
simulate everything. What, in other words, is the universal quantum simula- 
tor? (assuming this discretization of space and time). If you had discrete 
quantum systems, what other discrete quantum systems are exact imitators 
of it, and is there a class against which everything can be matched? I believe 
it’s rather simple to answer that question and to find the class, but I just 
haven’t done it. 

Suppose that we try the following guess: that every finite quantum 
mechanical system can be described exactly, imitated exactly, by supposing 
that we have another system such that at each point in space-time this 
system has only two possible base states. Either that point is occupied, or 
unoccupied—those are the two states. The mathematics of the quantum 
mechanical operators associated with that point would be very simple. 


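(Rows and columns are labeled in the order OCC, UN.)

$$
\begin{aligned}
a &= \text{ANNIHILATE} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \tfrac{1}{2}(\sigma_x - i\sigma_y) \\
a^* &= \text{CREATE} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \tfrac{1}{2}(\sigma_x + i\sigma_y) \\
n &= \text{NUMBER} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = a^* a = \tfrac{1}{2}(1 + \sigma_z) \\
1 &= \text{IDENTITY} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\end{aligned}
$$
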
There would be an operator a which annihilates if the point is occupied
—it changes it to unoccupied. There is a conjugate operator a* which does 
the opposite: if it’s unoccupied, it occupies it. There’s another operator n 
called the number to ask, Is something there? The little matrices tell you 
what they do. If it’s there, n gets a one and leaves it alone, if it’s not there, 
nothing happens. That’s mathematically equivalent to the product of the 
other two, as a matter of fact. And then there’s the identity, 1, which we 
always have to put in there to complete our mathematics—it doesn’t do a 
damn thing! 

By the way, on the right-hand side of the above formulas the same 
operators are written in terms of matrices that most physicists find more 
convenient, because they are Hermitian, and that seems to make it easier for 
them. They have invented another set of matrices, the Pauli o matrices: 


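$$
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
$$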


And these are called spin (spin one-half), so sometimes people say you’re
talking about a spin-one-half lattice. 
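
The relations between the occupation operators and the Pauli matrices can be checked directly. This is a small sketch in plain Python; the helper names `matmul` and `combine` are introduced here, and the basis order (occupied, unoccupied) with the usual Pauli convention is assumed.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


# Basis order (occupied, unoccupied); Pauli matrices in the usual convention.
IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

ANNIHILATE = combine(0.5, SIGMA_X, -0.5j, SIGMA_Y)   # a  = (sx - i sy)/2
CREATE = combine(0.5, SIGMA_X, 0.5j, SIGMA_Y)        # a* = (sx + i sy)/2
NUMBER = matmul(CREATE, ANNIHILATE)                  # n  = a* a
```

With these definitions `NUMBER` comes out equal to `combine(0.5, IDENTITY, 0.5, SIGMA_Z)`, the number operator written as half of one plus sigma-z.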

The question is, if we wrote a Hamiltonian which involved only these 
operators, locally coupled to corresponding operators on the other space-time 
points, could we imitate every quantum mechanical system which is discrete 
and has a finite number of degrees of freedom? I know, almost certainly, 
that we could do that for any quantum mechanical system which involves 
Bose particles. I’m not sure whether Fermi particles could be described by 
such a system. So I leave that open. Well, that’s an example of what I meant 
by a general quantum mechanical simulator. I’m not sure that it’s sufficient, 
because I’m not sure that it takes care of Fermi particles. 


5. CAN QUANTUM SYSTEMS BE PROBABILISTICALLY 
SIMULATED BY A CLASSICAL COMPUTER? 


Now the next question that I would like to bring up is, of course, the 
interesting one, i.e., Can a quantum system be probabilistically simulated by 
a classical (probabilistic, I’d assume) universal computer? In other words, a 
computer which will give the same probabilities as the quantum system 
does. If you take the computer to be the classical kind I’ve described so far, 
(not the quantum kind described in the last section) and there’re no changes 
in any laws, and there’s no hocus-pocus, the answer is certainly, No! This is 
called the hidden-variable problem: it is impossible to represent the results 
of quantum mechanics with a classical universal device. To learn a little bit 
about it, I say let us try to put the quantum equations in a form as close as 
possible to classical equations so that we can see what the difficulty is and
what happens. Well, first of all we can’t simulate $\psi$ in the normal way. As
I’ve explained already, there’re too many variables. Our only hope is that 
we’re going to simulate probabilities, that we’re going to have our computer 
do things with the same probability as we observe in nature, as calculated 
by the quantum mechanical system. Can you make a cellular automaton, or 
something, imitate with the same probability what nature does, where I’m 
going to suppose that quantum mechanics is correct, or at least after I 
discretize space and time it’s correct, and see if I can do it. I must point out 
that you must directly generate the probabilities, the results, with the correct 
quantum probability. Directly, because we have no way to store all the 
numbers, we have to just imitate the phenomenon directly. 

It turns out then that another thing, rather than the wave function, a 
thing called the density matrix, is much more useful for this. It’s not so 
useful as far as the mathematical equations are concerned, since it’s more 
complicated than the equations for $\psi$, but I’m not going to worry about
mathematical complications, or which is the easiest way to calculate, be- 
cause with computers we don’t have to be so careful to do it the very easiest 
way. And so with a slight increase in the complexity of the equations (and 
not very much increase) I turn to the density matrix, which for a single 
particle of coordinate x in a pure state of wave function $\psi(x)$ is


$$\rho(x, x') = \psi^*(x)\,\psi(x')$$


This has a special property that it is a function of two coordinates x, x'. The
presence of two quantities x and x' associated with each coordinate is
analogous to the fact that in classical mechanics you have to have two
variables to describe the state, x and $\dot{x}$. States are described by a second-order
device, with two informations (“‘position” and “velocity”). So we have to 
have two pieces of information associated with a particle, analogous to the 
classical situation, in order to describe configurations. (I’ve written the 
density matrix for one particle, but of course there’s the analogous thing for 
R particles, a function of 2R variables). 

This quantity has many of the mathematical properties of a probability. 
For example if a state $\psi(x)$ is not certain but is $\psi_\alpha$ with the probability $p_\alpha$
then the density matrix is the appropriate weighted sum of the matrix for
each state $\alpha$:


$$\rho(x, x') = \sum_\alpha p_\alpha\,\psi_\alpha^*(x)\,\psi_\alpha(x')$$


A quantity which has properties even more similar to classical probabilities 
is the Wigner function, a simple reexpression of the density matrix; for a 
single particle


$$W(x, p) = \int \rho\!\left(x + \tfrac{y}{2},\, x - \tfrac{y}{2}\right) e^{ipy}\,dy$$


We shall be emphasizing their similarity and shall call it “probability” in 
quotes instead of Wigner function. Watch these quotes carefully, when they 
are absent we mean the real probability. If “probability” had all the 
mathematical properties of a probability we could remove the quotes and 
simulate it. W(x, p) is the “probability” that the particle has position x and 
momentum p (per dx and dp). What properties does it have that are 
analogous to an ordinary probability? 

It has the property that if there are many variables and you want to 
know the “probabilities” associated with a finite region, you simply disre- 
gard the other variables (by integration). Furthermore the probability of 
finding a particle at x is $\int W(x, p)\,dp$. If you can interpret W as a
probability of finding x and p, this would be an expected equation. Likewise
the probability of p would be expected to be $\int W(x, p)\,dx$. These two
equations are correct, and therefore you would hope that maybe W(x, p) is 
the probability of finding x and p. And the question then is can we make a 
device which simulates this W? Because then it would work fine. 
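
As a short check of the first marginal, assume the standard normalization of the Wigner function, $W(x,p) = \frac{1}{2\pi}\int \rho(x + y/2,\, x - y/2)\, e^{ipy}\, dy$ in units with $\hbar = 1$. The prefactor and the choice of units are assumptions made here, not taken from the text. Integrating over p then gives the position probability of a pure state:

```latex
\int W(x,p)\,dp
  = \frac{1}{2\pi}\int dy\,\rho\!\left(x+\tfrac{y}{2},\,x-\tfrac{y}{2}\right)\int dp\,e^{ipy}
  = \int dy\,\rho\!\left(x+\tfrac{y}{2},\,x-\tfrac{y}{2}\right)\delta(y)
  = \rho(x,x) = |\psi(x)|^2 .
```

The momentum marginal follows from the analogous calculation in the momentum representation.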

Since the quantum systems I noted were best represented by spin 
one-half (occupied versus unoccupied or spin one-half is the same thing), I 
tried to do the same thing for spin one-half objects, and it’s rather easy to 
do. Although before one object only had two states, occupied and unoc- 
cupied, the full description—in order to develop things as a function of time 
—requires twice as many variables, which mean two slots at each point 
which are occupied or unoccupied (denoted by + and — in what follows), 
analogous to the x and $\dot{x}$, or the x and p. So you can find four numbers,
four “probabilities” $\{f_{++}, f_{+-}, f_{-+}, f_{--}\}$ which act just like, and I have
to explain why they’re not exactly like, but they act just like, probabilities to
find things in the state in which both symbols are up, one’s up and one’s
down, and so on. For example, the sum $f_{++} + f_{+-} + f_{-+} + f_{--}$ of the
four “probabilities” is 1. You’ll remember that one object now is going to 
have two indices, two plus/minus indices, or two ones and zeros at each 
point, although the quantum system had only one. For example, if you 
would like to know whether the first index is positive, the probability of that 
would be 

$$\mathrm{Prob}(\text{first index is } +) = f_{++} + f_{+-} \qquad [\text{spin } z \text{ up}]$$


i.e., you don’t care about the second index. The probability that the first 
index is negative is 


$$\mathrm{Prob}(\text{first index is } -) = f_{-+} + f_{--} \qquad [\text{spin } z \text{ down}]$$


These two formulas are exactly correct in quantum mechanics. You see I’m
hedging on whether or not “probability” f can really be a probability
without quotes. But when I write probability without quotes on the left-hand
side I’m not hedging; that really is the quantum mechanical probability. It’s 
interpreted perfectly fine here. Likewise the probability that the second 
index is positive can be obtained by finding 


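$$\mathrm{Prob}(\text{second index is } +) = f_{++} + f_{-+} \qquad [\text{spin } x \text{ up}]$$
and likewise
$$\mathrm{Prob}(\text{second index is } -) = f_{+-} + f_{--} \qquad [\text{spin } x \text{ down}]$$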


You could also ask other questions about the system. You might like to 
know, What is the probability that both indices are positive? You'll get in 
trouble. But you could ask other questions that you won’t get in trouble 
with, and that get correct physical answers. You can ask, for example, what 
is the probability that the two indices are the same? That would be 


$$\mathrm{Prob}(\text{match}) = f_{++} + f_{--} \qquad [\text{spin } y \text{ up}]$$


Or the probability that there’s no match between the indices, that they’re 
different, 


$$\mathrm{Prob}(\text{no match}) = f_{+-} + f_{-+} \qquad [\text{spin } y \text{ down}]$$


All perfectly all right. All these probabilities are correct and make sense, 
and have a precise meaning in the spin model, shown in the square brackets 
above. There are other “probability” combinations, other linear combina- 
tions of these f’s which also make physically sensible probabilities, but I 
won’t go into those now. There are other linear combinations that you can 
ask questions about, but you don’t seem to be able to ask questions about 
an individual f. 


6. NEGATIVE PROBABILITIES 


Now, for many interacting spins on a lattice we can give a “probability” 
(the quotes remind us that there is still a question about whether it’s a 
probability) for correlated possibilities: 


$$F(s_1, s_2, \ldots, s_N) \qquad (s_i = {+}{+},\ {+}{-},\ {-}{+},\ \text{or}\ {-}{-})$$


Next, if I look for the quantum mechanical equation which tells me what the
changes of F are with time, they are exactly of the form that I wrote above 
for the classical theory: 


$$F_{t+1}(\{s\}) = \sum_{\{s'\}} \Big[ \prod_i M(s_i \mid s'_j, s'_k, \ldots) \Big] F_t(\{s'\})$$


but now we have F instead of P. The $M(s_i \mid s'_j, s'_k, \ldots)$ would appear to be
interpreted as the “probability” per unit time, or per time jump, that the
state at i turns into $s_i$ when the neighbors are in configuration s’. If you can
invent a probability M like that, you write the equations for it according to 
normal logic, those are the correct equations, the real, correct, quantum 
mechanical equations for this F, and therefore you’d say, Okay, so I can 
imitate it with a probabilistic computer! 

There’s only one thing wrong. These equations unfortunately cannot be 
so interpreted on the basis of the so-called “probability”, or this probabilis- 
tic computer can’t simulate them, because the F is not necessarily positive. 
Sometimes it’s negative! The M, the “probability” (so-called) of moving 
from one condition to another is itself not positive; if I had gone all the way 
back to the f for a single object, it again is not necessarily positive. 

An example of possibilities here are

$$f_{++} = 0.6, \qquad f_{+-} = -0.1, \qquad f_{-+} = 0.3, \qquad f_{--} = 0.2$$


The sum $f_{++} + f_{+-}$ is 0.5, that’s 50% chance of finding the first index
positive. The probability of finding the first index negative is the sum
$f_{-+} + f_{--}$ which is also 50%. The probability of finding the second index
positive is the sum $f_{++} + f_{-+}$ which is nine tenths, the probability of
finding it negative is $f_{+-} + f_{--}$ which is one-tenth, perfectly alright, it’s
either plus or minus. The probability that they match is eight-tenths, the
probability that they mismatch is plus two-tenths; every physical probability
comes out positive. But the original f’s are not positive, and therein lies
the great difficulty. The only difference between a probabilistic classical 
world and the equations of the quantum world is that somehow or other it 
appears as if the probabilities would have to go negative, and that we do not 
know, as far as I know, how to simulate. Okay, that’s the fundamental 
problem. I don’t know the answer to it, but I wanted to explain that if I try 
my best to make the equations look as near as possible to what would be 
imitable by a classical probabilistic computer, I get into trouble. 
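
The sums quoted above fix the four f's uniquely, which can be verified with a short sketch. The function name and argument names are introduced here; the three input numbers (0.5, 0.9, 0.8) are the ones stated in the text for "first index positive", "second index positive", and "indices match".

```python
def quasi_probabilities(p_z_up, p_x_up, p_y_up):
    """Solve the four linear conditions for f++, f+-, f-+, f--.

    Uses f++ + f+- = p_z_up, f++ + f-+ = p_x_up, f++ + f-- = p_y_up
    and the requirement that all four f's sum to 1, which together
    give f++ = (p_z_up + p_x_up + p_y_up - 1) / 2.
    """
    f_pp = (p_z_up + p_x_up + p_y_up - 1.0) / 2.0
    return {
        "++": f_pp,
        "+-": p_z_up - f_pp,
        "-+": p_x_up - f_pp,
        "--": p_y_up - f_pp,
    }


f = quasi_probabilities(0.5, 0.9, 0.8)

total = sum(f.values())
negative = [key for key, value in f.items() if value < 0]
```

Every measurable combination stays between 0 and 1, yet the individual entry `f["+-"]` is forced to be negative, so no sampling procedure can draw from the f's directly.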


7. POLARIZATION OF PHOTONS—TWO-STATES SYSTEMS


I would like to show you why such minus signs cannot be avoided, or 
at least that you have some sort of difficulty. You probably have all heard 
this example of the Einstein-Podolsky-Rosen paradox, but I will explain this 
little example of a physical experiment which can be done, and which has 
been done, which does give the answers quantum theory predicts, and the 
answers are really right, there’s no mistake, if you do the experiment, it 
actually comes out. And I’m going to use the example of polarizations of 
photons, which is an example of a two-state system. When a photon comes, 
you can say it’s either x polarized or y polarized. You can find that out by
putting in a piece of calcite, and the photon goes through the calcite either 
out in one direction, or out in another—actually slightly separated, and 
then you put in some mirrors, that’s not important. You get two beams, two 
places out, where the photon can go. (See Figure 2.) 

If you put a polarized photon in, then it will go to one beam called the 
ordinary ray, or another, the extraordinary one. If you put detectors there
you find that each photon that you put in, it either comes out in one or the 
other 100% of the time, and not half and half. You either find a photon in 
one or the other. The probability of finding it in the ordinary ray plus the 
probability of finding it in the extraordinary ray is always 1—you have to 
have that rule. That works. And further, it’s never found at both detectors. 
(If you might have put two photons in, you could get that, but you cut the 
intensity down—it’s a technical thing, you don’t find them in both detec- 
tors.) 

Now the next experiment: Separation into 4 polarized beams (see 
Figure 3). You put two calcites in a row so that their axes have a relative 
angle $\phi$. I happen to have drawn the second calcite in two positions, but it
doesn’t make a difference if you use the same piece or not, as you care. Take
the ordinary ray from one and put it through another piece of calcite and
look at its ordinary ray, which I'll call the ordinary-ordinary (O-O) ray, or
look at its extraordinary ray, I have the ordinary-extraordinary (O-E) ray.
And then the extraordinary ray from the first one comes out as the E-O
ray, and then there’s an E-E ray, alright. Now you can ask what happens.
Fig. 3.


You’ll find the following. When a photon comes in, you always find that only 
one of the four counters goes off. 


If the photon is O from the first calcite, then the second calcite gives
O-O with probability $\cos^2\phi$ or O-E with the complementary probability
$1 - \cos^2\phi = \sin^2\phi$. Likewise an E photon gives an E-O with the probability
$\sin^2\phi$ or an E-E with the probability $\cos^2\phi$.
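
The rule for the second calcite can be written as a small function. The name `second_calcite` and the dictionary keys are illustrative choices.

```python
import math


def second_calcite(first_outcome, phi_degrees):
    """Probabilities of the second outcome, given the first one.

    first_outcome -- "O" or "E" from the first calcite
    phi_degrees   -- relative angle between the two calcite axes
    """
    c2 = math.cos(math.radians(phi_degrees)) ** 2
    s2 = 1.0 - c2
    if first_outcome == "O":
        return {"O-O": c2, "O-E": s2}
    return {"E-O": s2, "E-E": c2}


aligned = second_calcite("O", 0)     # axes parallel: always O-O
crossed = second_calcite("O", 90)    # axes perpendicular: always O-E
at_30 = second_calcite("E", 30)      # E-E with cos^2(30 degrees)
```

At a relative angle of 30 degrees an E photon continues as E-E three quarters of the time, a number that reappears in the two-photon argument.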


8. TWO-PHOTON CORRELATION EXPERIMENT 


Let us turn now to the two photon correlation experiment (see 
Figure 4). 

What can happen is that an atom emits two photons in opposite 
direction (e.g., the $3s \to 2p \to 1s$ transition in the H atom). They are observed
simultaneously (say, by you and by me) through two calcites set at $\phi_1$
and $\phi_2$ to the vertical. Quantum theory and experiment agree that the
probability $P_{OO}$ that both of us detect an ordinary photon is


$$P_{OO} = \tfrac{1}{2}\cos^2(\phi_2 - \phi_1)$$

The probability $P_{EE}$ that we both observe an extraordinary ray is the same

$$P_{EE} = \tfrac{1}{2}\cos^2(\phi_2 - \phi_1)$$

The probability $P_{OE}$ that I find O and you find E is


$$P_{OE} = \tfrac{1}{2}\sin^2(\phi_2 - \phi_1)$$

and finally the probability $P_{EO}$ that I measure E and you measure O is

$$P_{EO} = \tfrac{1}{2}\sin^2(\phi_2 - \phi_1)$$


Notice that you can always predict, from your own measurement, what I
shall get, O or E. For any axis $\phi_1$ that I choose, just set your axis $\phi_2$ to $\phi_1$,
then


$$P_{OE} = P_{EO} = 0$$


and I must get whatever you get. 
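
The four quantum probabilities can be collected in one function to confirm that they sum to one and that equal settings give perfect agreement. The function names are introduced here for illustration.

```python
import math


def two_photon_probabilities(phi1_deg, phi2_deg):
    """Joint probabilities for the outcomes (mine, yours) at the two calcites."""
    delta = math.radians(phi2_deg - phi1_deg)
    same = 0.5 * math.cos(delta) ** 2
    different = 0.5 * math.sin(delta) ** 2
    return {"OO": same, "EE": same, "OE": different, "EO": different}


def agreement(phi1_deg, phi2_deg):
    """Probability that both observers record the same kind of ray."""
    p = two_photon_probabilities(phi1_deg, phi2_deg)
    return p["OO"] + p["EE"]
```

For example, `agreement(20, 20)` is 1 for any common setting, while `agreement(0, 30)` is cos²(30°) = 3/4.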

Let us see now how it would have to be for a local probabilistic 
computer. Photon 1 must be in some condition $\alpha$ with the probability $f_\alpha(\phi_1)$,
that determines it to go through as an ordinary ray [the probability it would
pass as E is $1 - f_\alpha(\phi_1)$]. Likewise photon 2 will be in a condition $\beta$ with
probability $g_\beta(\phi_2)$. If $p_{\alpha\beta}$ is the conjoint probability to find the condition
pair $\alpha, \beta$, the probability $P_{OO}$ that both of us observe O rays is


$$P_{OO}(\phi_1, \phi_2) = \sum_{\alpha\beta} p_{\alpha\beta}\, f_\alpha(\phi_1)\, g_\beta(\phi_2), \qquad \sum_{\alpha\beta} p_{\alpha\beta} = 1$$


likewise


$$P_{OE}(\phi_1, \phi_2) = \sum_{\alpha\beta} p_{\alpha\beta}\, \big(1 - f_\alpha(\phi_1)\big)\, g_\beta(\phi_2) \quad \text{etc.}$$


The conditions $\alpha$ determine how the photons go. There’s some kind of
correlation of the conditions. Such a formula cannot reproduce the quantum
results above for any $p_{\alpha\beta}$, $f_\alpha(\phi_1)$, $g_\beta(\phi_2)$ if they are real probabilities, that
is all positive, although it is easy if they are “probabilities”, negative for
some conditions or angles. We now analyze why that is so.

I don’t know what kinds of conditions they are, but for any condition
the probability $f_\alpha(\phi)$ of its being extraordinary or ordinary in any direction
must be either one or zero. Otherwise you couldn’t predict it on the other 
side. You would be unable to predict with certainty what I was going to get, 
unless, every time the photon comes here, which way it’s going to go is 
absolutely determined. Therefore, whatever condition the photon is in, there 
is some hidden inside variable that’s going to determine whether it’s going 
to be ordinary or extraordinary. This determination is done deterministi- 
cally, not probabilistically; otherwise we can’t explain the fact that you 
could predict what I was going to get exactly. So let us suppose that 
something like this happens. Suppose we discuss results just for angles 
which are multiples of 30°. 
On each diagram (Figure 5) are the angles 0°, 30°, 60°, 90°, 120°, and
150°. A particle comes out to me, and it’s in some sort of state, so what it’s 
going to give for 0°, for 30°, etc. are all predicted—determined—by the 
state. Let us say that in a particular state that is set up the prediction for 0° 
is that it'll be extraordinary (black dot), for 30° it’s also extraordinary, for 
60° it’s ordinary (white dot), and so on (Figure 5a). By the way, the 
outcomes are complements of each other at right angles, because, remember, 
it’s always either extraordinary or ordinary; so if you turn 90°, what used to 
be an ordinary ray becomes the extraordinary ray. Therefore, whatever 
condition it’s in, it has some predictive pattern in which you either have a 
prediction of ordinary or of extraordinary—three and three—because at 
right angles they’re not the same color. Likewise the particle that comes to 
you when they’re separated must have the same pattern because you can 
determine what I’m going to get by measuring yours. Whatever circum- 
stances come out, the patterns must be the same. So, if I want to know, Am 
I going to get white at 60°? You just measure at 60°, and you’ll find white, 
and therefore you’ll predict white, or ordinary, for me. Now each time we 
do the experiment the pattern may not be the same. Every time we make a 
pair of photons, repeating this experiment again and again, it doesn’t have 
to be the same as Figure 5a. Let’s assume that the next time we do the experiment
my photon will be O or E for each angle as in Figure 5c. Then your pattern
looks like Figure 5d. But whatever it is, your pattern has to be my pattern
exactly; otherwise you couldn’t predict what I was going to get exactly by
measuring the corresponding angle. And so on. Each time we do the
experiment, we get different patterns; and it’s easy: there are just six dots
and three of them are white, and you chase them around different ways;
everything can happen. If we measure at the same angle, we always find that
with this kind of arrangement we would get the same result. 

Fig. 5. One pair of photons: my photon (a), your photon (b). Next pair of photons: my photon (c), your photon (d).

Now suppose we measure at $\phi_2 - \phi_1 = 30°$, and ask, With what probability
do we get the same result? Let’s first try this example here (Figure
5a,5b). With what probability would we get the same result, that they’re
both white, or they’re both black? The thing comes out like this: suppose I
say, After they come out, I’m going to choose a direction at random, I tell 
you to measure 30° to the right of that direction. Then whatever I get, you 
would get something different if the neighbors were different. (We would 
get the same if the neighbors were the same.) What is the chance that you 
get the same result as me? The chance is the number of times that the 
neighbor is the same color. If you'll think a minute, you’ll find that two 
thirds of the time, in the case of Figure 5a, it’s the same color. The worst 
case would be black/white/black/white/black/white, and there the probability 
of a match would be zero (Figure 5c,d). If you look at all eight 
possible distinct cases, you'll find that the biggest possible answer is 
two-thirds. You cannot arrange, in a classical kind of method like this, that 
the probability of agreement at 30° will be bigger than two-thirds. But the 
quantum mechanical formula predicts $\cos^2 30^\circ$ (or 3/4), and experiments 
agree with this, and therein lies the difficulty. 
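
The counting argument above can be checked by brute force. The sketch below is only an illustration and rests on stated assumptions: the six dots sit at 0°, 30°, ..., 150°; a dot and the one 90° away always have opposite colors; and a 30° offset compares each dot with its neighbor, with 180° identified with 0°. The names `patterns` and `match_fraction` are hypothetical helpers, not anything from the lecture.

```python
from fractions import Fraction
from itertools import product
import math

def patterns():
    """Yield the eight classical patterns: colours at 0, 30, ..., 150 degrees.

    The colours at 0, 30 and 60 degrees are free (1 = white, 0 = black);
    the colours at 90, 120 and 150 degrees are forced to be the opposite,
    because measurements at right angles never agree.
    """
    for first_three in product((0, 1), repeat=3):
        yield first_three + tuple(1 - c for c in first_three)

def match_fraction(pattern):
    """Fraction of the six directions whose 30-degree neighbour has the same colour."""
    n = len(pattern)
    same = sum(pattern[i] == pattern[(i + 1) % n] for i in range(n))
    return Fraction(same, n)

fractions_found = sorted(match_fraction(p) for p in patterns())
best_classical = max(fractions_found)
quantum = math.cos(math.radians(30)) ** 2

print("classical match fractions:", [str(f) for f in fractions_found])
print("best classical:", best_classical, " quantum:", round(quantum, 4))
```

Under these assumptions every pattern gives either 0 or 2/3, so no assignment of colors reaches the quantum mechanical 3/4.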

That’s all. That’s the difficulty. That’s why quantum mechanics can’t 
seem to be imitable by a local classical computer. 

I’ve entertained myself always by squeezing the difficulty of quantum 
mechanics into a smaller and smaller place, so as to get more and more 
worried about this particular item. It seems to be almost ridiculous that you 
can squeeze it to a numerical question that one thing is bigger than another. 
But there you are—it is bigger than any logical argument can produce, if 
you have this kind of logic. Now, we say “this kind of logic;” what other 
possibilities are there? Perhaps there may be no possibilities, but perhaps 
there are. It’s interesting to try to discuss the possibilities. I mentioned 
something about the possibility of time—of things being affected not just 
by the past, but also by the future, and therefore that our probabilities are 
in some sense “illusory.” We only have the information from the past, and 
we try to predict the next step, but in reality it depends upon the near future 
which we can’t get at, or something like that. A very interesting question is 
the origin of the probabilities in quantum mechanics. Another way of 
putting things is this: we have an illusion that we can do any experiment 
that we want. We all, however, come from the same universe, have evolved 
with it, and don’t really have any “real” freedom. For we obey certain laws 
and have come from a certain past. Is it somehow that we are correlated to 
the experiments that we do, so that the apparent probabilities don’t look 
like they ought to look if you assume that they are random? There are all 
kinds of questions like this, and what I’m trying to do is to get you people 
who think about computer-simulation possibilities to pay a great deal of 
attention to this, to digest as well as possible the real answers of quantum 
mechanics, and see if you can’t invent a different point of view than the 
physicists have had to invent to describe this. In fact the physicists have no 
good point of view. Somebody mumbled something about a many-world 
picture, and that many-world picture says that the wave function $\psi$ is what’s 
real, and damn the torpedos if there are so many variables, $N^R$. All these 
different worlds and every arrangement of configurations are all there just 
like our arrangement of configurations, we just happen to be sitting in this 
one. It’s possible, but I’m not very happy with it. 

So, I would like to see if there’s some other way out, and I want to 
emphasize, or bring the question here, because the discovery of computers 
and the thinking about computers has turned out to be extremely useful in 
many branches of human reasoning. For instance, we never really under- 
stood how lousy our understanding of languages was, the theory of gram- 
mar and all that stuff, until we tried to make a computer which would be 
able to understand language. We tried to learn a great deal about psychol- 
ogy by trying to understand how computers work. There are interesting 
philosophical questions about reasoning, and relationship, observation, and 
measurement and so on, which computers have stimulated us to think about 
anew, with new types of thinking. And all I was doing was hoping that the 
computer-type of thinking would give us some new ideas, if any are really 
needed. I don’t know, maybe physics is absolutely OK the way it is. The 
program that Fredkin is always pushing, about trying to find a computer 
simulation of physics, seems to me to be an excellent program to follow out. 
He and I have had wonderful, intense, and interminable arguments, and my 
argument is always that the real use of it would be with quantum mechanics, 
and therefore full attention and acceptance of the quantum mechanical 
phenomena—the challenge of explaining quantum mechanical phenomena 
—has to be put into the argument, and therefore these phenomena have to 
be understood very well in analyzing the situation. And I’m not happy with 
all the analyses that go with just the classical theory, because nature isn’t 
classical, dammit, and if you want to make a simulation of nature, you'd 
better make it quantum mechanical, and by golly it’s a wonderful problem, 
because it doesn’t look so easy. Thank you. 


9. DISCUSSION 


Question: Just to interpret, you spoke first of the probability of A given 
B, versus the probability of A and B jointly—that’s the probability of one 
observer seeing the result, assigning a probability to the other; and then you 
brought up the paradox of the quantum mechanical result being 3/4, and 
this being 2/3. Are those really the same probabilities? Isn’t one a joint 
probability, and the other a conditional one? 

Answer: No, they are the same. $P_{OO}$ is the joint probability that both you 
and I observe an ordinary ray, and $P_{EE}$ is the joint probability for two 
extraordinary rays. The probability that our observations match is 
$P_{OO} + P_{EE} = \cos^2 30^\circ = 3/4$. 


Question: Does it in some sense depend upon an assumption as to how 
much information is accessible from the photon, or from the particle? And 
second, to take your question of prediction, your comment about predicting, 
is in some sense reminiscent of the philosophical question, Is there any 
meaning to the question of whether there is free will or predestination? 
namely, the correlation between the observer and the experiment, and the 
question there is, Is it possible to construct a test in which the prediction 
could be reported to the observer, or instead, has the ability to represent 
information already been used up? And I suspect that you may have already 
used up all the information so that prediction lies outside the range of the 
theory. 

Answer: All these things I don’t understand; deep questions, profound 
questions. However physicists have a kind of a dopy way of avoiding all of 
these things. They simply say, now look, friend, you take a pair of counters 
and you put them on the side of your calcite and you count how many times 
you get this stuff, and it comes out 75% of the time. Then you go and you 
say, Now can I imitate that with a device which is going to produce the 
same results, and which will operate locally, and you try to invent some 
kind of way of doing that, and if you do it in the ordinary way of thinking, 
you find that you can’t get there with the same probability. Therefore some 
new kind of thinking is necessary, but physicists, being kind of dull minded, 
only look at nature, and don’t know how to think in these new ways. 

Question: At the beginning of your talk, you talked about discretizing 
various things in order to go about doing a real computation of physics. 
And yet it seems to me that there are some differences between things like 
space and time, and probability that might exist at some place, or energy, or 
some field value. Do you see any reason to distinguish between quantization 
or discretizing of space and time, versus discretizing any of the specific 
parameters or values that might exist? 

Answer: I would like to make a few comments. You said quantizing or 
discretizing. That’s very dangerous. Quantum theory and quantizing is a 
very specific type of theory. Discretizing is the right word. Quantizing is a 
different kind of mathematics. If we talk about discretizing ... of course I 
pointed out that we’re going to have to change the laws of physics. Because 
the laws of physics as written now have, in the classical limit, a continuous 
variable everywhere, space and time. If, for example, in your theory you 
were going to have an electric field, then the electric field could not have (if 
it’s going to be imitable, computable by a finite number of elements) an 
infinite number of possible values, it'd have to be digitized. You might be 
able to get away with a theory by redescribing things without an electric 
field, but supposing for a moment that you’ve discovered that you can’t do 
that and you want to describe it with an electric field, then you would have 
to say that, for example, when fields are smaller than a certain amount, they 
aren’t there at all, or something. And those are very interesting problems, 
but unfortunately they’re not good problems for classical physics because if 
you take the example of a star a hundred light years away, and it makes a 
wave which comes to us, and it gets weaker, and weaker, and weaker, and 
weaker, the electric field’s going down, down, down, how low can we 
measure? You put a counter out there and you find “clunk,” and nothing 
happens for a while, “clunk,” and nothing happens for a while. It’s not 
discretized at all, you never can measure such a tiny field, you don’t find a 
tiny field, you don’t have to imitate such a tiny field, because the world that 
you’re trying to imitate, the physical world, is not the classical world, and it 
behaves differently. So the particular example of discretizing the electric 
field, is a problem which I would not see, as a physicist, as fundamentally 
difficult, because it will just mean that your field has gotten so small that I 
had better be using quantum mechanics anyway, and so you’ve got the 
wrong equations, and so you did the wrong problem! That’s how I would 
answer that. Because you see, if you would imagine that the electric field is 
coming out of some ‘ones’ or something, the lowest you could get would be 
a full one, but that’s what we see, you get a full photon. All these things 
suggest that it’s really true, somehow, that the physical world is represent- 
able in a discretized way, because every time you get into a bind like this, 
you discover that the experiment does just what’s necessary to escape the 
trouble that would come if the electric field went to zero, or you’d never be 
able to see a star beyond a certain distance, because the field would have 
gotten below the number of digits that your world can carry. 


The Computing Machines in the Future 


Nishina Memorial Lecture 


given at Gakushuin, Tokyo, August 9, 1985 


Richard P. Feynman 


California Institute of Technology, Pasadena, California, U.S.A. 


It's a great pleasure and an honor to be here as a speaker in 
memorial for a scientist that I have respected and admired as much as 
Prof. Nishina. To come to Japan and talk about computers is like 
giving a sermon to Buddha. But I have been thinking about computers 
and this is the only subject I could think of when invited to talk. 


The first thing I would like to say is what I am not going to 
talk about. I want to talk about the future computing machines. But 
the most important possible developments in the future, are things 
that I will not speak about. For example, there is a great deal of 
work to try to develop smarter machines, machines which have a better 
relationship with the humans so that input and output can be made with 
less effort than the complex programing that's necessary today. This 
goes under the name often of artificial intelligence, but I don't like 
that name. Perhaps the unintelligent machines can do even better than 
the intelligent ones. Another problem is the standardization of 
programing languages. There are too many languages today, and it 
would be a good idea to choose just one. (I hesitate to mention that 
in Japan, for what will happen will be that there will simply be more 
standard languages; you already have four ways of writing now and 
attempts to standardize anything here result apparently in more 
standards and not fewer.) Another interesting future problem that is 
worth working on but I will not talk about, is automatic debugging 
programs; debugging means to fix errors in a program or in a machine. 
It is surprisingly difficult to debug programs as they get more 
complicated. Another direction of improvement is to make physical 
machines three dimensional instead of all on a surface of a chip. 
That can be done in stages instead of all at once; you can have 
several layers and then many more layers as the time goes on. Another 
important device would be a way of detecting automatically defective 
elements on a chip, then this chip itself automatically rewiring 
itself so as to avoid the defective elements. At the present time 
when we try to make big chips there are flaws, bad spots in the chips, 
and we throw the whole chip away. But of course if we could make it 
so that we could use the part of the chip that was effective, it would 
be much more efficient. I mention these things to try to tell you 
that I am aware of what the real problems are for future machines. 
But what I want to talk about is simple, just some small technical, 
physically good things that can be done in principle according to the 
physical laws; I would like in other words to discuss the machinery 
and not the way we use the machines. 


I will talk about some technical possibilities for making 
machines. There will be three topics really. One is parallel 
processing machines which is something of the very near future, almost 
present, that is being developed now. Further in the future are 
questions of the energy consumption of machines which seems at the 
moment to be a limitation, but really isn't. Finally I will talk 
about the size; it is always better to make the machines smaller, and 
the question is how much smaller is it still possible to make machines 
according to the laws of Nature, in principle. I will not discuss 
which and what of these things will actually appear in the future. 
That depends on economic problems and social problems and I am not 
going to try to guess at those. 


1. Parallel Computers 


First about parallel programing, parallel computers, rather. 
Almost all the present computers, conventional computers, work on a 
layout or an architecture invented by von Neumann, in which there is a 
very large memory that stores all the information, and one central 
location that does simple calculations. We take a number from this 
place in the memory and a number from that place in the memory, send 
the two to the central arithmetical unit to add them and then send the 
answer to some other place in the memory. There is, therefore, 
effectively one central processor which is working very very fast and 
very hard while the whole memory sits out there like a vast filing 
cabinet of cards which are very rarely used. It is obvious that if 
there were more processors working at the same time we ought to be 
able to do calculations faster. But the problem is that someone who 
might be using one processor may be using some information from the 
memory that another one needs, and it gets very confusing. And so it 
has been said that it is very difficult to work many processors in 
parallel. Some steps in that direction have been taken in the larger 
conventional machines, what they call "vector processors". When 
sometimes you want to do exactly the same step on many different items 
you can do that perhaps at the same time. The ordinary hope is that 
the regular program can be written, and then an interpreter will 
discover automatically when it is useful to use this vector 
possibility. That idea is used in the Cray and in the super-computers 
in Japan. Another plan is to take what is effectively a large number 
of relatively simple (but not very simple) computers, and connect them 
all together in some pattern. Then they can all work on a part of the 
problem. Each one is really an independent computer, and they will 
transfer information to each other as one or another needs it. This 
kind of a scheme is in a machine for example called Cosmic Cube, and 
is one of the possibilities; many people are making such machines. 

Another plan is to distribute very large numbers of very simple 
central processors all over the memory. Each one deals with just a 
small part of the memory and there is an elaborate system of 
interconnections between them. An example of such a machine is the 
Connection Machine made at M.I.T. It has 64,000 processors and a 
system of routing in which every 16 can talk to any other 16 and thus 
4000 routing connection possibilities. It would appear that 
scientific questions like the propagation of waves in some material 
might be very easily handled by parallel processing, because what 
happens in this part of space at a moment can be worked out locally 
and only the pressures and the stresses from the neighbors need to be 
known; each section can be worked out at the same time, and the 
boundary conditions communicated across. That's why this type of design 
is built for such a thing. But it has turned out that a very large 
number of problems of all kinds can be dealt with in parallel. As 
long as the problem is big enough so that a lot of calculating has to 
be done, it turns out that a parallel computation can speed this up 
enormously, not just scientific problems. 
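
The locality argument for wave propagation can be illustrated with a small sketch. It assumes a one-dimensional wave equation on a grid with both ends held at zero, split into equal blocks; each block is advanced using only its own cells plus one boundary value from each neighbor. The function names, grid size, and constant `c2` are illustrative choices, and the blocks are simply looped over here rather than run on separate processors.

```python
def step_serial(prev, cur, c2):
    """One leapfrog step of the 1-D wave equation with both ends held at zero."""
    n = len(cur)
    nxt = [0.0] * n
    for i in range(1, n - 1):
        nxt[i] = 2 * cur[i] - prev[i] + c2 * (cur[i + 1] - 2 * cur[i] + cur[i - 1])
    return nxt

def step_block(prev_blk, cur_blk, left_halo, right_halo, c2):
    """Advance one block using its own cells plus one value from each neighbour.

    A halo of None marks a physical wall, where the end cell is held at zero.
    """
    padded = ([0.0 if left_halo is None else left_halo]
              + cur_blk
              + [0.0 if right_halo is None else right_halo])
    out = []
    last = len(cur_blk) - 1
    for j in range(len(cur_blk)):
        at_wall = (j == 0 and left_halo is None) or (j == last and right_halo is None)
        if at_wall:
            out.append(0.0)
        else:
            out.append(2 * cur_blk[j] - prev_blk[j]
                       + c2 * (padded[j + 2] - 2 * cur_blk[j] + padded[j]))
    return out

def step_blocks(prev, cur, c2, n_blocks):
    """Advance the whole grid block by block; each block could run on its own processor."""
    n = len(cur)
    if n % n_blocks != 0:
        raise ValueError("grid size must be divisible by the number of blocks")
    size = n // n_blocks
    nxt = []
    for b in range(n_blocks):
        lo, hi = b * size, (b + 1) * size
        left = cur[lo - 1] if b > 0 else None          # boundary value from left neighbour
        right = cur[hi] if b < n_blocks - 1 else None  # boundary value from right neighbour
        nxt.extend(step_block(prev[lo:hi], cur[lo:hi], left, right, c2))
    return nxt

def run(n_cells=16, n_steps=20, c2=0.25, n_blocks=4):
    """Evolve the same initial pulse serially and block-wise and return both results."""
    start = [0.0] * n_cells
    start[n_cells // 2] = 1.0
    s_prev, s_cur = list(start), list(start)
    b_prev, b_cur = list(start), list(start)
    for _ in range(n_steps):
        s_prev, s_cur = s_cur, step_serial(s_prev, s_cur, c2)
        b_prev, b_cur = b_cur, step_blocks(b_prev, b_cur, c2, n_blocks)
    return s_cur, b_cur

serial_result, block_result = run()
print("block-wise result matches serial result:", serial_result == block_result)
```

The only data a block needs from outside is one number per side per step, which is why this kind of problem maps naturally onto many processors that each own a piece of memory.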

And what happened to the prejudice of 2 years ago, which was that 
the parallel programing is difficult? It turns out that what was 
difficult, and almost impossible, is to take an ordinary program and 
automatically figure out how to use the parallel computation 
effectively on that program. Instead, one must start all over again 
with the problem, appreciating that we have parallel possibility of 
calculation, and rewrite the program completely with a new attitude to 
what is inside the machine. It is not possible to effectively use the 
old programs. They must be rewritten. That is a great disadvantage 
to most industrial applications and has met with considerable 
resistance. But, the big programs belong usually to scientists or 
others, unofficial intelligent programmers who love computer science 
and are willing to start all over again and rewrite the program if 
they can make it more efficient. So what's going to happen is that the 
hard programs, vast big ones, will first be programed by experts in 
the new way, and then gradually everybody will have to come around, 
and more and more programs will be programed that way, and programmers 
will just have to learn how to do it. 


2. Reducing the Energy Loss 


The second topic I want to talk about is energy loss in 
computers. The fact that they must be cooled is the limitation 
apparently to the largest computers; a good deal of the effort is 
spent in cooling the machine. I would like to explain that this is 
simply a result of very poor engineering and is nothing fundamental at 
all. Inside the computer a bit of information is controlled by a wire 
which either has a voltage of one value or another value. It is 
called "one bit", and we have to change the voltage of the wire from 
one value to the other and have to put charge on or take charge off. 
I make an analogy with water: We have to fill a vessel with water to 
get one level or to empty it to get to the other level. It's just an 
analogy. If you like electricity better you can think more accurately 
electrically. What we do now is analogous, in the water case, to 
filling the vessel by pouring water in from a top level (Fig.1), and 
lowering the level by opening the valve at the bottom and letting it 
all run out. In both cases there is a loss of energy because of the 
drop of the water, suddenly, through a height, say from the top level where 
it comes in to the low bottom level, when you start pouring it in to 
fill it up again. In the cases of voltage and charge, there occurs the 
same thing. 

[Fig. 1. "Now"]

It's like, as Mr. Bennett has explained, operating an automobile 
which has to start and stop by turning on the engine and putting on 
the brakes, turning on the engine and putting on the brakes; each time 
you lose power. Another way with a car would be to connect the wheels 
to flywheels. Stop the car and speed up the flywheel saving the 
energy, which can then be reconnected to start the car again. The 
analogy electrically or in the water would be to have a U-shaped tube 
with a valve at the bottom in the center connecting the two arms of 
the U (Fig.2). When it is full here on the right but empty on the 
left with the valve closed, if we open that valve the water will slip 
out to the other side, and we close it just in time to catch it. Then 
when we want to go the other way we open the valve again and it slips 
to the other side and we catch it. There is some loss and it doesn't 
climb as high as it did before, but all we have to do is to put a 
little water in to correct the little loss, a much smaller energy loss 
than the direct fill method. But such a thing uses the inertia of the 
water, and the analogue in electricity is inductance. However it 
is very difficult with the silicon transistors that we use today to 
make up inductance on the chips. So this is not particularly 
practical with the present technology. 

Another way would be to fill the tank by a supply which stays 
only a little bit above the level, lifting the water supply in time as 
we fill it up (Fig.3), because then the dropping of water is always 
small during the entire effort. In the same way, we could use an 
outlet to lower it but just taking off the top and lowering the tube, 
so that the heat loss would not appear at the position of the 
transistor, or would be small; it will depend on how high the distance 
is between the supply and the surface as we fill it up. This method 
corresponds to changing the voltage supply with time (Fig.4). So if 
we would use a time varying voltage supply, we could use this method. 
Of course, there is energy loss in the voltage supply, but that 
is all located in one place, that is simple, and there we can make 
one big inductance. This scheme is called "hot clocking", because the 
voltage supply operates also as the clock which times everything. And 
we don't need an extra clock signal to time the circuits as we do in 
conventional designs. 
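
As a rough electrical illustration of the time-varying supply, the sketch below charges a capacitor through a resistor, once with an abrupt voltage step and then with linear ramps of different durations, and adds up the energy lost in the resistor. The RC circuit, the unit component values, and the forward-Euler integration are all assumptions made for this example; they stand in for the transistor and wire and are not taken from the lecture.

```python
def dissipated_energy(ramp_time, R=1.0, C=1.0, V=1.0, steps_per_tau=1000):
    """Energy lost in the resistor while charging C to V through R.

    The supply rises linearly from 0 to V over ramp_time (0 means an abrupt
    step) and is then held at V. Integrated with simple forward-Euler steps.
    """
    tau = R * C
    dt = tau / steps_per_tau
    total_time = ramp_time + 10 * tau      # allow the capacitor to settle
    n_steps = int(round(total_time / dt))
    v_cap = 0.0
    energy = 0.0
    for k in range(n_steps):
        t = k * dt
        if ramp_time <= 0 or t >= ramp_time:
            v_supply = V
        else:
            v_supply = V * t / ramp_time
        current = (v_supply - v_cap) / R
        energy += current * current * R * dt   # heat produced in this time step
        v_cap += current * dt / C
    return energy

abrupt = dissipated_energy(0.0)     # supply switched on all at once
fast = dissipated_energy(10.0)      # ramp lasting 10 RC time constants
slow = dissipated_energy(100.0)     # ramp lasting 100 RC time constants

print(f"abrupt step: {abrupt:.4f}  (compare C*V^2/2 = 0.5)")
print(f"ramp over 10 RC:  {fast:.4f}")
print(f"ramp over 100 RC: {slow:.4f}")
```

In this toy model the abrupt step wastes about half of CV², while a ramp much longer than RC wastes roughly CV² times RC divided by the ramp time, so the loss falls as the supply is raised more slowly.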

Both of these last two devices use less energy if they go slower. 
If I try to move the water supply level too fast, the water in the 
tube doesn't keep up with it and I have a big drop. So to work I must 
go slowly. Again, the U-tube scheme will not work unless that central 
valve can open and close faster than the time it takes for the water 
in the U-tube to slip back and forth. So my devices are slower. I've 
saved an energy loss but I've made the devices slower. The energy 
loss multiplied by the time it takes for the circuit to operate is 
constant. But nevertheless, this turns out to be very practical 
because the clock time is usually much larger than the circuit time 
for the transistors, and we can use that to decrease the energy. Also 
if we went, let us say, three times slower with our calculations, we 
could use one third the energy over three times the time, which is 
nine times less power that has to be dissipated. Maybe it is worth 
it. Maybe by redesigning using parallel computations or other 
devices, we can spend a little longer than we could do at maximum 
circuit speed, in order to make a larger machine that is practical and 
from which we could still get the energy out. 
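
The arithmetic behind "nine times less power" can be spelled out in a few lines. The sketch assumes, as stated above, that energy loss times operating time stays constant; the function name and the unit baseline values are illustrative.

```python
from fractions import Fraction

def energy_and_power(slowdown, base_energy=Fraction(1), base_time=Fraction(1)):
    """Stretch a computation's time by `slowdown`, holding energy x time fixed."""
    time = base_time * slowdown
    energy = base_energy * base_time / time   # energy * time stays constant
    power = energy / time                     # average power to be dissipated
    return energy, power

energy, power = energy_and_power(3)
print(energy, power)   # prints "1/3 1/9"
```

Going three times slower cuts the energy to one third, and spreading that third over three times the time cuts the power to one ninth.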

For a transistor, the energy loss multiplied by the time it takes 
to operate is a product of several factors (Fig.5): (1) the thermal 
energy proportional to temperature, kT; (2) the length of the 
transistor between source and drain, divided by the velocity of the 
electrons inside (the thermal velocity $\sqrt{3kT/m}$); (3) the length of the 
transistor in units of the mean free path for collisions of electrons 
in the transistor; and finally (4) the total number of the electrons 
that are inside the transistor when it operates. All of these numbers 
come out to tell us that the energy used in the transistor today is 
somewhere around a billion or ten billions or more times the thermal 
energy kT. When it switches we use that much energy. It is a very 
large amount of energy. It is obviously a good idea to decrease the 
size of the transistor. We decrease the length between source and 
drain, and we can decrease the number of the electrons, and use much 
less energy. It also turns out that a smaller transistor is much 
faster, because the electrons can cross it and make their decisions to 
switch faster. For every reason, it is a good idea to make the 
transistor smaller, and everybody is always trying to do that. 
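
Written as a formula, the product of the four factors listed above would read roughly as follows. The symbols are assumptions introduced for this note and are not the lecture's notation: $E$ is the energy loss per switching, $t$ the switching time, $L$ the source-to-drain length, $\lambda$ the mean free path, $N$ the number of electrons in the transistor, and $v_{th}$ the thermal velocity.

```latex
E \, t \;\sim\; kT \cdot \frac{L}{v_{th}} \cdot \frac{L}{\lambda} \cdot N,
\qquad v_{th} = \sqrt{3kT/m}
```

Read this way, shrinking $L$ helps twice, and reducing $N$ helps once more, which matches the remark that a smaller transistor uses much less energy.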

But suppose we come to a circumstance in which the mean free path 
is longer than the size of the transistor, then we discover that the 
transistor doesn’t work right any more. It does not behave the way we 
expected. This reminds me, years ago there was something called the 
sound barrier. Airplanes, it was said, cannot go faster than the speed of sound 
because, if you design them normally and try to fly them at that 
speed, the propeller wouldn't work and the wings don't lift and 
nothing works correctly. Nevertheless, airplanes can go faster than 
the speed of sound. You just have to know what the right laws are 
under the right circumstances, and design the device with the correct 
laws. You cannot expect old designs to work in new circumstances. 
But new designs can work in new circumstances, and I assert that it is 
perfectly possible to make transistor systems, that is to say more 
correctly, switching systems, computing devices in which the 
dimensions are smaller than the mean free path. I speak of course in 
principle and I am not speaking about actual manufacture. Therefore, 
let us discuss what happens if we try to make the devices as small as 
possible. 


3. Reducing the Size 


So, my third topic is the size of computing elements and now I 
speak entirely theoretically. The first thing that you would worry 
about when things get very small, is Brownian motion; everything is 
shaking and nothing stays in place, and how can you control the 
circuits then? And even if a circuit did work, it has a chance of 
accidentally jumping back. But, if we use two volts for the energy of 
this electric system which is what we ordinarily use, that is eighty 
times the thermal energy (kT = 1/40 volt) and the chance that something 
jumps backward against 80 times thermal energy is e, the base of the 
natural logarithm, to the minus eightieth power, or about $10^{-35}$. What does that 
mean? If we had a billion transistors in a computer (which we don't 
have, we don't have that many at all), working all of them $10^{10}$ times 
a second, that is, tenth of a nanosecond switching perpetually, 
operating for $10^9$ seconds, which is 30 years, the total number of 
switching operations in that machine is $10^{28}$ and the chance of any one of 
them going backward is only about $10^{-35}$, so there will be no error produced by 
thermal oscillations whatsoever in 30 years. If you don't like that, 
use 2.5 volts and then it gets smaller. Long before that, the real 
failure will come when a cosmic ray accidentally goes through the 
transistor, and we don't have to be more perfect than that. 
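
The estimate can be reproduced numerically. The sketch below uses only the figures quoted in the text (a barrier of 80 kT, a billion transistors, ten billion switchings per second, a billion seconds); the variable names are illustrative. Evaluating $e^{-80}$ directly gives a little under $2 \times 10^{-35}$.

```python
import math

THERMAL_RATIO = 80                    # 2 volts divided by kT = 1/40 volt
p_reverse = math.exp(-THERMAL_RATIO)  # chance that one switching jumps backward

transistors = 10**9                   # a billion transistors
switches_per_second = 10**10          # one switching every tenth of a nanosecond
seconds = 10**9                       # roughly 30 years
total_switchings = transistors * switches_per_second * seconds

expected_reversals = total_switchings * p_reverse

print(f"chance per switching ~ {p_reverse:.1e}")
print(f"total switchings = 10^{len(str(total_switchings)) - 1}")
print(f"expected thermal reversals in 30 years ~ {expected_reversals:.1e}")
```

Even summed over every switching in 30 years, the expected number of thermally induced reversals stays far below one.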

However, much more is in fact possible and I would like to refer 
you to an article in a most recent Scientific American by Bennett and 
Landauer. It is possible to make a computer in which each element, 
each transistor, can go forward and accidentally reverse and still the 
computer will operate. All the operations in succession in the 
computer go forward or backward. The computation proceeds for a while 
this way and then it undoes itself, uncalculates, and then goes 
forward again and so on. If we just pull it along a little, we can 
make it go through and finish the calculation by making it just a 
little bit more likely that it goes forward than backward. 

It is known that all the computations can be made by putting 
together some simple elements like transistors; or, if we be more 
logically abstract, a thing for instance called NAND gate (NAND means 
NOT-AND). It has two "wires" in and one out (Fig.6). Forget the NOT 
first. What is AND? AND is: The output is 1 only if both input wires 
are 1, otherwise the output is 0. NOT-AND means the opposite. The 
output wire reads 1 (i.e. has the voltage level corresponding to 1) 
unless both input wires read 1, if both input wires read 1 then the 
output wire reads 0 (i.e. has the voltage level corresponding to 0). 
Here is a little table of inputs and outputs. A and B are inputs and 
C is the output. Unless A and B are both 1, the output is 1; otherwise 
0. But such a device is irreversible. Information is lost. If I 
only know the output, I cannot recover the input. The device can't 
be expected to flip forward and then come back and compute 
correctly anymore. Because if we know for instance that the output is 
now 1, we don't know whether it came from A=0, B=1 or A=1, B=0 or A=0, 
B=0 and it cannot go back. Such a device is an irreversible gate. 

[Fig. 6. NOT AND = NAND. Not reversible; information lost.]
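
The table of inputs and outputs, and the loss of information, can be made concrete with a short sketch. The names `nand`, `truth_table`, and `inputs_giving` are illustrative only.

```python
from itertools import product

def nand(a, b):
    """NOT-AND: the output is 0 only when both inputs are 1."""
    return 0 if (a == 1 and b == 1) else 1

# Truth table: (A, B) -> C
truth_table = {(a, b): nand(a, b) for a, b in product((0, 1), repeat=2)}

# Group the inputs by the output they produce, to see what can be recovered.
inputs_giving = {}
for inputs, out in truth_table.items():
    inputs_giving.setdefault(out, []).append(inputs)

for (a, b), c in truth_table.items():
    print(f"A={a} B={b} -> C={c}")
print("inputs that give C=1:", inputs_giving[1])
```

Three different input pairs lead to the output 1, so the output alone cannot tell us which one went in.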

The great discovery of Bennett and, independently, of Fredkin is that 
it is possible to do computation with a different kind of fundamental 
gate unit, a reversible gate unit. I have illustrated their idea 
with this unit which I could call a reversible NAND or whatever. It 
has three inputs and three outputs (Fig.7). Of the outputs, two, A' 
and B', are the same as two of the inputs, A and B, but the third 
works this way: C' is the same as C unless A and B are both 1. 
Then it changes whatever C is. For instance, if C is 1 it is 
changed to 0, if C is 0 it is changed to 1, but only if both A and B are 
1. If you put two in succession, you see A and B will go 
through, and if C is not changed in both it stays the same or if C 
is changed twice it stays the same. So this gate reverses itself. 
No information has been lost. It is possible to discover what 
went in if you know what went out. 

[Fig. 7. Reversible gate: A' = A; B' = B; C' = C unless A = 1 and B = 1; C' = NOT C if A = 1 and B = 1. No information lost. Some force needed to push calculation predominantly forward. Energy loss × time used = constant.]
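
The three properties claimed for this gate can be checked exhaustively over its eight input states. The sketch below is an illustration with hypothetical names; the last check, that holding C at 1 makes C' the NAND of A and B, is a standard fact about this gate and is included to show why it can be called a reversible NAND.

```python
from itertools import product

def reversible_gate(a, b, c):
    """A and B pass through unchanged; C is flipped only when A and B are both 1."""
    return a, b, c ^ (a & b)

states = list(product((0, 1), repeat=3))
outputs = [reversible_gate(*s) for s in states]

# No two inputs share an output, so the input can always be recovered.
is_one_to_one = len(set(outputs)) == len(states)

# Two gates in succession give back the original input.
undoes_itself = all(reversible_gate(*reversible_gate(*s)) == s for s in states)

# With C held at 1, the third output is NAND(A, B).
nand_from_gate = {(a, b): reversible_gate(a, b, 1)[2]
                  for a, b in product((0, 1), repeat=2)}

print("one-to-one:", is_one_to_one)
print("undoes itself:", undoes_itself)
print("C' with C=1:", nand_from_gate)
```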

A device made entirely with such gates will make calculations if 
everything moves forward, but if things go back and forth for a while 
and then eventually go forward enough it still operates correctly. If 
the things flip back and then go forward later it is still all right. 
It's very much the same as a particle in a gas which is bombarded by 
the atoms around it, usually goes nowhere, but with just a little 
pull, a little prejudice that makes a chance to move one way a little 
higher than the other way, the thing will slowly drift forward and 
reach from one end to the other, in spite of the Brownian motion that 
is made. So our computer will compute provided we apply a force of 
drift to pull the thing more likely across the calculation. Although 
it is not doing the calculation in a smooth way, but calculating like 
this, forward and backward, it eventually finishes the job. As with
the particle in the gas, if we pull it very slightly, we lose very
little energy, but it takes a long time to get to one side from the
other. If we are in a hurry, and we pull hard, then we lose a lot of
energy. And the same with this computer. If we are patient and go 
slowly, we can make the computer operate with practically no energy, 
even less than kT per step, any amount as small as you like if you 
have enough time. But if you are in a hurry, you must dissipate 
energy, and again it's true that the energy lost to pull the 
calculation forward to complete it multiplied by the time you are 
allowed to make the calculation is a constant. 
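
The trade-off can be illustrated with a small numerical sketch. It rests on an assumption that is not stated in the talk: that a computer stepping forward with probability f and backward with probability b dissipates about kT ln(f/b) per net forward step. The function name `cost_of_bias` and the bias parameter `eps` are invented for this illustration.

```python
import math

def cost_of_bias(eps):
    """Return (energy per net step in units of kT, ticks per net step)
    for a walker that steps forward with probability 1/2 + eps."""
    f, b = 0.5 + eps, 0.5 - eps
    energy = math.log(f / b)   # assumed dissipation per net step, in units of kT
    time = 1.0 / (f - b)       # mean number of ticks per net forward step
    return energy, time

for eps in (0.1, 0.01, 0.001):
    energy, time = cost_of_bias(eps)
    print(f"eps={eps}: energy={energy:.4f} kT, ticks={time:.0f}, "
          f"product={energy * time:.4f}")
```

Under this assumption, a weaker pull costs less energy per step but takes proportionally longer, and for small bias the product of the two settles near a constant.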

With these possibilities how small can we make a computer? How 
big must a number be? We all know we can write numbers in base 2 
as strings of "bits" each a one or a zero. But how small can I write? 
Surely only one atom is needed to be in one state or another to 
determine if it represents a one or a zero. And the next atom could 
be a one or a zero, so a little string of atoms is enough to hold a
number, one atom for each bit. (Actually since an atom can be in more 
states than just two we could use even fewer atoms, but enough is 
little enough!) 

So now for intellectual entertainment we consider whether we 
could make a computer in which the writing of bits is of atomic size, in
which a bit is for example whether the spin in the atom is up for 1 or 
down for 0. And then our transistor changing the bits in different 
places would correspond to some interaction between some atoms, which 
will change their states. The simplest would be a kind of 3-atom 
interaction to be the fundamental element or gate in such a device. 
But again, it won't work right if we design it with the laws 
appropriate for large objects. We must use the new laws of physics, 
quantum mechanical laws, the laws that are appropriate to atomic
motion. And so we have to ask whether the principles of quantum
mechanics permit an arrangement of atoms so small in number as a few
times the number of gates in a computer that could still be put 
together and operate as a computer. This has been studied in 
principle, and such an arrangement has been found. The laws of quantum 
mechanics are reversible and therefore we must use the invention of
reversible gates, that principle, that idea of Bennett and Fredkin,
but we know that's all right now. When the quantum mechanical situation
is studied it is found that quantum mechanics adds no further 
limitations to anything that Mr. Bennett has said from thermodynamic
considerations. Of course there is a limitation, the practical 
limitation anyway, that the bits must be of the size of an atom and a 
transistor 3 or 4 atoms; the quantum mechanical gate I used has 3 
atoms. (I would not try to write my bits on to nuclei, I'll wait till 
the technological development reaches the atoms before I need to go 
any further!) That leaves us just with (a) the limitations in size to
the size of atoms, (b) the energy requirements depending on the time
as worked out by Bennett, and (c) the feature that I did not mention
concerning the speed of light; we can't send the signals any faster 
than the speed of light. Those are the only physical limitations that 
I know on computers. 

If we make an atomic size computer, somehow, it would mean that 
the dimension, the linear dimension is a thousand to ten thousand
times smaller than those very tiny chips that we have now. It means
that the volume of the computer is 100 billionth, 10^{-11}, of the present
volume, because the transistor is that much smaller than the
transistors that we make today. The energy requirement for a single
switch is also about eleven orders of magnitude smaller than the
energy required to switch the transistor today, and the time to make
the transitions will be at least ten thousand times faster per step
of calculation. So there is plenty of room for improvement in the 
computer and I leave you, practical people who work on computers, this 
as an aim to get to. I underestimated how long it would take for Mr. 
Ezawa to translate what I said, and I have no more to say that I have 
prepared for today. Thank you! 


I will answer questions if you'd like. 


Questions and Answers 


Q: You mentioned that one bit of information can be stored in one 
atom, and I wonder if you can store the same amount of information in 
one quark. 

A: Yes. But we don't have control of the quarks and that becomes a 
really impractical way to deal with things. You might think that what 
I am talking about is impractical, but I don't believe so. When I am 
talking about atoms, I believe that someday we will be able to handle 
and control them individually. There would be so much energy involved 
in the quark interactions it would be very dangerous to handle because 
of the radioactivity and so on. But the atomic energies that I am 
talking about are very familiar to us in chemical energies, electrical 
energies, and those, that I am speaking of, are numbers that are 
within the realm of reality, I believe, however absurd it may seem at 
the moment. 


Q: You said that the smaller the computing element is the better. 
But, I think equipment has to be larger, because....

A: You mean that your finger is too big to push the buttons? Is 
that what you mean? 

Q: Yes, it is. 

A: Of course, you are right. I am talking about internal computers 
perhaps for robots or other devices. The input and output is 
something that I didn't discuss, whether the input comes from looking 
at pictures, hearing voices, or buttons being pushed. I am discussing 
how the computation is done in principle, and not what form the output 
should take. It is certainly true that the input and the output 
cannot be reduced in most cases effectively beyond human dimension. 
It is already too difficult to push the buttons on some of the 
computers with our big fingers. But with elaborate computing problems 
that take hours and hours, they could be done very rapidly on the very
small machines with low energy consumption. That's the kind of
machine I was thinking of. Not the simple applications of adding two
numbers but the elaborate calculations.


Q: I would like to know your method to transform the information 
from one atomic scale element to another atomic scale element. If you 
will use a quantum mechanical or natural interaction between the two 
elements then such a device will become very close to Nature itself. 
For example, if we make a computer simulation, a Monte Carlo 
simulation of a magnet to study critical phenomena, then your atomic 
scale computer will be very close to the magnet itself. What are your 
thoughts about that? 

A: Yes. All things that we make are Nature. We arrange it in a way 
to suit our purpose, to make a calculation for a purpose. In a magnet
there is some kind of relation, if you wish, there is some kind of
computation going on, just as there is in the solar system, in a way
of thinking. But, that might not be the calculation we want to make 
at the moment. What we need to make is a device for which we can 
change the programs and let it compute the problem that we want to 
solve, not just its own magnet problem that it likes to solve for 
itself. I can't use the solar system for a computer unless it just 
happens that the problem that someone gave me was to find the motion 
of the planets, in which case all I have to do is to watch. 

There was an amusing article written as a joke. Far in the future the
“article” appears, discussing a new method of making aerodynamical
calculations: instead of using the elaborate computers of the day, the
author invents a simple device to blow air past the wing. (He
reinvents the wind tunnel.)


Q: I have recently read in a newspaper article that operations of
the nerve system in a brain are much slower than present day computers 
and the unit in the nerve system is much smaller. Do you think that 
the computers you have talked about today have something in common 
with the nerve system in the brain? 

A: There is an analogy between the brain and the computer in that
there are apparently elements that can switch under the control of
others. Nerve impulses control or excite other nerves, in a way
that often depends upon whether more than one impulse comes in;
something like an AND or its generalization. The amount of energy
used in the brain cell for one of these transitions? I don't know the
number. The time it takes to make a switching in the brain is very
much longer than it is in our computers even today, never mind the
fancy business of some atomic computer. But the interconnection system
is much more elaborate. Each nerve is connected to a thousand other
nerves, whereas we connect transistors to two or three others.

Some people look at the activity of the brain in action and see 
that in many respects it surpasses the computer of today, and in many 
other respects the computer surpasses ourselves. This inspires people 
to design machines that can do more. What often happens is that an 
engineer makes up how the brain works in his opinion, and then designs 
a machine that behaves that way. This new machine may in fact work 
very well. But, I must warn you that that does not tell us anything 
about how the brain actually works, nor is it necessary to ever really 
know that in order to make a computer very capable. It is not 
necessary to understand the way birds flap their wings and how the 
feathers are designed in order to make a flying machine. It is not 
necessary to understand the lever system in the legs of a cheetah, 
that is an animal that runs fast, in order to make an automobile with 
wheels that goes very fast. It is therefore not necessary to imitate 
the behavior of Nature in detail in order to engineer a device which 
can in many respects surpass Nature's abilities. 

It is an interesting subject and I like to talk about it. Your 
brain is very weak compared to a computer. I will give you a series 
of numbers, one, three, seven, oh yes, ichi, san, shichi, san, ni, go, 
ni, go, ichi, hachi, ichi, ni, ku, san, go. I want you to repeat them 
back. But, a computer can take ten thousand numbers and give
them back to me in reverse, every other one, or sum them, or lots of
things that we cannot do. On the other hand, if I look at a face, in a
glance I can tell you who it is if I know that person, or that I don't
know that person. But, we do not know how to make a computer system
so that if we give it a pattern of a face it can tell us who he is,
even if it has seen many faces and you try to teach it. We do not know
how to make computers do that, yet.

Another interesting example is chess playing machines. It is 
quite a surprise that we can make machines that play chess better than 
almost everybody in the room. But, they do it by trying many many 
possibilities. If he moves here, then I could move here and he can 
move there and so forth. They look at each alternative and choose the 
best. Now, millions of alternatives are looked at. But, a master 
chess player, a human, does it differently. He recognizes patterns. 
He looks at only thirty or forty positions before deciding what move 
to make. Therefore, although the rules are simpler in Go, machines 
that play Go are not very good, because in each position there are too 
many possibilities to move and there are too many things to check and 
the machines cannot look deeply. Therefore the problem of recognizing 
patterns and what to do under the circumstances is the thing that the 
computer engineers (they like to call themselves computer scientists) 
still find very difficult, and it is certainly one of the important 
things for future computers, perhaps more important than the things I 
spoke about. Make a machine to play Go effectively. 


Q: I think that any method of computation would not be fruitful 
unless it would give a kind of provision on how to compose such
devices or programs. I thought the Fredkin paper on conservative 
logic was very intriguing, but once I came to think of making a simple 
program using such devices I came to a halt because thinking out such 
a program is far more complex than the program itself. I think we 
could easily get into a kind of infinite regression because the 
process of making out a certain program would be much more complex 
than the program itself and in trying to automate the process the 
automating program would be more complex and so on. Especially in 
this case where the program is hard wired rather than being separated 
as a software, I think it is fundamental to think of the ways of 
composition. 


A: We have some different experiences. There is no infinite
regression; it stops at a certain level of complexity. The machine
that Fredkin ultimately is talking about and the one that I was
talking about in the quantum mechanical case are both universal
computers in the sense that they can be programmed to do various jobs;
this is not a hard-wired program; they are no more hard-wired than an 
ordinary computer that you can put information in, that the program is 
a part of the input, and the machine does the problem that it is 
assigned to do. It is hard-wired but it is universal like an ordinary 
computer. These things are very uncertain but I found a minimum. If 
you have a program written for an irreversible machine, the ordinary 
program, then I can convert it to a reversible machine program by a 
direct translation scheme, which is very inefficient and uses many 
more steps. Then in real situations, the number of steps can be much 
less. But at least I know that I can take a program with 2^{n} steps
where it is irreversible, convert it to 3^{n} steps of a reversible
machine. That is many more steps. I did it very inefficiently; I did 
not try to find the minimum. Just a way. I don't really think that 
we'll find this regression that you speak of, but you might be right. 
I am uncertain. 


Q: Won't we be sacrificing many of the merits we were expecting of 
such devices, because those reversible machines run so slow? I am very 
pessimistic about this point. 

A: They run slower, but they are very much smaller. I don't make it 
reversible unless I need to. There is no point in making the machine 
reversible unless you are trying very hard to decrease the energy 
enormously, rather ridiculously, because with only 80 times kT the 
irreversible machine functions perfectly. That 80 is much less than
the present day 10^{9} or 10^{10} kT, so I have at least 10^{7} improvement in
energy to make, and can still do it with irreversible machines! 
That's true. That's the right way to go, for the present. I 
entertain myself intellectually for fun, to ask how far could we go in 
principle, not in practice, and then I discover that I can go to a 
fraction of a kT of energy and make the machines microscopic,
atomically microscopic. But to do so, I must use the reversible
physical laws. Irreversibility comes because the heat is spread over
a large number of atoms and can't be gathered back again. When I make 
the machine very small, unless I allow a cooling element which is lots 
of atoms, I have to work reversibly. In practice there probably will 
never come a time when we will be unwilling to tie a little computer 
to a big piece of lead which contains 10^{10} atoms (which is still very
small indeed), making it effectively irreversible. Therefore I agree 
with you that in practice, for a very long time and perhaps forever, 
we will use irreversible gates. On the other hand it is a part of the 
adventure of science to try to find limitations in all directions
and to stretch the human imagination as far as possible everywhere.
Although at every stage it has looked as if such an activity was 
absurd and useless, it often turns out at least not to be useless. 


Q: Are there any limitations from the uncertainty principle? Are 
there any fundamental limitations on the energy and the clock time in 
your reversible machine scheme? 

A: That was my exact point. There is no further limitation due to 
quantum mechanics. One must distinguish carefully between the energy 
lost or consumed irreversibly, the heat generated in the operation of 
the machine, and the energy content of the moving parts which might be 
extracted again. There is a relationship between the time and the 
energy which might be extracted again. But that energy which can be 
extracted again is not of any importance or concern. It would be like 
asking whether we should add the mc^{2}, rest energy, of all the atoms
which are in the device. I only speak of the energy lost times the 
time, and then there is no limitation. However it is true that if you 
want to make a calculation at a certain extremely high speed, you have 
to supply to the machine parts which move fast and have energy but 
that energy is not necessarily lost at each step of the calculation; 
it coasts through by inertia. 


A (to no Q): Could I just say with regard to the question of useless
ideas? I'd like to add one more. I waited, if you would ask me, but
you didn't. So I answer it anyway. How would we make a machine of
such small dimension where we have to put the atoms in special places?
Today we have no machinery with moving parts whose dimension is 
extremely small or atomic or hundreds of atoms even, but there is no 
physical limitation in that direction either. And there is no reason 
why, when we lay down the silicon even today, the pieces cannot be 
made into little islands so that they are movable. And we could 
arrange small jets so we could squirt the different chemicals on 
certain locations. We can make machinery which is extremely small. 
Such machinery will be easy to control by the same kind of computer 
circuits that we make. Ultimately, for fun again and intellectual 
pleasure, we could imagine tiny machines, a few microns across, with
wheels and cables all interconnected by wires, silicon connections, so 
that the thing as a whole, a very large device, moves not like the 
awkward motion of our present stiff machines but in a smooth way of 
the neck of a swan, which after all is a lot of little machines, the 
cells all interconnected and all controlled in a smooth way. Why 
can't we do that ourselves?


The physical limitations, due to quantum mechanics, on the functioning of
computers are analyzed.


1. INTRODUCTION 


This work is a part of an effort to analyze the physical limitations of
computers due to the laws of physics. For example, Bennett(1) has made a
careful study of the free energy dissipation that must accompany
computation. He found it to be virtually zero. He suggested to me the question
of the limitations due to quantum mechanics and the uncertainty principle. 
I have found that, aside from the obvious limitation to size if the working 
parts are to be made of atoms, there is no fundamental limit from these 
sources either. 

We are here considering ideal machines; the effects of small imperfections
will be considered later. This study is one of principle; our aim is to
exhibit some Hamiltonian for a system which could serve as a computer. 
We are not concerned with whether we have the most efficient system, nor 
how we could best implement it. 

Since the laws of quantum physics are reversible in time, we shall have
to consider computing engines which obey such reversible laws. This
problem already occurred to Bennett,(1) and to Fredkin and Toffoli,(2) and
a great deal of thought has been given to it. Since it may not be familiar to
you here, I shall review this, and in doing so, take the opportunity to
review, very briefly, the conclusions of Bennett,(3) for we shall confirm them
all when we analyze our quantum system.


1 Editor's note: This article, which is based on the author’s plenary talk presented at the
CLEO/IQEC Meeting in 1984, originally appeared in the February 1985 issue of Optics
News. It is here reprinted with kind permission of Professor Feynman and Optics News.

2 Department of Physics, California Institute of Technology, Pasadena, California 91125.

It is a result of computer science that a universal computer can be 
made by a suitably complex network of interconnected primitive elements. 
Following the usual classical analysis we can imagine the interconnections 
to be ideal wires carrying one of two standard voltages representing the 
local 1 and 0. We can take the primitive elements to be just two, NOT and
AND (actually just the one element NAND=NOT AND suffices, for if 
one input is set at 1 the output is the NOT of the other input). They are 
symbolized in Fig. 1, with the logical values resulting on the outgoing 
wires, resulting from different combinations of input wires. 

From a logical point of view, we must consider the wires in detail, for 
in other systems, and our quantum system in particular, we may not have 
wires as such. We see we really have two more logical primitives, FAN 
OUT when two wires are connected to one, and EXCHANGE, when wires 
are crossed. In the usual computer the NOT and NAND primitives are 
implemented by transistors, possibly as in Fig. 2. 

What is the minimum free energy that must be expended to operate an 
ideal computer made of such primitives? Since, for example, when the
AND operates the output line, c’, is being determined to be one of two
values, no matter what it was before, the entropy change is ln 2 units. This
represents a heat generation of kT ln 2 at temperature T. For many years it
was thought that this represented an absolute minimum to the quantity of 
heat per primitive step that had to be dissipated in making a calculation. 
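
To give a feeling for the size of kT ln 2, here is a small worked calculation in Python. The choice of 300 K as "room temperature" and the function name `landauer_heat` are assumptions made only for this illustration; the Boltzmann constant is the exact SI value.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K, exact SI value

def landauer_heat(temperature_kelvin):
    """Heat kT ln 2 associated with fixing one bit at the given temperature."""
    return BOLTZMANN * temperature_kelvin * math.log(2)

room = landauer_heat(300.0)  # 300 K is an assumed room temperature
print(f"kT ln 2 at 300 K = {room:.3e} J")
```

The result is a few times 10^{-21} joule per primitive step, which is why the figure is of no practical concern for present machines.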

The question is academic at this time. In actual machines we are quite 
concerned with the heat dissipation question, but the transistor system 
used actually dissipates about 10^{10} kT! As Bennett(3) has pointed out, this
arises because to change a wire’s voltage we dump it to ground through a 
resistance; and to build it up again we feed charge, again through a 
resistance, to the wire. It could be greatly reduced if energy could be stored 
in an inductance, or other reactive element. 


Fig. 1. Primitive elements: NOT, AND, FAN OUT, EXCHANGE.


Fig. 2. Transistor circuits for NOT and NAND.


However, it is apparently very difficult to make inductive elements on 
silicon wafers with present techniques. Even Nature, in her DNA copying 
machine, dissipates about 100kT per bit copied. Being, at present, so very 
far from this kT ln 2 figure, it seems ridiculous to argue that even this is
too high and the minimum is really essentially zero. But, we are going to be
even more ridiculous later and consider bits written on one atom instead of
the present 10^{11} atoms. Such nonsense is very entertaining to professors
like me. I hope you will find it interesting and entertaining also. 

What Bennett pointed out was that this former limit was wrong 
because it is not necessary to use irreversible primitives. Calculations can 
be done with reversible machines containing only reversible primitives. If 
this is done the minimum free energy required is independent of the
complexity or number of logical steps in the calculation. If anything, it is kT
per bit of the output answer. 

But even this, which might be considered the free energy needed to 
clear the computer for further use, might also be considered as part of what 
you are going to do with the answer—the information in the result if you 
transmit it to another point. This is a limit only achieved ideally if you 
compute with a reversible computer at infinitesimal speed. 


2. COMPUTATION WITH A REVERSIBLE MACHINE 


We will now describe three reversible primitives that could be used to 
make a universal machine (Toffoli(4)). The first is the NOT which evidently
loses no information, and is reversible, being reversed by acting again with 
NOT. Because the conventional symbol is not symmetrical we shall use an 
X on the wire instead (see Fig. 3a). 


Fig. 3. Reversible primitives: (a) NOT; (b) CONTROLLED NOT, also used for
FAN OUT and EXCHANGE; (c) CONTROLLED CONTROLLED NOT (see Table I).


Next is what we shall call the CONTROLLED NOT (see Fig. 3b). 
There are two entering lines, a and b, and two exiting lines, a’ and b’. The 
a’ is always the same as a, which is the control line. If the control is
activated, a = 1, then the output b’ is the NOT of b. Otherwise b is unchanged,
b’ = b. The table of values for input and output is given in Fig. 3. The
action is reversed by simply repeating it. 

The quantity b’ is really a symmetric function of a and b called XOR,
the exclusive or; a or b but not both. It is likewise the sum modulo 2 of a
and b, and can be used to compare a and b, giving a 1 as a signal that they
are different. Please notice that this function XOR is itself not reversible. 
For example, if the value is zero we cannot tell whether it came from 
(a, b) = (0, 0) or from (1, 1) but we keep the other line a’ = a to resolve the 
ambiguity. 

We will represent the CONTROLLED NOT by putting a 0 on the 
control wire, connected with a vertical line to an X on the wire which is 
controlled. 

Fig. 4. Adder: inputs a, b, 0; outputs a, SUM, CARRY.


Fig. 5. Full adder: inputs a, b, c, d = 0; outputs a’ = a, b’ = b,
c’ = SUM, d’ = CARRY.


This element can also supply us with FAN OUT, for if b=0 we see
that a is copied onto line b’. This COPY function will be important later
on. It also supplies us with EXCHANGE, for three of them used
successively on a pair of lines, but with alternate choice for control line,
accomplishes an exchange of the information on the lines (Fig. 3b).
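
The FAN OUT and EXCHANGE constructions described above can be sketched in a few lines of Python. This is only an illustration; the function names `cnot`, `fan_out`, and `exchange` are invented here and ordinary integers 0 and 1 stand in for the lines.

```python
def cnot(control, target):
    """CONTROLLED NOT: the control passes through; the target becomes
    control XOR target."""
    return control, control ^ target

def fan_out(a):
    """Copy a onto a second line that starts at 0."""
    return cnot(a, 0)

def exchange(a, b):
    """Swap two lines using three CONTROLLED NOTs with alternating control."""
    a, b = cnot(a, b)   # control on the first line
    b, a = cnot(b, a)   # control on the second line
    a, b = cnot(a, b)   # control on the first line again
    return a, b

print(fan_out(1))       # both lines now carry the value of a
print(exchange(1, 0))   # the two values have traded places
```

Each step is its own inverse, so running the three CONTROLLED NOTs of the exchange a second time puts the lines back where they started.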

It turns out that combinations of just these two elements alone 
are insufficient to accomplish arbitrary logical functions. Some element 
involving three lines is necessary. We have chosen what we can call the 
CONTROLLED CONTROLLED NOT. Here (see Fig. 3c) we have two 
control lines a, b, which appear unchanged in the output and which change 
the third line c to NOT c only if both lines are activated (a=1 and b=1). 
Otherwise c’=c. If the third line input c is set to 0, then evidently it 
becomes 1 (c’ = 1) only if both a and b are 1 and therefore supplies us with
the AND function (see Table I). 

Three combinations for (a, b), namely (0, 0), (0, 1), and (1, 0), all give 
the same value, 0, to the AND (a, b) function so the ambiguity requires
two bits to resolve it. These are kept in the lines a, b in the output so the
function can be reversed (by itself, in fact). The AND function is the carry
bit for the sum of a and b.

Table I.

a  b  c    a’ b’ c’
0  0  0    0  0  0
0  0  1    0  0  1
0  1  0    0  1  0
0  1  1    0  1  1
1  0  0    1  0  0
1  0  1    1  0  1
1  1  0    1  1  1
1  1  1    1  1  0


From these elements it is known that any logical circuit can be put
together by using them in combination, and in fact, computer science
shows that a universal computer can be made. We will illustrate this by a
little example. First, of course, as you see in Fig. 4, we can make an adder, 
by first using the CONTROLLED CONTROLLED NOT and then the 
CONTROLLED NOT in succession, to produce from a and b and 0, as 
input lines, the original a on one line, the sum on the second line, and the 
carry on the third. 
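
As a minimal sketch of this adder in Python (the function names are invented for the illustration), the two primitives are applied in the order just stated:

```python
def ccnot(a, b, c):
    """CONTROLLED CONTROLLED NOT: flip c only if a and b are both 1."""
    return a, b, c ^ (a & b)

def cnot(a, b):
    """CONTROLLED NOT: flip b only if a is 1."""
    return a, a ^ b

def adder(a, b):
    """Return (a, SUM, CARRY) from input lines a, b and a third line set to 0."""
    a, b, carry = ccnot(a, b, 0)   # third line picks up a AND b
    a, total = cnot(a, b)          # second line becomes a XOR b
    return a, total, carry

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", adder(a, b))
```

The order matters: the carry must be formed while the second line still holds b, before the CONTROLLED NOT overwrites it with the sum.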

A more elaborate circuit is a full adder (see Fig. 5), which takes a 
carry c (from some previous addition) and adds it to the two lines a and b 
and has an additional line d with a 0 input. It requires four primitive 
elements to be put together. Besides this total sum, the total of the three, 
a, b, and c and the carry, we obtain on the other two lines two pieces of 
information. One is the a that we started with, and the other is some
intermediary quantity that we calculated en route.

This is typical of these reversible systems; they produce not only what 
you want in output, but also a certain amount of garbage. In this
particular case, and as it turns out in all cases, the garbage can be arranged to
be, in fact, just the input, if we would just add the extra CONTROLLED
NOT on the first two lines, as indicated by the dotted lines in Fig. 5; we
see that the garbage would become a and b, which were the inputs of at
least two of the lines. (We know this circuit can be simplified but we do it 
this way for illustrative purposes.) 
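
A full adder of this kind can be sketched in Python as follows. The gate order used here is an assumption: it is one common four-gate arrangement that matches the description in the text (four primitives, with a and an intermediate quantity left over), not a transcription of the figure. The flag `restore_b` stands for the extra CONTROLLED NOT drawn with dotted lines.

```python
def cnot(control, target):
    """CONTROLLED NOT: flip target only if control is 1."""
    return control, control ^ target

def ccnot(c1, c2, target):
    """CONTROLLED CONTROLLED NOT: flip target only if both controls are 1."""
    return c1, c2, target ^ (c1 & c2)

def full_adder(a, b, c, restore_b=False):
    """Return the four output lines for inputs a, b, c and d = 0."""
    d = 0
    a, b, d = ccnot(a, b, d)   # d = a AND b
    a, b = cnot(a, b)          # b = a XOR b (the intermediate quantity)
    b, c, d = ccnot(b, c, d)   # d = CARRY
    b, c = cnot(b, c)          # c = SUM
    if restore_b:
        a, b = cnot(a, b)      # assumed extra gate: turns the garbage back into b
    return a, b, c, d

print(full_adder(1, 1, 1))                  # a, a XOR b, SUM, CARRY
print(full_adder(1, 1, 1, restore_b=True))  # a, b, SUM, CARRY
```

With the extra gate the two leftover lines hold exactly the inputs a and b, which is the sense in which the garbage can be arranged to be just the input.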

In this way, we can by various combinations produce a general logic 
unit that transforms n bits to n bits in a reversible manner. If the problem 
you are trying to do is itself reversible, then there might be no extra
garbage, but in general, there are some extra lines needed to store up the
information which you would need to be able to reverse the operation. In 
other words, we can make any function that the conventional system can, 
plus garbage. The garbage contains the information you need to reverse the 
process. 

And how much garbage? It turns out in general that if the output data 
that you are looking for has k bits, then starting with an input and k bits
containing 0, we can produce, as a result, just the input and the output and 
no further garbage. This is reversible because knowing the output and the 
input permits you, of course, to undo everything. This proposition is 
always reversible. The argument for this is illustrated in Fig. 6. 

Suppose we began with a machine M, which, starting with an input, 
and some large number of 0’s, produces the desired output plus a certain
amount of extra data which we call garbage. Now we have seen that the
copy operation which can be done by a sequence of CONTROLLED
NOT’s is possible, so if we have originally an empty register, with the k bits
ready for the output, we can, after the processor M has operated, copy the 
output from the M onto this new register.


Fig. 6. Clearing garbage.


After that, we can build the opposite machine, the M in reverse, the 
reverse machine, which would take this output of M and garbage and turn 
it into the input and 0’s. Thus, seen as an overall machine, we would have 
started with the k 0’s of the register for the output, and the input, and 
ended up with those k 0’s occupied by the output data, and repeat the input
as a final product. The number of 0’s that was originally needed in the M
machine in order to hold the garbage is restored again to 0, and can be
considered as internal wires inside the new complete machine (M, reverse M,
and copy).

Overall, then, we have accomplished what we set out to do, and 
therefore garbage need never be any greater than a repetition of the input 
data. 
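
The three stages of this argument (run M, copy the output, run M in reverse) can be sketched in Python. Everything concrete here is an assumption made for illustration: M is taken to be a four-gate full adder on lines a, b, c, d, gates are written as (controls, target) pairs, and the names `run` and `compute_and_clean` are invented.

```python
# Each gate is (controls, target): flip the target bit when all controls are 1.
# Such gates are their own inverses, so running the list backwards undoes M.
M = [((0, 1), 3), ((0,), 1), ((1, 2), 3), ((1,), 2)]  # assumed full adder

def run(gates, bits):
    """Apply a sequence of controlled-NOT style gates to a list of bits."""
    bits = list(bits)
    for controls, target in gates:
        if all(bits[i] for i in controls):
            bits[target] ^= 1
    return bits

def compute_and_clean(a, b, c):
    work = run(M, [a, b, c, 0])          # lines hold a, a XOR b, SUM, CARRY
    out = [0, 0]                         # k = 2 empty output bits
    out[0] ^= work[2]                    # copy SUM with a CONTROLLED NOT
    out[1] ^= work[3]                    # copy CARRY with a CONTROLLED NOT
    work = run(list(reversed(M)), work)  # reverse machine restores input and 0
    return work, out

print(compute_and_clean(1, 0, 1))
```

After the reverse pass the working lines hold the original input and the restored 0, while the extra register holds the answer, so nothing beyond a copy of the input remains as garbage.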


3. A QUANTUM MECHANICAL COMPUTER 


We now go on to consider how such a computer can also be built 
using the laws of quantum mechanics. We are going to write a 
Hamiltonian, for a system of interacting parts, which will behave in the 
same way as a large system in serving as a universal computer. Of course 
the large system also obeys quantum mechanics, but it is in interaction 
with the heat baths and other things that could make it effectively irrever- 
sible. 


What we would like to do is make the computer as small and as
simple as possible. Our Hamiltonian will describe in detail all the internal
computing actions, but not, of course, those interactions with the exterior
involved in entering the input (preparing the initial state) and reading the
output.

How small can such a computer be? How small, for instance, can a 
number be? Of course a number can be represented by bits of 1’s and 0’s. 
What we are going to do is imagine that we have two-state systems, which 
we will call “atoms.” An n bit number is then represented by a state of a 
“register,” a set of n two-state systems. 

Depending upon whether each atom is in one or another of its
two states, which we call |1> and |0>, we can, of course, represent any
number. And the number can be read out of such a register by determining,
or measuring, in which state each of the atoms is at a given moment.
Therefore one bit will be represented by a single atom being in one of two
states, the states we will call |1> and |0>.

What we will have to do then can be understood by considering an
example: the example of a CONTROLLED CONTROLLED NOT. Let G
be some sort of an operation on three atoms a, b, and c, which converts
the original state of a, b, and c into a new appropriate state, a’, b’, c’, so
that the connection between a’, b’, and c’ and a, b, c is just what we
would have expected if a, b, and c represented wires, and the a’, b’, and c’
were the output wires of a CONTROLLED CONTROLLED NOT.

It must be appreciated here that at the moment we are not trying to 
move the data from one position to another; we are just going to change it. 
Unlike the situation in the actual wired computer in which the voltages on 
one wire then go over to voltages on another, what we are specifically 
making is something simpler, that the three atoms are in some particular 
state, and that an operation is performed, which changes the state to new 
values, a’, b’, c’. 

What we would have then is that the state, in the mathematical form
|a’, b’, c’>, is simply some operation G operating on |a, b, c>. In quantum
mechanics, state changing operators are linear operators, and so we’ll
suppose that G is linear. Therefore, G is a matrix, and the matrix elements of
G, $G_{a',b',c',a,b,c}$, are all 0 except those in Table I, which are of course 1.

This table is the same table that represents the truth value table for the
CONTROLLED CONTROLLED NOT. It is apparent that the operation
is reversible, and that can be represented by saying that G*G = 1, where the
* means Hermitian adjoint. That is to say, G is a unitary matrix. (In fact G
is also a real matrix G* = G, but that’s only a special case.) To be more
specific, we are going to write $A_{ab,c}$ for this special G. We shall use the same
matrix A with different numbers of subscripts to represent the other 
primitive elements.

To take a simple example, the NOT, which would be represented by
$A_a$, is the simple matrix

$$A_a = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

This is a 2 × 2 matrix and can be represented in many ways, in different
notations, but the particular one we will use to define these is by the
method of creation and annihilation operators. Consider operating in this
case, on a single line a. In order to save alphabets, let us call a the matrix

$$a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

which annihilates the 1 on atom a and converts it to 0; a is an operator
which converts the state |1> to |0>. But, if the state of the atom were
originally |0>, the operator a produces the number 0. That is, it doesn’t
change the state, it simply produces the numerical value zero when
operating on that state. The conjugate of this thing, of course, is

$$a^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$$

which creates, in the sense that operating on the 0 state, it turns it to the 1
state. In other words, it moves from |0> to |1>. When operating on the |1>
state there is no further state above that which you can create, and
therefore it gives the number zero. Every other 2 × 2 matrix operator can
be represented in terms of these a and a*. For example, the product a*a is
equal to the matrix

$$a^*a = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$

which you might call $N_a$. It is 1 when the state is |1> and 0 when the state
is |0>. It gives the number that the state of the atom represents. Likewise
the product

$$aa^* = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

is $1 - N_a$, and gives 0 for the up-state and 1 for the down-state. We’ll use 1
to represent the diagonal matrix

$$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

As a consequence of all this, aa* + a*a = 1.

It is evident then that our matrix for NOT, the operator that produces
NOT, is $A_a = a + a^*$ and further, of course, that’s reversible: $A_a^* A_a = 1$, $A_a$
is unitary.
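
These relations are easy to check by hand, and also by a few lines of code. The sketch below is a minimal check, not part of the original argument. It assumes column vectors with the basis ordered as (|0>, |1>); that ordering is my choice, so the diagonal matrices may show their entries in the opposite order from the printed ones, while the operator identities themselves do not depend on it. All function names are invented for the example.

```python
def matmul(x, y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

def dagger(x):
    """Hermitian adjoint; the entries here are real, so this is the transpose."""
    return [[x[j][i] for j in range(2)] for i in range(2)]

def act(op, ket):
    """Apply a 2 x 2 matrix to a column vector."""
    return [sum(op[i][j] * ket[j] for j in range(2)) for i in range(2)]

KET0, KET1 = [1, 0], [0, 1]      # assumed ordering of the basis: (|0>, |1>)
IDENTITY = [[1, 0], [0, 1]]

a = [[0, 1], [0, 0]]             # annihilation: a|1> = |0>, a|0> = 0
a_star = dagger(a)               # creation: a*|0> = |1>, a*|1> = 0
number = matmul(a_star, a)       # gives 1 on |1> and 0 on |0>
NOT = add(a, a_star)             # the NOT operator a + a*

print("a|1>  =", act(a, KET1))
print("a*|0> =", act(a_star, KET0))
print("aa* + a*a =", add(matmul(a, a_star), number))
print("NOT* NOT  =", matmul(dagger(NOT), NOT))
```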

In the same way the matrix $A_{a,b}$ for the CONTROLLED NOT can be
worked out. If you look at the table of values for CONTROLLED NOT
you see that it can be written this way:

$$a^*a\,(b + b^*) + aa^*$$

In the first term, the a*a selects the condition that the line a = 1, in which
case we want b + b*, the NOT, to apply to b. The second term selects the
condition that the line a is 0, in which case we want nothing to happen to b
and the unit matrix on the operators of b is implied. This can also be
written as 1 + a*a(b + b* − 1), the 1 representing all the lines coming through
directly, but in the case that a is 1, we would like to correct that by putting
in a NOT instead of leaving the line b unchanged.

The matrix for the CONTROLLED CONTROLLED NOT is

$$A_{ab,c} = 1 + a^*a\,b^*b\,(c + c^* - 1)$$

as, perhaps, you may be able to see.
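
One way to see it is to let the operator expressions act on each basis state and read off the truth table. The following is a sketch under assumptions of my own: a state is stored as a dictionary from bit tuples to amplitudes, operators on different atoms are taken to commute, and the helper names (`lower`, `raise_`, `mul`, `add`, and so on) are invented for the example.

```python
from itertools import product

def lower(i):
    """Annihilation operator on atom i: turns a 1 into a 0, gives nothing on a 0."""
    def op(state):
        return {b[:i] + (0,) + b[i + 1:]: amp for b, amp in state.items() if b[i] == 1}
    return op

def raise_(i):
    """Creation operator on atom i: turns a 0 into a 1, gives nothing on a 1."""
    def op(state):
        return {b[:i] + (1,) + b[i + 1:]: amp for b, amp in state.items() if b[i] == 0}
    return op

def identity(state):
    return dict(state)

def mul(*ops):
    """Operator product; the rightmost factor acts first."""
    def op(state):
        for f in reversed(ops):
            state = f(state)
        return state
    return op

def add(*terms):
    """Linear combination of (coefficient, operator) pairs."""
    def op(state):
        out = {}
        for coeff, f in terms:
            for b, amp in f(state).items():
                out[b] = out.get(b, 0) + coeff * amp
        return {b: amp for b, amp in out.items() if amp != 0}
    return op

def number(i):                      # a*a
    return mul(raise_(i), lower(i))

def not_minus_one(i):               # (b + b* - 1)
    return add((1, lower(i)), (1, raise_(i)), (-1, identity))

def controlled_not(a, b):           # 1 + a*a (b + b* - 1)
    return add((1, identity), (1, mul(number(a), not_minus_one(b))))

def controlled_controlled_not(a, b, c):   # 1 + a*a b*b (c + c* - 1)
    return add((1, identity), (1, mul(number(a), number(b), not_minus_one(c))))

def truth_table(op, n):
    """Send every n-bit basis state through op; each must map to one basis state."""
    table = {}
    for bits in product((0, 1), repeat=n):
        (out_bits, amp), = op({bits: 1}).items()
        assert amp == 1
        table[bits] = out_bits
    return table

for bits, out in truth_table(controlled_controlled_not(0, 1, 2), 3).items():
    print(bits, "->", out)
```

Each basis state comes out as a single basis state with amplitude 1, and the third bit is flipped only when the first two are both 1.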

The next question is what the matrix is for a general logic unit which
consists of a sequence of these. As an example, we’ll study the case of the
full adder which we described before (see Fig. 5). Now we’ll have, in the
general case, four wires represented by a, b, c, and d; we don’t necessarily
have to have d as 0 in all cases, and we would like to describe how the
object operates in general (if d is changed to 1, d’ is changed to its NOT). It
produces new numbers a’, b’, c’, and d’, and we could imagine with our
system that there are four atoms labeled a, b, c, d in a state labeled
|a, b, c, d> and that a matrix M operates which changes these same four
atoms so that they appear to be in the state |a’, b’, c’, d’> which is
appropriate for this logic unit. That is, if |ψ_in> represents the incoming
state of the four bits, M is a matrix which generates an outgoing state
|ψ_out> = M|ψ_in> for the four bits.

For example, if the input state were the state |1, 0, 1, 0>, then, as we
know, the output state should be |1, 0, 0, 1>; the first two a’, b’ should be
1, 0 for those two first lines come straight through, and the last two c’, d’
should be 0, 1 because that represents the sum and carry of the first three,
a, b, c, bits in the first input, as d = 0. Now the matrix M for the adder can
easily be seen as the result of five successive primitive operations, and
therefore becomes the matrix product of the five successive matrices
representing these primitive objects.

$$M = A_{a,b} A_{b,c} A_{bc,d} A_{a,b} A_{ab,d}$$

The first, which is the one written farthest to the right, is $A_{ab,d}$ for that
represents the CONTROLLED CONTROLLED NOT in which a and b
are the CONTROL lines, and the NOT appears on line d. By looking at
the diagram in Fig. 5 we can immediately see what the remaining factors in
the sequence represent. The last factor, for example, $A_{a,b}$, means that
there’s a CONTROLLED NOT with a CONTROL on line a and NOT on
line b. This matrix will have the unitary property M*M = 1 since all of the
A’s out of which it is a product are unitary. That is to say, M is a reversible
operation, and M* is its inverse.
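
Because every factor only permutes basis states, the product can be checked on ordinary bits. The sketch below is illustrative; the wire indices and the names `ADDER_STEPS` and `run` are assumptions of the example. It applies the five factors in the order they act (rightmost first) and compares the result with ordinary addition.

```python
from itertools import product

A, B, C, D = range(4)   # wire indices for the lines a, b, c, d

# Factors of M listed in the order they act (the rightmost factor first),
# each written as (control lines..., line receiving the NOT).
ADDER_STEPS = [(A, B, D), (A, B), (B, C, D), (B, C), (A, B)]

def run(steps, bits):
    """Apply CONTROLLED NOT type gates, in order, to a tuple of bits."""
    bits = list(bits)
    for *controls, target in steps:
        if all(bits[c] for c in controls):
            bits[target] ^= 1
    return tuple(bits)

print(run(ADDER_STEPS, (1, 0, 1, 0)))   # the example state |1, 0, 1, 0>

for a, b, c in product((0, 1), repeat=3):
    total = a + b + c
    # With d = 0, c' is the sum bit and d' is the carry bit.
    assert run(ADDER_STEPS, (a, b, c, 0)) == (a, b, total % 2, total // 2)

for bits in product((0, 1), repeat=4):
    # Running the factors in the opposite order undoes M.
    assert run(ADDER_STEPS[::-1], run(ADDER_STEPS, bits)) == bits
```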

Our general problem, then, is this. Let $A_1, A_2, A_3, \ldots, A_k$ be the
succession of operations wanted, in some logical unit, to operate on n lines.
The $2^n \times 2^n$ matrix M needed to accomplish the same goal is a product
$A_k \cdots A_3 A_2 A_1$, where each A is a simple matrix. How can we generate this
M in a physical way if we know how to make the simpler elements?

In general, in quantum mechanics, the outgoing state at time t is
$e^{iHt}\psi_{in}$, where ψ_in is the input state, for a system with Hamiltonian H. To
try to find, for a given special time t, the Hamiltonian which will produce
$M = e^{iHt}$ when M is such a product of noncommuting matrices, from some
simple property of the matrices themselves, appears to be very difficult.

We realize, however, that at any particular time, if we expand the $e^{iHt}$
out (as $1 + iHt - H^2t^2/2 - \cdots$) we’ll find the operator H operating an
arbitrary number of times, once, twice, three times, and so
forth, and the total state is generated by a superposition of these
possibilities. This suggests that we can solve this problem of the
composition of these A’s in the following way.

We add to the n atoms, which are in our register, an entirely new set
of k+1 atoms, which we’ll call “program counter sites.” Let us call $q_i$ and
$q_i^*$ the annihilation and creation operators for the program site i for i = 0
to k. A good thing to think of, as an example, is an electron moving from
one empty site to another. If the site i is occupied by the electron, its state
is |1>, while if the site is empty, its state is |0>.

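We write, as our Hamiltonian

$$\begin{aligned}
H &= \sum_{i=0}^{k-1} q_{i+1}^* q_i A_{i+1} + \text{complex conjugate} \\
  &= q_1^* q_0 A_1 + q_2^* q_1 A_2 + q_3^* q_2 A_3 + \cdots \\
  &\quad + q_0^* q_1 A_1^* + q_1^* q_2 A_2^* + q_2^* q_3 A_3^* + \cdots
\end{aligned}$$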


The first thing to notice is that if all the program sites are unoccupied,
that is, all the program atoms are initially in the state 0, nothing happens
because every term in the Hamiltonian starts with an annihilation operator
and it gives 0 therefore.

The second thing we notice is that if only one or another of the 
program sites is occupied (in state |1>), and the rest are not (state |0>), 
then this is always true. In fact the number of program sites that are in 
state |1> is a conserved quantity. We will suppose that in the operation of 
this computer, either no sites are occupied (in which case nothing happens) 
or just one site is occupied. Two or more program sites are never both 
occupied during normal operation. 

Let us start with an initial state where site 0 is occupied, is in the |1>
state, and all the others are empty, |0> state. If later, at some time, the final
site k is found to be in the |1> state (and therefore all the others in |0>)
then, we claim, the n register has been multiplied by the matrix M, which is
$A_k \cdots A_2 A_1$ as desired.

Let me explain how this works. Suppose that the register starts in any
initial state, ψ_in, and that the site, 0, of the program counter is occupied.
Then the only term in the entire Hamiltonian that can first operate, as the
Hamiltonian operates in successive times, is the first term, $q_1^* q_0 A_1$. The $q_0$
will change site number 0 to an unoccupied site, while $q_1^*$ will change the
site number 1 to an occupied site. Thus the term $q_1^* q_0$ is a term which
simply moves the occupied site from the location 0 to the location 1. But
this is multiplied by the matrix $A_1$ which operates only on the n register
atoms, and therefore multiplies the initial state of the n register atoms by
$A_1$.

Now, if the Hamiltonian happens to operate a second time, this first
term will produce nothing because $q_0$ produces 0 on the number 0 site
because it is now unoccupied. The term which can operate now is the
second term, $q_2^* q_1 A_2$, for that can move the occupied point, which I shall
call a “cursor.” The cursor can move from site 1 to site 2 but the matrix $A_2$
now operates on the register; therefore the register has now got the matrix
$A_2 A_1$ operating on it.

So, looking at the first line of the Hamiltonian, if that is all there was
to it, as the Hamiltonian operates in successive orders, the cursor would
move successively from 0 to k, and you would acquire, one after the other,
operating on the n register atoms, the matrices A in the order that we
would like to construct the total M.

However, a Hamiltonian must be hermitian, and therefore the
complex conjugate of all these operators must be present. Suppose that at a
given stage, we have gotten the cursor on site number 2, and we have the
matrix $A_2 A_1$ operating on the register. Now the $q_2$ which intends to move
that occupation to a new position need not come from the first line, but
may have come from the second line. It may have come, in fact, from
$q_1^* q_2 A_2^*$ which would move the cursor back from the position 2 to the
position 1.

But note that when this happens, the operator $A_2^*$ operates on the
register, and therefore the total operator on the register is $A_2^* A_2 A_1$ in this
case. But $A_2^* A_2$ is 1 and therefore the operator is just $A_1$. Thus we see that
when the cursor is returned to the position 1, the net result is that only the
operator $A_1$ has really operated on the register. Thus it is that as the
various terms of the Hamiltonian move the cursor forwards and
backwards, the A’s accumulate, or are reduced out again.

At any stage, for example, if the cursor were up to the j site, the
matrices from $A_1$ to $A_j$ have operated in succession on the n register. It
does not matter whether the cursor on the j site has arrived there
by going directly from 0 to j, or going further and returning, or going back
and forth in any pattern whatsoever, as long as it finally arrived at the
state j.

Therefore it is true that if the cursor is found at the site k, we have the 
net result for the n register atoms that the matrix M has operated on their 
initial state as we desired. 
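
The bookkeeping can be followed numerically by applying H again and again to a state with the cursor on site 0. The sketch below is only an illustration with assumptions of my own: the register operations are taken to be the five adder steps (each its own inverse, so the conjugate term uses the same gate), a state is a dictionary from (cursor site, register bits) to amplitude, and all names are invented for the example.

```python
# Register operations A_1 ... A_k, written as (control lines..., target line).
# The five adder steps are used purely as a concrete example.
GATES = [(0, 1, 3), (0, 1), (1, 2, 3), (1, 2), (0, 1)]
K = len(GATES)

def apply_gate(gate, reg):
    *controls, target = gate
    reg = list(reg)
    if all(reg[c] for c in controls):
        reg[target] ^= 1
    return tuple(reg)

def apply_h(state):
    """Apply H once to a state {(cursor_site, register): amplitude}."""
    out = {}
    for (site, reg), amp in state.items():
        if site < K:    # forward term: cursor to site + 1, register gets A_{site+1}
            key = (site + 1, apply_gate(GATES[site], reg))
            out[key] = out.get(key, 0) + amp
        if site > 0:    # conjugate term: cursor to site - 1, register gets A_site*
            key = (site - 1, apply_gate(GATES[site - 1], reg))
            out[key] = out.get(key, 0) + amp
    return out

def expected_register(site, reg):
    """Register after A_1 ... A_site have operated in succession."""
    for gate in GATES[:site]:
        reg = apply_gate(gate, reg)
    return reg

start = (1, 0, 1, 0)
state = {(0, start): 1}
for order in range(1, 10):
    state = apply_h(state)
    # Whatever path the cursor took, the register depends only on where it is now.
    assert all(reg == expected_register(site, start) for site, reg in state)
    print(order, sorted(state))
```

At every order of H each cursor position carries exactly one register value, the one obtained by applying the operations up to that site, however many times the cursor has gone back and forth.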

How then could we operate this computer? We begin by putting the 
input bits onto the register, and by putting the cursor to occupy the site 0. 
We then check at the site k, say, by scattering electrons, that the site k is 
empty, or that the site k has a cursor. The moment we find the cursor at 
site k we remove the cursor so that it cannot return down the program line, 
and then we know that the register contains the output data. We can then 
measure it at our leisure. Of course, there are external things involved in 
making the measurements, and determining all of this, which are not part 
of our computer. Surely a computer has eventually to be in interaction with 
the external world, both for putting data in and for taking it out. 

Mathematically it turns out that the propagation of the cursor up and 
down this program line is exactly the same as it would be if the operators 
A were not in the Hamiltonian. In other words, it represents just the waves 
which are familiar from the propagation of the tight binding electrons or 
spin waves in one dimension, and are very well known. There are waves 
that travel up and down the line and you can have packets of waves and so 
forth. 
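
To see why the A’s drop out, it may help to write the action of H on a convenient set of states. The shorthand |j> below is an assumption introduced for this note only: it stands for the cursor on site j together with the register on which the first j operations have acted.

```latex
% Assumed shorthand: |j> has the cursor on site j and the register in A_j ... A_1 psi_in.
\[
  |j\rangle \equiv |\text{cursor at } j\rangle \otimes A_j \cdots A_2 A_1 \psi_{in},
  \qquad j = 0, 1, \ldots, k .
\]
% Forward term q_{j+1}^* q_j A_{j+1}: moves the cursor up and appends A_{j+1}.
% Conjugate term q_{j-1}^* q_j A_j^*: moves it down and, since A_j^* A_j = 1, removes A_j.
\[
  H |j\rangle = |j+1\rangle + |j-1\rangle \quad (0 < j < k), \qquad
  H |0\rangle = |1\rangle, \qquad H |k\rangle = |k-1\rangle .
\]
```

In this set of states H is just the nearest-neighbor hopping matrix of a chain of k+1 sites, with no reference to the A’s.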

We could improve the action of this computer and make it into a 
ballistic action in the following way: by making a line of sites in addition to 
the ones inside, that we are actually using for computing, a line of say, 
many sites, both before and after. It’s just as though we had values of the
index i for $q_i$ which are less than 0 and greater than k, each of which has
no matrix A, just a 1 multiplying there. Then we would have a longer spin
chain, and we could have started, instead of putting a cursor exactly at the
beginning site 0, by putting the cursor with different amplitudes on
different sites representing an initial incoming spin wave, a wide packet of
nearly definite momentum.

This spin wave would then go through the entire computer in a 
ballistic fashion and out the other end into the outside tail that we have 
added to the line of program sites, and there it would be easier to deter- 
mine if it is present and to steer it away to some other place, and to cap- 
ture the cursor. Thus, the logical unit can act in a ballistic way. 

This is the essential point and indicates, at least to a computer scien- 
tist, that we could make a universal computer, because he knows if we can 
make any logical unit we can make a universal computer. That this could 
represent a universal computer for which composition of elements and 
branching can be done is not entirely obvious unless you have some 
experience, but I will discuss that to some further extent later. 


4. IMPERFECTIONS AND IRREVERSIBLE FREE ENERGY LOSS 


There are, however, a number of questions that we would like to 
discuss in more detail such as the question of imperfections. 

There are many sources of imperfections in this machine, but the first 
one we would like to consider is the possibility that the coefficients in the 
couplings, along the program line, are not exactly equal. The line is so long 
that in a real calculation little irregularities would produce a small 
probability of scattering, and the waves would not travel exactly 
ballistically, but would go back and forth. 

If the system, for example, is built so that these sites are built on a 
substrate of ordinary physical atoms, then the thermal vibrations of these 
atoms would change the couplings a little bit and generate imperfections. 
(We should even need such noise for with small fixed imperfections there 
are shallow trapping regions where the cursor may get caught.) Suppose 
then, that there is a certain probability, say p per step of calculation (that 
is, per step of cursor motion, i → i+1), for scattering the cursor
momentum until it is randomized (1/p is the transport mean free path). We will
suppose that the p is fairly small. 

Then in a very long calculation, it might take a very long time for the
wave to make its way out the other end, once started at the beginning,
because it has to go back and forth so many times due to the scattering.
What one then could do would be to pull the cursor along the program
line with an external force. If the cursor is, for example, an electron moving
from one vacant site to another, this would be just like an electric field
trying to pull the electron along a wire, the resistance of which is generated
by the imperfection or the probability of scattering. Under these
circumstances we can calculate how much energy will be expended by this
external force.

This analysis can be made very simply: it is an almost classical 
analysis of an electron with a mean free path. Every time the cursor is scat- 
tered, I am going to suppose it is randomly scattered forward and 
backward. In order for the machine to operate, of course, it must be 
moving forward at a higher probability than it is moving backward. 
When a scattering occurs therefore, the loss in entropy is the logarithm of 
the probability that the cursor is moving forward, divided by the 
probability the cursor was moving backward. 

This can be approximated by (the probability forward — the 
probability backward)/(the probability forward + the probability 
backward). That was the entropy lost per scattering. More interesting is the 
entropy lost per net calculational step, which is, of course, simply p times 
that number. We can rewrite the entropy cost per calculational step as 


$$p\, v_D / v_R$$


where $v_D$ is the drift velocity of the cursor and $v_R$ its random velocity.

Or if you like, it is p times the minimum time that the calculation 
could be done in (that is, if all the steps were always in the forward direc- 
tion), divided by the actual time allowed. 

The free energy loss per step then, is kT × p × the minimum time that
the calculation could be done in, divided by the actual time that you allow
yourself to do it. This is a formula that was first derived by Bennett. The
factor p is a coasting factor, to represent situations in which not every site
scatters the cursor randomly, but it has only a small probability to be thus
scattered.

It will be appreciated that the energy loss per step is not kT but is that 
divided by two factors. One, (1/p), measures how perfectly you can build 
the machine and the other is proportional to the length of time that you 
take to do the calculation. It is very much like a Carnot engine, in which in 
order to obtain reversibility, one must operate very slowly. For the ideal 
machine where p is 0, or where you allow an infinite time, the mean energy 
loss can be 0. 
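
To get a feeling for the sizes involved, the formula can be evaluated for some sample numbers. The values below (p = 0.01, a calculation allowed 100 times its minimum time, room temperature) are arbitrary assumptions chosen for illustration, and the function name is invented; only the formula kT × p × (minimum time / actual time) is taken from the discussion above.

```python
BOLTZMANN_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV per kelvin

def free_energy_loss_per_step(p, minimum_time, actual_time, temperature_kelvin):
    """Free energy loss per step, kT * p * (minimum time / actual time), in eV."""
    kT = BOLTZMANN_EV_PER_K * temperature_kelvin
    return kT * p * minimum_time / actual_time

# Assumed sample values: scattering probability 0.01 per step, running 100 times
# slower than the fastest possible calculation, at 300 K.
loss = free_energy_loss_per_step(p=0.01, minimum_time=1.0, actual_time=100.0,
                                 temperature_kelvin=300.0)
kT_room = BOLTZMANN_EV_PER_K * 300.0
print(f"kT at 300 K   : {kT_room:.4f} eV")
print(f"loss per step : {loss:.3e} eV")
```

With these numbers the loss per step is kT divided by the two factors 1/p = 100 and the slow-down of 100, that is, kT reduced by a factor of ten thousand.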

The uncertainty principle, which usually relates some energy and time 
uncertainty, is not directly a limitation. What we have in our computer is a 
device for making a computation, but the time of arrival of the cursor and 
the measurement of the output register at the other end (in other words, 
the time it takes in which to complete the calculation) is not a definite time.
It’s a question of probabilities, and so there is a considerable uncertainty in
the time at which a calculation will be done.

There is no loss associated with the uncertainty of cursor energy; at
least no loss depending on the number of calculational steps. Of course, if 
you want to do a ballistic calculation on a perfect machine, some energy 
would have to be put into the original wave, but that energy, of course, can 
be removed from the final wave when it comes out of the tail of the 
program line. All questions associated with the uncertainty of operators 
and the irreversibility of measurements are associated with the input and 
output functions. 

No further limitations are generated by the quantum nature of the 
computer per se, nothing that is proportional to the number of com- 
putational steps. 

In a machine such as this, there are very many other problems, due to 
imperfections. For example, in the registers for holding the data, there will 
be problems of cross-talk, interactions between one atom and another in 
that register, or interaction of the atoms in that register directly with things 
that are happening along the program line, that we did not exactly bargain 
for. In other words, there may be small terms in the Hamiltonian besides 
the ones we have written. 

Until we propose a complete implementation of this, it is very difficult 
to analyze. At least some of these problems can be remedied in the usual 
way by techniques such as error correcting codes, and so forth, that have 
been studied in normal computers. But until we find a specific implemen- 
tation for this computer, I do not know how to proceed to analyze these 
effects. However, it appears that they would be very important, in practice. 
This computer seems to be very delicate and these imperfections may 
produce considerable havoc. 

Fig. 7. Switch. The Hamiltonian is H = q*cp + r*c*p + p*c*q + p*cr. The
terms act as follows. q*cp: if c = 1, go p to q and put c = 0. r*c*p: if c = 0,
go p to r and put c = 1. p*cr: if c = 1, go r to p and put c = 0. p*c*q: if
c = 0, go q to p and put c = 1.

The time needed to make a step of calculation depends on the strength
or the energy of the interactions in the terms of the Hamiltonian. If each of
the terms in the Hamiltonian is supposed to be of the order of 0.1 electron
volts, then it appears that the time for the cursor to make each step, if done
in a ballistic fashion, is of the order $6 \times 10^{-15}$ sec. This does not represent
an enormous improvement, perhaps only about four orders of magnitude
over the present values of the time delays in transistors, and is not much
shorter than the very short times possible to achieve in many optical
systems.
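
The order of magnitude can be reproduced with one division. The sketch assumes that the time per step is estimated as ħ divided by the interaction energy; that estimate is my reading, since the text quotes only the result, and the function name is invented for the example.

```python
HBAR_EV_SECONDS = 6.582119569e-16   # reduced Planck constant in eV * s

def step_time_seconds(coupling_ev):
    """Rough time per cursor step, assumed here to be hbar / (interaction energy)."""
    return HBAR_EV_SECONDS / coupling_ev

print(f"{step_time_seconds(0.1):.1e} s per step for 0.1 eV couplings")
```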


5. SIMPLIFYING THE IMPLEMENTATION 


We have completed the job we set out to do—to find some quantum 
mechanical Hamiltonian of a system that could compute, and that is all 
that we need say. But it is of some interest to deal with some questions 
about simplifying the implementation. The Hamiltonian that we have writ- 
ten involves terms which can involve a special kind of interaction between 
five atoms. For example, three of them in the register, for a CON- 
TROLLED CONTROLLED NOT, and two of them as the two adjacent 
sites in the program counter. 

This may be rather complicated to arrange. The question is, can we do 
it with simpler parts. It turns out that we can indeed. We can do it so that 
in each interaction there are only three atoms. We are going to start with 
new primitive elements, instead of the ones we began with. We’ll have the 
NOT all right, but we have in addition to that simply a “switch” (see also 
Priese'?). 

Supposing that we have a term, q*cp + r*c*p + its complex conjugate
in the Hamiltonian (in all cases we’ll use letters in the earlier part of the
alphabet for register atoms and in the latter part of the alphabet for
program sites). See Fig. 7. This is a switch in the sense that, if c is
originally in the |1> state, a cursor at p will move to q, whereas if c is in
the |0> state, the cursor at p will move to r.

During this operation the controlling atom c changes its state. (It is
possible also to write an expression in which the control atom does not
change its state, such as q*c*cp + r*cc*p and its complex conjugate, but
there is no particular advantage or disadvantage to this, and we will take
the simpler form.) The complex conjugate reverses this.

If, however, the cursor is at q and c is in the state |1> (or cursor at r, c
in |0>), the H gives 0, and the cursor gets reflected back. We shall build all
our circuits and choose initial states so that this circumstance will not arise
in normal operation, and the ideal ballistic mode will work.
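
Written out on the six relevant states, the switch acts as follows. The notation |x; c>, meaning the cursor on program site x with the control atom in state c, is an assumption introduced for this note, and H here means only the switch terms q*cp + r*c*p plus their complex conjugate.

```latex
% Assumed notation: |x; c> = cursor on program site x, control atom c in state c.
% H = q^* c p + r^* c^* p + p^* c^* q + p^* c r  (the switch terms only).
\[
\begin{aligned}
  H\,|p;1\rangle &= |q;0\rangle, & H\,|p;0\rangle &= |r;1\rangle, \\
  H\,|q;0\rangle &= |p;1\rangle, & H\,|r;1\rangle &= |p;0\rangle, \\
  H\,|q;1\rangle &= 0,           & H\,|r;0\rangle &= 0 .
\end{aligned}
\]
```

The first two lines are the switching and its reversal; the last line is the circumstance in which the cursor finds no term to carry it on and is reflected.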

With this switch we can do a number of things. For example, we could 
produce a CONTROLLED NOT as in Fig. 8. The switch a controls the 
operation. Assume the cursor starts at s. If a=1 the program cursor is 
carried along the top line, whereas if a=0 it is carried along the bottom 
line, in either case terminating finally in the program site t.

Fig. 8. CONTROLLED NOT by switches.


In these diagrams, horizontal or vertical lines will represent program 
atoms. The switches are represented by diagonal lines and in boxes we'll 
put the other matrices that operate on registers such as the NOT b. To be 
specific, the Hamiltonian for this little section of a CONTROLLED NOT, 
thinking of it as starting at s and ending at t, is given below:


$$H_c(s, t) = s_M^* a\, s + t^* a^* t_M + t_M^* (b + b^*)\, s_M + s_N^* a^* s
  + t^* a\, t_N + t_N^* s_N + \text{c.c.}$$


(The c.c. means to add the complex conjugate of all the previous terms.)

Although there seem to be two routes here which would possibly 
produce all kinds of complications characteristic of quantum mechanics, 
this is not so. If the entire computer system is started in a definite state for
a, by the time the cursor reaches s, the atom a is still in some definite state
(although possibly different from its initial state due to previous computer
operations on it). Thus only one of the two routes is taken. The expression
may be simplified by omitting the $t_N^* s_N$ term and putting $t_N = s_N$.

One need not be concerned in that case, that one route is longer (two 
cursor sites) than the other (one cursor site) for again there is no inter- 
ference. No scattering is produced in any case by the insertion into a chain 
of coupled sites, an extra piece of chain of any number of sites with the 
same mutual coupling between sites (analogous to matching impedances in 
transmission lines). 
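
The claim that only one route is taken, and that the cursor arrives at t with b changed exactly when a = 1, can be checked by letting the Hamiltonian of the CONTROLLED NOT section act repeatedly on a state with the cursor at s. The sketch below rests on assumptions of my own: it uses the six forward terms as I read them (s to s_M or s_N through the switch on a, the NOT on b between s_M and t_M, a free step between s_N and t_N, and the second switch into t), takes the complex conjugate as the reverse of each move, and uses invented names throughout.

```python
from itertools import product

SITES = ("s", "sM", "sN", "tM", "tN", "t")

def forward(site, a, b):
    """Forward terms acting on (cursor site, a, b); None if no term applies."""
    if site == "s":                  # switch: a = 1 takes the top line, a = 0 the bottom
        return ("sM", 0, b) if a == 1 else ("sN", 1, b)
    if site == "sM":                 # NOT on b while stepping from s_M to t_M
        return ("tM", a, 1 - b)
    if site == "sN":                 # plain step from s_N to t_N
        return ("tN", a, b)
    if site == "tM":                 # second switch restores a to 1
        return ("t", 1, b) if a == 0 else None
    if site == "tN":                 # second switch restores a to 0
        return ("t", 0, b) if a == 1 else None
    return None

FORWARD = {}
for basis in product(SITES, (0, 1), (0, 1)):
    target = forward(*basis)
    if target is not None:
        FORWARD[basis] = target
BACKWARD = {target: basis for basis, target in FORWARD.items()}   # the c.c. terms

def apply_h(state):
    """Apply H once to a state {(site, a, b): amplitude}."""
    out = {}
    for basis, amp in state.items():
        for table in (FORWARD, BACKWARD):
            if basis in table:
                key = table[basis]
                out[key] = out.get(key, 0) + amp
    return out

def arrivals(a, b, orders=7):
    """All (site, a, b) with the cursor at t that appear in the first few orders of H."""
    state = {("s", a, b): 1}
    seen = set()
    for _ in range(orders):
        state = apply_h(state)
        seen |= {basis for basis in state if basis[0] == "t"}
    return seen

for a, b in product((0, 1), repeat=2):
    print((a, b), "->", arrivals(a, b))
```

For each definite starting value of a the cursor can reach t in only one condition of the register: a restored to its original value and b replaced by b XOR a.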

To study these things further, we think of putting pieces together. A
piece (see Fig. 9) M might be represented as a logical unit of interacting
parts in which we only represent the first input cursor site as $s_M$, and the
final one at the other end as $t_M$. All the rest of the program sites that are
between $s_M$ and $t_M$ are considered internal parts of M, and M contains its
registers. Only $s_M$ and $t_M$ are sites that may be coupled externally.

Fig. 9. One “piece.” $s_M$ = starting program site for piece; $t_M$ = terminal
program site for piece; $H_M(s_M, t_M)$ is the part of the Hamiltonian
representing all the “atoms” and program sites within the box M, and their
interactions with $s_M$, $t_M$.


The Hamiltonian for this subsection we’ll call $H_M$ and we’ll identify $s_M$
and $t_M$ as the names of the input and output program sites by writing
$H_M(s_M, t_M)$. So therefore $H_M$ is that part of the Hamiltonian representing
all the atoms in the box and their external start and terminator sites.

An especially important and interesting case to consider is when the
input data (in the regular atoms) comes from one logical unit, and we
would like to transfer it to another (see Fig. 10). Suppose that we imagine
that the box M starts with its input register with 0 and its output (which
may be the same register) also with 0. Then we could use it in the following
way. We could make a program line, let’s say starting with $s_M'$, whose first
job is to exchange the data in an external register which contains the input,
with M’s input register which at the present time contains 0’s.

Then the first step in our calculation, starting, say, at $s_M'$, would be to
make an exchange with the register inside of M. That puts 0’s into the
original input register and puts the input where it belongs inside the box
M. The cursor is now at $s_M$. (We have already explained how exchange can
be made of CONTROLLED NOT’s.) Then as the program goes from $s_M$ to $t_M$ we
find the output now in the box M. Then the output register of M is
cleared as we write the results into some new external register provided for
that purpose, originally containing 0’s. This we do from $t_M$ to $t_M'$ by
exchanging data in the empty external register with M’s output register.

We can now consider connecting such units in different ways. For
example, the most obvious way is succession. If we want to do first M and
then N we can connect the terminal side of one to the starting side of the
other as in Fig. 11, to produce a new effective operator K, and the
Hamiltonian $H_K$ is then

$$H_K(s_K, t_K) = H_M(s_K, t) + H_N(t, t_K)$$

Fig. 10. Piece with external input and output.


The general conditional, if a = 1 do M, but if a = 0 do N, can be made,
as in Fig. 12. For this

$$H_{cond}(s_c, t_c) = (s_M^* a\, s_c + t_c^* a^* t_M + s_N^* a^* s_c + t_c^* a\, t_N + \text{c.c.})
  + H_M(s_M, t_M) + H_N(s_N, t_N)$$

The CONTROLLED NOT is the special case of this with M = NOT b
for which H is

$$H_{NOT\,b}(s, t) = s^* (b + b^*)\, t + \text{c.c.}$$

and N is no operation, s*t.

Fig. 11. Operations in succession. $H_K(s_K, t_K) = H_M(s_K, t) + H_N(t, t_K)$.

Fig. 12. Conditional if a = 1 then M, else N.


As another example, we can deal with a garbage clearer (previously 
described in Fig. 6) not by making two machines, a machine and its 
inverse, but by using the same machine and then sending the data back to 
the machine in the opposite direction, using our switch (see Fig. 13). 

Suppose in this system we have a special flag which is originally 
always set to 0. We also suppose we have the input data in an external 
register, an empty external register available to hold the output, and the 
machine registers all empty (containing 0’s). We come on the starting 
line s. 

The first thing we do is to copy (using CONTROLLED NOT’s) our
external input into M. Then M operates, and the cursor goes on the top
line in our drawing. It copies the output out of M into the external output
register. M now contains garbage. Next it changes f to NOT f, comes down
on the other line of the switch, backs out through M clearing the garbage,
and uncopies the input again.

When you copy data and do it again, you reduce one of the registers 
to 0, the register into which you coied the first time. After the coying, it 
goes out (since f is now changed) on the other line where we restore f to 0 


Fig. 13. Garbage clearer. 
and come out at t. So between s and t we have a new piece of equipment, 
which has the following properties. 

When it starts, we have, in a register called IN, the input data. In an 
external register which we call OUT, we have 0’s. There is an internal flag 
set at 0, and the box, M, is empty of all data. At the termination of this, at 
t, the input register still contains the input data, and the output register 
contains the output of the effort of the operator M. M, however, is still 
empty, and the flag f is reset to 0. 
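
The bookkeeping of the garbage clearer can be followed step by step in a classical sketch. Everything specific here is an assumption made for illustration: the toy machine (m_forward and its inverse m_backward), its 8-bit registers, and the particular "result" and "garbage" it writes are hypothetical, and copying by CONTROLLED NOT's is modeled as a bitwise XOR.

```python
MASK = 0xFF  # assumed 8-bit registers


def m_forward(reg):
    """Toy reversible machine M: writes a result and some garbage."""
    reg["out"] = (reg["out"] + reg["in"] * reg["in"]) & MASK
    reg["junk"] = (reg["junk"] + 3 * reg["in"] + 1) & MASK


def m_backward(reg):
    """The same machine run in the opposite direction."""
    reg["junk"] = (reg["junk"] - (3 * reg["in"] + 1)) & MASK
    reg["out"] = (reg["out"] - reg["in"] * reg["in"]) & MASK


def garbage_clearer(external_in):
    IN, OUT, f = external_in, 0, 0          # OUT and flag start at 0
    m = {"in": 0, "out": 0, "junk": 0}      # machine registers all empty
    m["in"] ^= IN        # copy the external input into M
    m_forward(m)         # M operates; it now holds output plus garbage
    OUT ^= m["out"]      # copy the output into the external register
    f ^= 1               # change f to NOT f, so we take the other line
    m_backward(m)        # back out through M, clearing the garbage
    m["in"] ^= IN        # copying a second time reduces M's input to 0
    f ^= 1               # restore f to 0 on the way out
    return IN, OUT, f, m


IN, OUT, f, m = garbage_clearer(5)
```

With this toy machine an input of 5 leaves IN equal to 5, OUT holding M's result, the flag back at 0, and every register of M empty again.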

Also important in computer programs is the ability to use the same 
subroutine several times. Of course, from a logical point of view, that can 
be done by writing that bit of program over and over again, each time it is 
to be used, but in a practical computer, it is much better if we could build 
that section of the computer which does a particular operation, just once, 
and use that section again and again. 

To show the possibilities, here, first just suppose we have an operation 
we simply wish to repeat twice in succession (see Fig. 14). We start at s 
with the flag a in the condition 0, and thus we come along the line, and the 
first thing that happens is we change the value of a. Next we do the 
operation M. Now, because we changed a, instead of coming out at the top 
line where we went in, we come out at the bottom line, which recirculates 
the program back into changing a again; it restores it. 

This time as we go through M, we come out and we have the a to 
follow on the upper line, and thus come out at the terminal, t. The 
Hamiltonian for this is 


$$H_{MM}(s, t) = (s_N^* a^*\, s + s_M^*(a^* + a)\, s_N + x^* a^*\, t_M + s_N^* a\, x + t^* a\, t_M + \mathrm{c.c.}) + H_M(s_M, t_M)$$


Using this switching circuit a number of times, of course, we can 
repeat an operation several times. For example, using the same idea three 


Fig. 14. Do M twice. 
Fig. 15. Do M eight times. 


times in succession, a nested succession, we can do an operation eight 
times, by the apparatus indicated in Fig. 15. In order to do so, we have 
three flags, a, b, and c. It is necessary to have flags when operations are 
done again for the reason that we must keep track of how many times it's 
done and where we are in the program or we'll never be able to reverse 
things. 
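
The nesting can be traced with a short classical sketch of the control flow. It tracks only the value of each flag as seen when the cursor leaves the repeated piece, not the individual changes made inside the switches, and the names run_twice and make_eight are hypothetical.

```python
def run_twice(body, flags, name):
    """Follow the loop of Fig. 14: change the flag, run the body, and
    recirculate as long as the flag is found set on the way out."""
    assert flags[name] == 0       # we enter at s with the flag at 0
    while True:
        flags[name] ^= 1          # change (or restore) the flag
        body()
        if flags[name] == 0:      # upper line: leave at the terminal t
            return
        # flag is 1: lower line, recirculate through the body again


def make_eight(op, flags):
    """Nest the do-twice piece three times, with flags a, b, and c."""
    inner = lambda: run_twice(op, flags, "a")
    middle = lambda: run_twice(inner, flags, "b")
    return lambda: run_twice(middle, flags, "c")


flags = {"a": 0, "b": 0, "c": 0}
calls = []                        # record each time M is carried out
do_eight = make_eight(lambda: calls.append("M"), flags)
do_eight()
```

M is carried out eight times and all three flags come back to 0, so the whole piece can be entered again, or run backward, from a known state.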

A subroutine in a normal computer can be used and emptied and used 
again without any record being kept of what happened. But here we have 
to keep a record and we do that with flags, of exactly where we are in the 
cycle of the use of the subroutine. If the subroutine is called from a certain 
place and has to go back to some other place, and another time is called, 
its origin and final destination are different, we have to know and keep 
track of where it came from and where it’s supposed to go individually in 
each case, so more data have to be kept. Using a subroutine over and over 
in a reversible machine is only slightly harder than in a general machine. 
All these considerations appear in papers by Fredkin, Toffoli, and Bennett. 

It is clear by the use of this switch, and successive uses of such 
switches in trees, that we would be able to steer data to any point in a 
memory. A memory would simply be a place where there are registers into 
which you could copy data and then return the program. The cursor will 


Fig. 16. Increment counter (3-bit). 
have to follow the data along. I suppose there must be another set of tree 
switches set in the opposite direction to carry the cursor out again, after 
copying the data so that the system remains reversible. 

In Fig. 16 we show an incremental binary counter (of three bits a, b, c 
with c the most significant bit) which keeps track of how many net times 
the cursor has passed from s to t. These few examples should be enough to 
show that indeed we can construct all computer functions with our 
SWITCH and NOT. We need not follow this in more detail. 
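
What "net times" means for such a counter can be illustrated with a classical sketch. This is not the wiring of Fig. 16, which is built from SWITCH and NOT; as an assumption, it uses a NOT, a CONTROLLED NOT, and a doubly controlled NOT instead, and the names increment and decrement are hypothetical. A forward pass of the cursor adds one, and a backward pass undoes it by applying the same gates in the reverse order.

```python
def increment(a, b, c):
    """Forward pass: add 1 modulo 8 (a least significant, c most)."""
    c ^= a & b     # doubly controlled NOT: carry into c
    b ^= a         # CONTROLLED NOT: carry into b
    a ^= 1         # NOT on the least significant bit
    return a, b, c


def decrement(a, b, c):
    """Backward pass: the same gates applied in the opposite order."""
    a ^= 1
    b ^= a
    c ^= a & b
    return a, b, c


def value(a, b, c):
    return a + 2 * b + 4 * c


state = (0, 0, 0)
for _ in range(5):                # cursor passes forward five times
    state = increment(*state)
for _ in range(2):                # and backs up twice
    state = decrement(*state)
net_passes = value(*state)
```

Five forward passes followed by two backward passes leave the counter reading three.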


6. CONCLUSIONS 


It’s clear from these examples that this quantum machine has not 
really used many of the specific qualities of the differential equations of 
quantum mechanics. 

What we have done is only to try to imitate as closely as possible the 
digital machine of conventional sequential architecture. It is analogous to 
the use of transistors in conventional machines, where we do not properly 
use all the analog continuum of the behavior of transistors, but just try to 
run them as saturated on or off digital devices so the logical analysis of the 
system behavior is easier. Furthermore, the system is absolutely sequen- 
tial—for example, even in the comparison (exclusive or) of two k bit num- 
bers, we must do each bit successively. What can be done, in these rever- 
sible quantum systems, to gain the speed available by concurrent operation 
has not been studied here. 

Although, for theoretical and academic reasons, I have studied com- 
plete and reversible systems, if such tiny machines could become practical 
there is no reason why irreversible and entropy creating interactions cannot 
be made frequently during the course of operations of the machine. 

For example, it might prove wise, in a long calculation, to ensure that 
the cursor has surely reached some point and cannot be allowed to reverse 
again from there. Or, it may be found practical to connect irreversible 
memory storage (for items less frequently used) to reversible logic or short- 
term reversible storage registers, etc. Again, there is no reason we need to 
stick to chains of coupled sites for more distant communication where 
wires or light may be easier and faster. 

At any rate, it seems that the laws of physics present no barrier to 
reducing the size of computers until bits are the size of atoms, and quan- 
tum behavior holds dominant sway. 